Converting from Subversion to Mercurial

As I said in my last entry, I've been evaluating the various modern DVCSes to figure out which of them would give me the most benefit while irritating me the least.

I've been using Subversion (SVN) for a few years now on my dev servers (formerly svn.samhart.net and friends) and have mostly been pleased with it. In fact, the only reason I even considered replacing SVN was that certain aspects of DVCS could make my life easier, namely the ability to have a repo's entire history available locally and the fact that offline work is so much easier with them.

Additionally, I've been working with a lot of modern DVCSes lately (namely bzr, git and svk) and I've been very displeased by each of them. They all had at least one critical problem that, for me, made them impractical to even consider for use in my own repos. The end result is that I've spent a lot of frustrating time researching and testing as many DVCSes as I could to figure out whether I should switch or just stick with SVN.

But, after the smoke cleared and the fires died down, I discovered that one DVCS, Mercurial (Hg), was left standing on equal ground with SVN in the "has to not irritate me" department.

The problem? Conversion from SVN to Hg isn't as straightforward as one would like. Thus, I'm documenting the steps I had to do to try and help out anyone else who's attempting to go down this path.

For what it's worth, I don't plan on discussing what pros and cons are involved with each of the DVCSes here. At the end of the day they each have comparable feature sets and functionality, and any choice as to which DVCS a person will use will likely be a very personal one (or at least one dictated by someone in charge :-). Thus, I am not going to argue the benefits of Hg over any of the others, or even over SVN. I'm merely going to show how you can convert your existing SVN repos into Hg repos, as well as set up Hg to allow for easy SVN-like pushes/pulls on your server.

System Information

I should mention what I have been running, as well as what I will be running. I do this only because I know there's a myriad of ways to set up SVN and Hg, and unless you're doing what I'm doing, my notes won't help you much.

Traditionally, I ran SVN using WebDAV in Apache2.x. I wanted to continue to run Hg using Apache2.x (as this server has other needs for Apache2.x), but I no longer needed WebDAV for Hg. I'm also running Debian (with a mix of packages from stable, testing, and unstable).

Every tool that I mention in this guide can currently be found in Debian. Their package names are:

  • mercurial
  • hgsvn
  • apache2
  • python-setuptools

Naturally, you can get these things up and running in other *nixes, but I'll leave that up to you to figure out if you decide to follow my guide.
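If you want a quick sanity check that the tools from the list above are actually installed before you start, something like the following Python sketch could do it. Note that the executable names are my assumptions about what each package provides (hg for mercurial, hgimportsvn for hgsvn, apache2 for apache2), and the helper name missing_tools is made up for this example. Also keep in mind that on Debian the apache2 binary usually lives in /usr/sbin, which may not be on a regular user's PATH.

```python
import shutil

# Package name -> executable we expect it to provide.
# These executable names are assumptions, not taken from the packages' docs.
REQUIRED_TOOLS = {
    "mercurial": "hg",
    "hgsvn": "hgimportsvn",
    "apache2": "apache2",
}


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the sorted package names whose executable is not found on PATH."""
    return sorted(pkg for pkg, exe in tools.items() if which(exe) is None)


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Possibly missing packages:", ", ".join(missing))
    else:
        print("All expected tools were found on PATH.")
```

The lookup function is passed in as a parameter only so the check can be exercised without touching the real PATH. A "missing" result means the executable wasn't found where the shell looks, not necessarily that the package isn't installed.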

Two questions

What are the issues you had with other distributed SCMs that made them impractical for your purposes? In other words, I'm curious what makes Mercurial so special from your point of view. Also, have you found a tool that allows both pulling and pushing between Subversion and Mercurial, something like git-svn? As far as I can see, the hgsvn you described supports only pull.

Dropped SVN

For me, pushing back to an SVN repo wasn't important... in fact, I didn't want it. Honestly, the git-svn you mention does some seriously scary black-magic-foo that I'd be too worried to use generally :-) The post was titled "Converting from [SVN] to [Hg]", so I wanted it to be a one-way trip.

If you really want to know why I picked Hg over git, bzr, et al, I have a link that was in the original post (see above). But, some quick bullet points from the thought process are as follows:

BZR

  • Pros:
    • Ease of use (very similar to SVN syntax)
    • Widely used
    • Python (so I can easily hack on it, if I need to)
  • Cons:
    • Dog slow. There were some improvements early in 0.9.x, but it's still profoundly slow to use.
    • Poor integration with web-servers
    • Client/server authentication schemes ridiculously complex

git

  • Pros:
    • Very fast
    • Many crazy/cool functions (rebase is a good example)
  • Cons:
    • Bizarre and counterintuitive interface
    • The crazy/cool functions seem to be very dangerous (rebase is still a good example :-)
    • Elitist development community
    • Client/server authentication schemes ridiculously complex, and community hostile to them ("Why would you want to do something like that? You suck, you're doing it wrong" isn't a great answer when you want to set up a central repository where all changesets will eventually end up)
    • Convoluted and difficult to integrate with a web server

Hg

  • Pros:
    • Blindingly fast
    • Easy to understand interface
    • Integrated web server
    • Easily integrated with external web servers and authentication systems
    • Friendly and active development community
    • Python based
  • Cons:
    • Binary repository chunks highly dependent on the tool not breaking down (git, bzr, svn all have the same problem, but I didn't mention it above :-)
    • Lacks some of the "crazy/cool" functions of git (but, again, they seem dangerous, so maybe that's a good thing :-)