Converting The Repos

Converting single/multiple SVN repos to Hg repos

This was perhaps the trickiest part of the process for me, because there's a plethora of tools for doing SVN to Hg conversions, but most of them don't seem to work well. I first tried yasvn2hg, but I couldn't get the damned script to even run. Next I tried Tailor, which promised to be the Swiss Army Knife of repo conversion utilities; however, I had hours of headache and no progress using it. Finally, I tried hgsvn, and it worked like a charm.

hgsvn is apt-gettable in Debian. However, hgsvn needs functionality from python-setuptools, and the hgsvn package does not declare it as a dependency. This means that, unless you already have python-setuptools installed for something else, chances are you will see this error when you install hgsvn and try to run it:

$ hgimportsvn http://url.to.repo/repo
Traceback (most recent call last):
  File "/usr/bin/hgimportsvn", line 5, in <module>
    from pkg_resources import load_entry_point
ImportError: No module named pkg_resources

If you get this error, simply install the python-setuptools package (or equivalent) and try again.
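
If you'd rather find out before running hgimportsvn, you can try the import yourself. The following is only a sketch: the function name have_py_module and the PYTHON variable are made up for illustration, and it assumes the interpreter hgsvn runs under is the one PYTHON names (plain "python" by default).

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the named Python module can be imported.
# PYTHON is an assumed variable naming the interpreter to test (default: python).
have_py_module() {
    "${PYTHON:-python}" -c "import $1" >/dev/null 2>&1
}

if have_py_module pkg_resources; then
    echo "pkg_resources found"
else
    echo "pkg_resources missing: install python-setuptools (or equivalent)"
fi
```

Note that the "missing" message will also appear if the interpreter itself cannot be found, so check that first if the result surprises you.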

Once hgsvn and its needed libraries are installed on your system, the basic method to convert a repository is as follows:

$ hgimportsvn http://url.to.repo/repo
...^^^Sets up the import
$ cd repo
...^^^Changes to the freshly created subdir
$ hgpullsvn
...^^^Pulls down all the changes from svn and creates an hg history
$ hg update
...^^^Optional: brings the working copy up to the latest revision

Once you've done these steps, your repo will have been converted to Hg. This works well for single repositories, but what if you have something more complicated?

Splitting a single repo into multiple repos

If you're like me, when you originally set up SVN you did so in the laziest way possible.

Setting up SVN repos is more work than it should be. It involves using commands that you normally never have to touch (svnadmin), setting up new entries for those repos in your http server's configuration files (if you're using Apache and WebDAV), and setting up user permissions for those repos. Thus, the lazy way to set them up is to make one central SVN repo under which you have multiple sub-repos. This has the advantage of making your repository very easy to maintain. However, it has a big disadvantage: a user with write access to any sub-repo will have write access to the entire repo.

In Hg, on the other hand, setting up a new repository is much easier, and maintaining multiple repositories more manageable. So, if you're like me, you may be tempted to remedy past sins by splitting your single gargantuan SVN repo into smaller Hg repos. Thankfully, hgsvn makes this very easy.

Let's say that you have one core SVN repo, called "main", which has the following sub-directories that you are treating as sub-repos:

main/
  projecta/
  projectb/
  projectc/

hgsvn can actually handle sub-directories of SVN repos and generate histories of just those sub-directories, effectively splitting the directories into repos of their own. It will even record only the changes that affect the individual sub-repo (meaning changes to the parent or to neighboring directories don't get entered, unless they were committed together with changes to that sub-directory in the original SVN).

A method for splitting the above could be:

$ hgimportsvn http://url.to.repo/main/projecta/
...^^^Start with "projecta/"
$ cd projecta
$ hgpullsvn
...^^^Pull the history for "projecta/"
$ hg update
...
$ cd ..
$ hgimportsvn http://url.to.repo/main/projectb/
...^^^Move on to "projectb/"
$ cd projectb
$ hgpullsvn
...^^^Pull the history for "projectb/"
$ hg update
...
etc.
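
Since the steps are the same for every sub-directory, the sequence above can be sketched as a loop. Treat this as a rough sketch rather than a finished tool: the names run, convert_repo, convert_all, DRY_RUN and BASE_URL are all invented for illustration, and it assumes hgimportsvn names the new directory after the last component of the URL, as it does in the commands above.

```shell
#!/bin/sh
# Assumed base URL of the "main" SVN repo; override it from the environment.
BASE_URL=${BASE_URL:-http://url.to.repo/main}

# Hypothetical helper: echo the command when DRY_RUN=1, otherwise run it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper: convert one SVN URL into an Hg repo created as a
# subdirectory of the current directory.
convert_repo() {
    url=${1%/}          # drop a trailing slash, if any
    name=${url##*/}     # last path component, e.g. "projecta"
    run hgimportsvn "$url" || return 1
    (
        # The subshell keeps the cd from leaking into the caller.
        run cd "$name" || exit 1
        run hgpullsvn || exit 1
        run hg update
    )
}

# Hypothetical helper: convert each named sub-directory of BASE_URL in turn,
# stopping at the first failure.
convert_all() {
    for project in "$@"; do
        convert_repo "$BASE_URL/$project" || return 1
    done
}
```

For example, running "DRY_RUN=1 convert_all projecta projectb projectc" only prints the commands it would issue; drop DRY_RUN=1 to actually perform the conversions.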

Cleaning up the SVN cruft

When you're done using the hgimportsvn and hgpullsvn tools, you will have repos in a strange half-SVN/half-Hg form. They will be legitimate Hg repos, but they will still have the .svn directories strewn throughout them, and have some .hgignore files telling Hg to ignore said .svn directories. So, if we're going to go 100% Hg, we may as well get rid of this stuff.

$ cd repo/
...^^^Use whatever the path is to your hgsvn-made repo
$ find . -name .svn -type d -prune -print0 | xargs -0 rm -fr
...^^^Get rid of the .svn/ directories (safe for paths containing spaces)
$ find . -name .hgignore -type f -print0 | xargs -0 rm -f
...^^^Get rid of the .hgignore files
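
Because this cleanup deletes things for good, it may be worth previewing what will go before committing to it. Here is a minimal sketch of that idea; the function name strip_svn_cruft and its "list" mode are invented for illustration and are not part of hgsvn.

```shell
#!/bin/sh
# Hypothetical helper: remove hgsvn leftovers from the repo at $1.
# With a second argument of "list", only print what would be removed.
strip_svn_cruft() {
    repo=$1
    mode=${2:-delete}
    if [ "$mode" = list ]; then
        # -prune stops find from descending into each .svn directory.
        find "$repo" -name .svn -type d -prune -print
        find "$repo" -name .hgignore -type f -print
    else
        find "$repo" -name .svn -type d -prune -exec rm -rf {} +
        find "$repo" -name .hgignore -type f -exec rm -f {} +
    fi
}
```

You might run "strip_svn_cruft repo/ list" first, look over the output, and then run "strip_svn_cruft repo/" once you're happy with it.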

hgsvn now requires python-setuptools

I am just reading your articles and quickly checked on the current Debian repositories. As can be seen on http://packages.debian.org/sid/hgsvn, python-setuptools is now a dependency.
--
Alexander Kriegisch
Certified ScrumMaster

Scrum-Master.de - Agile Project Management
http://scrum-master.de

Hg and empty directories

One thing I should note here, because it tripped me up when I was doing the conversion (though for whatever reason I failed to mention it in this guide): Hg does not track directories, only files.

This means that if your SVN repo had empty directories (which it probably will), those empty directories won't show up in the converted Hg repo.

You can read more about this here.

Me, I had every SVN repo laid out like the following:

/trunk
/tags
/branches

This is, of course, typical SVN good practice.

However, many times I didn't have anything in the "branches" or "tags" directories. Whenever those directories were empty, they just wouldn't show up in my final Hg repo.
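
If you want to know ahead of time which directories will go missing, you can list the empty ones in a clean copy of the tree (for example, one produced by "svn export", which leaves out the .svn directories). This is just a sketch: the function name list_empty_dirs is invented for illustration, and it assumes a find that supports -empty, which GNU and BSD find do but POSIX does not require.

```shell
#!/bin/sh
# Hypothetical helper: print every directory under $1 that contains nothing
# at all, skipping Hg's own metadata in .hg. Hg will not record any of these.
list_empty_dirs() {
    find "$1" -name .hg -prune -o -type d -empty -print
}
```

Running it as "list_empty_dirs path/to/tree" prints one directory per line.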