All DVCS suck

Sam Hart

2007-05-22 16:28:30

Well, I've been migrating my development server to a new hoster (actually, they aren't new, I've used them for years for my other stuff, and been very pleased with them). In the process of the move, I've been cleaning things up and re-engineering things somewhat to solve some of the problems I've had traditionally.

One thing that keeps coming up is whether I should continue to use Subversion for my online code repo. So, I've been looking at alternatives, especially DVCSes, trying to see whether I would get any real benefit from them, or just be more burdened.

What's my conclusion? My conclusion is that they all suck, and maybe I just need to stick with SVN. Read on for the details...

SVN's elegance

First of all, let me tell you why I've been using SVN all these years. There are many reasons, but the key ones are as follows:

Simple repository maintenance: I know that the brass ring of DVCS is the whole "distributed" model, where you have no central canonical repo, but you know what? That's stupid. If you're ever in a situation where you don't have a central canonical repo you've done something wrong. Period. In the vast majority of projects you'll ever work on, there will always be some central place where the myriad developers working on the project will eventually need to publish their changes. And for repository creation and maintenance, SVN is supremely simple to use.

Easy, low-ceremony commands: SVN is insanely easy to use. The commands are all very well documented, it has a helpful command-line help system, and there's no mystical magic needed in order to use it effectively.

Very easy to add new contributors: This is perhaps SVN's biggest strength. By leveraging existing technologies like Apache, giving someone write access to the repo is amazingly simple. You can give oodles of people write access to your SVN repo by simply setting them up with an entry in an htpasswd file. This is great when you, you know, don't want to fucking have to give every contributor shell access on the machine that hosts the repo.
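To show how little there is to an htpasswd entry, here's a rough Python sketch that builds one line in Apache's {SHA} format (the format `htpasswd -s` writes). The function name and the "alice" user are made up for illustration. Unsalted SHA-1 is a weak way to store passwords, so treat this as a picture of the file format and use the real htpasswd tool for anything that matters.

```python
import base64
import hashlib


def htpasswd_sha1_line(user: str, password: str) -> str:
    """Build one htpasswd line in Apache's {SHA} format.

    Hypothetical helper for illustration only: the entry is
    "user:{SHA}" followed by the base64 of the raw SHA-1 digest.
    """
    if ":" in user:
        # The colon separates user from hash, so it can't appear in the name.
        raise ValueError("htpasswd user names cannot contain ':'")
    digest = hashlib.sha1(password.encode("utf-8")).digest()
    encoded = base64.b64encode(digest).decode("ascii")
    return f"{user}:{{SHA}}{encoded}"


if __name__ == "__main__":
    # "alice" is a made-up contributor; append the printed line to the
    # htpasswd file your Apache config points at.
    print(htpasswd_sha1_line("alice", "password"))
```

One line per contributor in a flat file, and nobody needs a shell account.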

The need for DVCS

Where SVN starts not working well is when you start needing more of a distributed model than SVN can allow for. For example, when you need to have multiple repos that aren't canonical, but rather will eventually feed the central, canonical repo. Or when you want to branch, merge, etc. on a local repo when you're offline.

Because of the various hyped up concepts of DVCS, it starts becoming more and more appealing when you're looking for something more than what SVN can offer.

The question then becomes which DVCS to use. That's where the complication comes in, and that's where you can easily find a myriad of irritations.

So let's look at some of the DVCS offerings. I won't be showing you what their features are, as they all seem to offer essentially the same things as far as general project management needs are concerned. Instead, I'll point out their irritations.

BTW, I'm doing this largely as a personal exercise to try and help me figure out what I'm going to do: Whether I keep SVN or use one of the alternatives. Even though I may sound very irritated by each of these DVCSes (and actually am very irritated by them) I still haven't decided what I'm going to do. Basically, I want to weigh them all and try to see exactly which will irritate me the least (which tends to be how I decide things anyway since everything irritates me).

bzr

Let's start with Bazaar or bzr. This is the DVCS made and used by Canonical in their development on Ubuntu.

bzr's biggest problems are as follows:

Not easy to grant users write access to a central repo: GAH! What the fuck?! Why do I have to grant everyone shell access to my repo server just so that people can fucking push to the repo? And if I don't grant people access, they have to push bundles to me (or to someone I trust with shell access). Well, I'm sorry, but that's unacceptable. I can't be the bottleneck for some project (and I don't want to have others be the bottleneck either). Sure it's great that I can just "push" bzr repos by copying the files over to a dumb HTTP server, but that's a small boon for such a huge fucking loss.

Confusing web-interface that you can't actually pull from: If you do use the smart server, the web-interface is insanely ass-backwards. Not only that, but you can't fucking pull from it. What the hell?

Complicated push mechanism: Okay, so it's great that I can work locally... that's Jim-Fucking-Dandy. But what about when I need to, you know, actually publish my work somewhere so others can get at it? Well, there's no easy way to do it. Every method winds up using multiple additional layers of complication just to get the data online somewhere. This is horrid and unacceptable.

Fucking cache: Unless you set things up very carefully, any remote repos you push to will wind up being empty with the repo contents in an incomprehensible cache. Now, you can easily get the data back out with bzr, but what if bzr somehow breaks or goes bye-bye? Well then you're fucked. You can't get that data back out easily.

git

git is the thing that the kernel folks use. It honestly has some pretty nifty features, and I probably would lean towards it if it didn't have the following gigantic, hairy, man-titty problems:

Insanely confusing user interface: GAH! I don't need another 9 million fucking commands to learn just in order to use my DVCS! I know they have a central command-line wrapper thing that is supposed to make things easier to use. Unfortunately, it seems to be buggy at best, and broken at worst.

No native web-interface: I need to be able to publish the work I do in a fashion that is browsable from the web, and in this day and age, any DVCS that requires some external program to put its repo on the web feels dated. This makes git feel like a dinosaur compared to SVN, and that's not a good thing.

Buggier than sin: At Progeny we had a distribution development toolkit built around git, and I used that fucker extensively. However, there was at least one problem per week we had to deal with due to bugs in git. Some of these were so bad that we had to lose vast sections of a project's history just to be able to get the current state of the repo back.

No easy way to grant users write access: No surprise here, but git suffers from the same problems as bzr. Granting a user write access to a central git repo means giving them some sort of shell or rsync access to the server the repo runs on. It wasn't acceptable with bzr, and it's even less acceptable with git considering how buggy it is.

Everything is a fucking hash: Just like bzr, remote repos aren't stored in a way you can just browse to on the filesystem. Everything in remote git repos is a fucking hash. If git goes away, you can say goodbye to your data.
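To make "everything is a hash" concrete, here's a minimal Python sketch of how git names a file's contents in its classic SHA-1 loose-object format, as far as I understand it: the object ID is the SHA-1 of a small header plus the content, and the bytes on disk are that same thing run through zlib. The function names are mine, not git's, and this ignores packfiles, trees and commits entirely.

```python
import hashlib
import zlib


def _blob_store(content: bytes) -> bytes:
    # Header is "blob <size in bytes>" followed by a NUL byte, then the content.
    return b"blob " + str(len(content)).encode("ascii") + b"\0" + content


def git_blob_id(content: bytes) -> str:
    """Return the 40-character hex name a SHA-1 git repo gives this content."""
    return hashlib.sha1(_blob_store(content)).hexdigest()


def loose_object_bytes(content: bytes) -> bytes:
    """Return what would sit on disk for this blob: zlib-compressed header + content."""
    return zlib.compress(_blob_store(content))


def read_loose_blob(raw: bytes) -> bytes:
    """Recover the original content from loose-object bytes."""
    store = zlib.decompress(raw)
    header, _, body = store.partition(b"\0")
    kind, size = header.split(b" ")
    if kind != b"blob" or int(size) != len(body):
        raise ValueError("not a well-formed blob object")
    return body


if __name__ == "__main__":
    data = b"int main(void) { return 0; }\n"
    print(git_blob_id(data))
    print(read_loose_blob(loose_object_bytes(data)) == data)
```

So your file named main.c ends up living under a name that's forty hex digits long, which is exactly the sort of thing you can't just browse to on the filesystem.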

Hg

Hg is fucking clever. Hg, Mercury, get it? Get it?! Bah... Unfortunately, all the cleverness for Hg was spent coming up with the fucking name...

Incomprehensible commands: Hg tries to be better at commands than git, and it mostly succeeds. However, by mixing and matching option/verb/action combinations you'll wind up spending a lot of time typing your commands/options in the wrong order and getting errors. GAH! Pick a method and use it consistently!

Holy fuck! More hashes! What the hell is up with hashes these days?! Why can't you just fucking store the current state of the repo in a readable fashion so that I can get at it without using your fucking tool?! GAH!

svk

svk is essentially Subversion with DVCS stuff wedged on top. It uses the SVN filesystem, but then provides the more common DVCS functionalities on top of that. This has a huge benefit of being compatible with SVN clients, but loses some of the functionality you expect from other DVCSes.

So svk problems...

SVK and SVN can confuse each other: The idea of being downwardly compatible with SVN is so spiffy it gives me wood just thinking about it. The problem? It doesn't work. Or rather, it usually works but manages to fail when you need it the most. I've had many cases where SVK confused SVN, and vice versa, resulting in a hosed repo. Not cool.

No svn:externals: Last time I used SVK, it didn't support svn:externals. Considering how useful the svn:externals property is when doing real-DVCS work, lacking it means you have to jump through more hoops just to get something done that stock SVN can do simply.

When SVK breaks, it really breaks: SVK seems to suffer from the git wedge problem in that when things go south, they really go south.

Many choices, none good

I know there are other DVCSes out there, but the ones I looked at all had enough problems that they didn't even make it onto this list of things for me to bitch at. In the end, I really think it will wind up being one of the above for me.

Right now, I'm leaning strongly towards SVN, bzr and hg. git and svk's wedge problems are simply too onerous for me to consider them seriously.

I will say that Hg looks to have the fewest problems, but the fact that it has less-than-stellar support in Debian makes me worried.

Anyway... stay tuned.