Mercurial wins me over

6 11 2007

I’m a source-control kind of guy. Anyone that knows me would assume that I’d always insist on a source-control tool of some kind, even for my own “solo” work.

But they’d be wrong – I’ve only just found one I’m happy with, and in the meantime I’ve gone several years without any source-control tool. And frankly, I’ve always been a bit perplexed at how everyone else seems to get along with these tools.

Sure, in the past I’ve worked on teams using PVCS or ClearCase, and before that PANVALET on mainframes (and some other mainframe tool whose name I can’t even remember). I’ve had the odd encounter with CVS, Subversion and Perforce. And when I started setting up my own development environment a few years back, source-control was one of the first things I looked at (together with overall directory structures, backup, and security).

But at that time I wasn’t happy with any of the tools I found. Everyone else seemed to be using CVS, but the more I learnt about it the more of a ridiculous nightmare it seemed. I looked at Subversion and Perforce and a few others, but at the time they all seemed far too awkward, limited and problematic to suit my needs – just far more trouble than they would be worth. The more expensive tools were beyond my budget (and in any case, given past experiences, I kind of expected them to be worse rather than better).

I think at least part of the problem was that these tools tend to address a broad but ill-defined set of loosely-related issues. It’s as if everybody knows what such source-control tools are supposed to do (unfortunately, often based on CVS, which just seems insane), but this isn’t based on any clear definition of exactly what needs such a tool should and shouldn’t be trying to address. Then each specific tool has its own particular flaws in conception, architecture and implementation. Throw non-standard services, storage mechanisms and networking protocols into the mix, and you end up having to deal with a huge pile of complications and restrictions just to get one or two key benefits.

As an aside, the Google “Tech Talk” video Linus Torvalds on git has plenty of scathing comments about these traditional source-control tools and why they aren’t the answer. If you want some more examples of people who aren’t enjoying their source-control tools, there are also some great comments on the “Coding Horror” article Software Branching and Parallel Universes.

In the end, it looked both simpler and safer for me to live without a source-control tool. That’s heresy in civilized software engineering circles, even for a one-man project. But it has worked fine for me up until now.

In the absence of a source-control tool, I’ve maintained separate and complete copies of each version of each project, and done any merging of code between them manually (or at least, using separate tools). This loses out on the locking, merging and history tracking/recreation that a source-control tool could provide, but to date that hasn’t been of any consequence (and can partly be addressed by other means, e.g. short-term history tracking by my IDE, use of “diff” tools against old backups etc). In return I’ve not had to deal with any of the overheads, complexity or risks of any of these tools, nor had to fit the rest of my environment and procedures around them.

Don’t get me wrong: on a larger team, or more complex projects, some kind of source-control tool would normally be absolutely essential, however problematic and burdensome. But I am not a larger team, and so far it hasn’t been worth my while to shoulder such burdens.

Anyway, I revisit this subject every now and then, to see if the tools have reached the point where any are good enough to meet my needs (and so that I have a rough idea of what to do if I suddenly do need a source control tool after all).

And this time around, at last, everything seems to have changed…

This time, the world suddenly seems full of “distributed” (or perhaps more accurately, “decentralized”) source-control tools. Despite initially fearing that things had just got a whole lot more complicated, these tools have actually turned out to be exactly what I’ve been looking for all this time.

I’m not going to try and explain distributed source-control tools here, but for some general background, see (for example):

Of the currently-available distributed source-control tools, a quick look round suggested that Mercurial might be best for me, and some brief exploration and experimentation with it completely won me over.

At last, a source-control tool that I’m happy with!

Mercurial gives me precisely the benefits I’m looking for from a source-control tool – in particular, history tracking/recreation and good support for branching and merging. It’s flexible enough to let me add these facilities into my existing development environment and directory structures without otherwise impacting them (even though this isn’t how most teams would normally use it), it doesn’t need any significant administration, and it seems simple and reliable.

In addition, Sun has chosen it for the OpenJDK project (as stated, for example, in Mark Reinhold’s blog), and Mozilla is adopting it too (as described in Version Control System Shootout Redux Redux), so I can feel reasonably confident it’ll be around and supported for a while.

Some of the particular things I like about Mercurial are:

  • It all seems simple and reasonably intuitive, and everything “just works”.
  • Branching and tagging, and more importantly merging, all look relatively simple, safe, and effective.
  • Its overall approach makes it very flexible. I especially like the way the internal Mercurial data is held in a single directory structure in the root of the relevant set of files. This keeps it together with the files themselves, with no separate central repository that everything depends on, whilst also not scattering lots of messy extra directories into the “real” directories. It was easy to see how this could be fitted into my existing directory structures, backup, working practices etc without any significant impact or risk, and without other tools and scripts needing to be aware of it. At the same time I don’t feel it ties me down to any one particular structure, and I can see how it could readily accommodate much larger teams or more complex situations.
  • Although this is entirely subjective, it feels rock solid and safe. Retrieving old versions and moving backwards and forwards between versions works quickly and reliably, with no fuss or bother. The documentation’s coverage of its internal architecture and how this has been designed for safety (e.g. writing is “append only” and carried out in an order that ensures “atomic” operation, use of checksums for integrity checks etc) gives me good confidence that corruptions or irretrievable files should be very rare. For extra safety I can still keep my existing directories in place (holding the current “tip” of each version), so that at worst my existing backup regime still covers them even if anything in Mercurial ever gets corrupted.
  • The documentation provided by the Distributed revision control with Mercurial open-source book seems excellent. I found it clear and readable enough to act as an introduction, but extensive and detailed enough to work as a reference. I spent a couple of hours reading through the whole thing and felt like this had given me a real understanding of Mercurial and covered everything I might need to know.
  • Commits are atomic, and can optionally handle added and deleted files automatically. This means that I can pretty much just carry out the relevant work without regard for Mercurial, then simply commit the whole lot at the end of each task, without having to individually notify Mercurial of each new or deleted file. This removes a lot of the need for integration with IDEs, and a lot of the potential source-control implications of using IDE “refactoring” facilities.
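As a rough illustration of that last point, the end-of-task commit can be driven from a small script. This is only a minimal sketch: it assumes the `hg` executable is on the PATH and relies on the `--addremove` option of `hg commit`, and the helper names `build_commit_command` and `commit_everything` are hypothetical ones made up for this example rather than anything Mercurial provides.

```python
import subprocess


def build_commit_command(message):
    """Return the argument list for a commit that also picks up new and missing files."""
    # --addremove asks Mercurial to mark untracked files as added and
    # missing files as removed before the commit is made.
    return ["hg", "commit", "--addremove", "-m", message]


def commit_everything(repo_dir, message):
    """Run the commit inside repo_dir.

    Raises subprocess.CalledProcessError if hg exits with a non-zero status.
    """
    subprocess.run(build_commit_command(message), cwd=repo_dir, check=True)
```

Calling `commit_everything(".", "Finish current task")` from the root of a working directory would then commit the whole set of changes in one go. Renames are still better recorded explicitly, since this sketch does nothing to detect them.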

Some of these are intrinsic benefits of distributed source control; some are due to Mercurial being a relatively new solution (and able to build on the best of earlier tools whilst avoiding their mistakes and being free of historical baggage); and some are just down to it being well designed and implemented.

For anyone coming from other tools, some conversion/migration tools are listed at Mercurial’s Repository Conversion page, but of course I haven’t tried any of these myself.

The only weaknesses I’ve encountered so far are:

  • Mercurial deals with individual files, and is therefore completely blind to empty directories. The argument seems to be that empty directories aren’t needed and aren’t significant, but I think this is more an artifact of the implementation than anything one would deliberately specify. I don’t think it’s such a tool’s place to decide that empty directories don’t matter. I have directories that exist just to maintain a consistent layout, or as already-named placeholders in readiness for future files. To work around this I’ve had to find all empty directories and give them each a dummy “placeholder” file.
  • Although there’s at least one Eclipse plug-in, at least one NetBeans plug-in, and a TortoiseHg project for an MS-Windows shell extension, these seem to be at a very early stage. I’d expect this situation to improve over time, especially for NetBeans (given Sun’s use of Mercurial for OpenJDK). In the meantime this doesn’t have much impact on my own use of Mercurial, as the command-line commands are simple to use and powerful enough to be practical. During normal day-to-day work, my use of Mercurial has generally been limited to a commit of a complete set of changes when ready, plus explicit “rename”s of files where necessary.
  • On MS Windows you need to obtain a suitable diff/merge tool separately, as this isn’t built into the Mercurial distribution (but the documentation points you at several suitable tools, and shows how to integrate them into Mercurial – and anyway, I’d rather have the choice than be saddled with one I don’t like, or have a half-baked solution as part of the source-control tool itself).
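The empty-directory workaround from the first point above is easy to script. The following is a minimal sketch rather than anything Mercurial itself offers: the function name `add_placeholders` and the file name `.placeholder` are arbitrary choices for illustration, and it assumes that Mercurial's own `.hg` directory should be left untouched.

```python
import os


def add_placeholders(top, placeholder=".placeholder"):
    """Create an empty placeholder file in every empty directory under top.

    Returns the list of placeholder paths that were created.
    """
    created = []
    for root, dirs, files in os.walk(top):
        # Decide emptiness before pruning, so that a directory holding only
        # ".hg" is not mistaken for an empty one.
        is_empty = not dirs and not files
        # Do not descend into Mercurial's own metadata directory.
        if ".hg" in dirs:
            dirs.remove(".hg")
        if is_empty:
            path = os.path.join(root, placeholder)
            # Opening for writing and closing immediately leaves an empty file.
            with open(path, "w"):
                pass
            created.append(path)
    return created
```

Running `add_placeholders(".")` from the root of a project before the first commit gives every empty directory a file that Mercurial can track.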

I’ve now been using Mercurial for a couple of months. Despite my general dislike of all the source-control tools I’d looked at beforehand, I have been very pleased with Mercurial.

If you’re looking for a new source control tool, or have always disliked tools such as CVS, Subversion and Perforce, I’d certainly recommend Mercurial as worth taking a look at.


19 responses

6 11 2007
the j-dog

Can we pleeeeeaaaase stop using those stupid website preview things when we hover over links? Don’t you guys have any idea how annoying those things are? I occasionally accidentally hover over one when I’m scrolling with the mouse wheel and a preview of some random website will jump out at me. Sometimes I like to “feel” the text while I read it with the mouse cursor, and some random website will jump out at me. Sometimes I like to hover over a link, just so I can get an idea where the link goes from the address, and BAM. This stuff is a blight on the web. It’s not quite as bad as the pages that look up words I double click in the dictionary, but still very very very very very very very very very very very very very very very very annoying.

6 11 2007
closingbraces

OK, point taken. I’ve tried it, some like it, some hate it, you hate it – I’ve removed it.

6 11 2007
closingbraces

Over on programming.reddit.com, user “derekslager” has pointed out that Mercurial does have a “batteries included” Windows distribution that includes a preconfigured kdiff3 / hgmerge.

6 11 2007
Chui Tey

I’m a happy Hg user too. I like how it doesn’t spray .svn directories in each subdirectory, making grep difficult.

7 11 2007
Manuzhai

See, the thing with the empty directories is that it massively simplifies the internal model of the tree: instead of file nodes and directory nodes, there are now only file nodes. This means some optimizations can be done that would otherwise be impossible. Seen in that light, I think it’s a good trade-off. Having a few .keep files around isn’t too much of a price to pay.

7 11 2007
rif

I still can’t find solid criteria for choosing between bzr and hg. Each has some nice features that the other lacks.

7 11 2007
Brian Silberbauer

Manuzhai, I disagree with you completely :)

I want my version control tool to work how I intuitively expect it to work and not have to find work-rounds like creating files in empty directories..

That said, I have been using hg for the last month on single developer projects and am quite happy. I haven’t had the need to use branching yet, or maybe I’ve had too much CVS branching history to attempt it..

Kudos to Mike for handling the link-preview thing in the sanest way I’ve seen on the net.

7 11 2007
Soren

I’m still waiting for one of the distributed scm’s to get decent tool support, eg. Eclipse: CVS full support, SVN full support. Hg alpha depending on hg binary.

7 11 2007
rif

If the support provided by either of the two svn plugins for eclipse is full support, then I don’t want it :)

8 11 2007
Adam

I too am now using Mercurial and I really do like it. I have one friend who is trying to tempt me to go the way of git. Have you checked out git at all, and do you have anything to add on that front?

9 11 2007
closingbraces

Adam,

I looked at Git briefly, but was put off by: impression that it’s perhaps primarily a toolkit for higher-level tools and front-ends; doubts over using it on MS Windows (I use both Windows and Linux); and concerns over space requirements and possible need for regular “housekeeping”.

I might well be doing it an injustice, but my gut reaction was that it wasn’t going to offer anything decisive over Mercurial and/or Bazaar, and it was firing enough alarm bells to put me off spending time looking at it further.

I guess my main decision was that the “distributed” approach is what I’ve been looking for and already has tools that are good enough for serious use. Choosing between those tools seemed a much less critical issue. I suspect there’s little to choose between Git, Mercurial and Bazaar (pros/cons to each, balance likely to shift over time anyway).

Mercurial just “felt” most right to me (very subjective…), and it does meet my requirements very well, so I was quite happy to plump for it rather than spend more time choosing between them.

30 01 2008
Anonymous

Interesting article… take a look at WANdisco (CVS / SVN MultiSite – Active / Active Replication).

21 02 2008
Suraj

I tried Mercurial and Bazaar for doing a big merge (1500+ files) from PVCS repo on Windows. I found Bazaar more user friendly for merges than Mercurial (at least on windows). Other than that both are similar.
Mercurial insisted that I should perform the merge interactively whereas all I wanted was a report of conflicting files. Bazaar went ahead, performed the merge and gave me a list of files which had the merge markers in them.
I wish I could have used Mercurial. With Bazaar I was very impressed with the merge.
Other than that, you need a C compiler (either MSVC 2003 or MinGW) to install Mercurial from source, whereas for Bazaar all dependencies are available as handy Windows installers.

15 04 2008
Arne Babenhauserheide

I’ve been searching now and again for information about the different DVCS, and some time ago I found a nice side-by-side comparison of Mercurial and Bazaar:

http://sayspy.blogspot.com/2006/11/bazaar-vs-mercurial-unscientific.html

It confirmed my gut feeling to choose Mercurial (almost a year ago, now).

And Mercurial also has a Windows installer:
http://www.selenic.com/mercurial/wiki/index.cgi/BinaryPackages
-> http://mercurial.berkwood.com/

Best wishes, and thanks for the nice article!

28 07 2008
Pocho

If you want to automatically create keep files for empty directories, here is a Python script that’ll do just that.


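import os

# Walk the project tree and drop an empty "keep" file into every empty directory.
for root, dirs, files in os.walk('C:/proyects/myproject'):
    print(root, "... ", end="")
    if not dirs and not files:
        # Create the empty placeholder file and close it straight away.
        with open(os.path.join(root, "keep"), 'w'):
            pass
        print("creating keep")
    else:
        print()

# Windows only: keep the console window open until a key is pressed.
os.system('pause')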

28 07 2008
closingbraces

Thanks Pocho.

9 08 2009
Jonny Dee

Hi, if you want an open source tool that manages the creation and deletion of placeholder files automatically just have a look at http://freshmeat.net/projects/markemptydirs.

9 08 2009
closingbraces

Jonny, thanks for pointing out your markemptydirs tool. Looks like a nice simple answer for automatically creating and removing placeholder files. It doesn’t suit me personally at the moment due to being .Net or needing Mono (not worth introducing these into my build process and then administer/maintain/patch etc for something as minor as this), but handy to know about, and I can see it being a neat little answer for anyone who wants to automate this.

13 02 2010
sPOiDar

Unix is cool:

find . -type d -empty | xargs -I{} touch "{}"/.keep
