Revision Control and Unit Tests Considered Harmful

I know that a lot of you visit my blog because I am the Internet’s apparent authority on how much git sucks, so I’m sure you’ll appreciate what I have to say here. Revision Control systems, and test-driven development — in particular, unit testing — are harmful to computer programming.

Consider that little app or script you wrote a few weeks ago to solve some immediate problem, or to satisfy some immediate whim. It was pretty easy: you opened your favorite text editor with a clean, inviting blank page and a flashing cursor, and started typing with no real thought about where was best to start.

If you were writing in a real programming language you probably compiled and ran a pretty early version of some piece of code, with a frivolous main() function to call into it, and you iterated pretty quickly adding functionality as you went. You figured out problems using the debugger, hooking into the code you needed to debug directly so you didn’t have to go through the whole thing just to get to the bug.

If instead you wrote in something like Python or Ruby, you probably fired up the interactive interpreter in another window and, as well as running your script from time to time, copied and pasted stuff into the interpreter to figure things out as you went. I’m terrible for always checking my string chopping code in the Python interpreter as I write it. You don’t have a real debugger, but it helps.

Anyway my point is that you quickly got started, and probably got something working pretty well in a relatively short amount of time and you had fun doing it. Nothing seemed too much of a road-block.

Now consider the last time you wrote some “serious” code “properly”, maybe for work or last time you attended a lecture on the benefits of test-driven development or when you were working on a project that had to go into revision control.

So okay, to get started first you need to make the project directory and run git init.

Well, that’s our first problem. We’ve got to make the project directory, damnit, what is this going to be called? It needs a name! We can’t just call it sunday-play-lens because this is going in revision control, and if I call it that now it’ll forever show up in my revision control history.

A few days later, you’ve got a name, and can finally make the directory. Hurrah! Now we run git init and we can get started. We pull up an editor — hopefully it’s still your favorite, and that your employers don’t force you to use Eclipse or something — and there’s that fresh blank inviting page again.

So where do we start?

Oh god, this is going in revision control. After I’ve finished this, I’m going to have to type git commit. That’s going to be permanent. If I screw up now, my screw-up is forever going to be visible to the world in my revision control history.

Ok, well there is something I can get started on; I know I need a priority queue class for this code, and that’s easy enough to get going with. I even know it’ll be called NIHPriorityQueue so I don’t have to worry about that, and can even unit test it. It’ll get committed as a nice unit: “NIHPriorityQueue – implement priority queue”, goes the commit log.

Phew, thought we weren’t going to make it there, but now the project is definitely started. I have a priority queue, and I have some test cases.

What do I work on next? I don’t dare go near the main() routine, I’ll have to keep changing it in the commit log – and how do you unit test it? I know this code will need to grab some data from a RESTful API and parse it, so I guess I’ll start with that.

Ugh, I don’t really know where to begin; I guess I’ll have to just start with an asynchronous network library and a main loop to integrate it with. If I were doing this as a hobby I’d've just hacked it synchronously, but this isn’t a hobby, this is serious – people are going to read my commits and shake their heads if I do that.

And so on.

The only thing that changed is you added the requirement that it be written “properly” and now the code that would’ve taken you an afternoon to write is taking you months and you’re getting nowhere with it.

Why? Because you can’t play anymore.

The need to commit and test your work as you go means you feel like you have to get it right each and every time; it’s going down in stone, and that means it has to be perfect. It’s even worse if you have to get code review as well.

Without the ability to play around, mess about with the code without consequences and privately on your own computer, you can’t be truly creative with it; and if you’re not being creative, it isn’t fun!

Now, that said, if you try and inflict on the world another untested heap of messy code without any trace of tests and with no clear revision history, then you’re going to the special hell.

20 thoughts on “Revision Control and Unit Tests Considered Harmful”

  1. Thom May

    I’m glad you rather rescued yourself with the last paragraph otherwise the whole piece would be nothing but flamebait. :)
    That said, I agree with you that the ability to play is important – so the concept of throwing the first iteration away is important. THROW IT AWAY.
    Figure out what you want to write, how it should look, and then bin it. Now you know what you’re calling it (kinda – but git doesn’t care anyway; it only cares what your *remote* is, for the sake of others using it), how it should be structured, and more-or-less how you want to approach it. Now write it properly!

  2. David

    I don’t know; you could still have fun. Just commit to a “clowning_around” branch that you only merge squash into the “serious” branch when you’re ready to share.

    1. scott Post author

      I think you’re missing the point – this is about starting a new project, long before you push a branch, it doesn’t matter whether it’s called “master” or “clowning_around” – the very fact you’re going to commit changes as you go changes your mental attitude to it. Why commit at all? Why not just clown around without it and free your mind?

      1. Decklin Foster

        Because rewriting commits is easy and cheap? You might as well say that deciding to ever hit “save” in your editor is going to freeze you into some sort of indecision because your mistakes will be permanently recorded on disk.

        1. 16aR

          Agree.
          And if ever your first commit sucks, rewrite history from the beginning and that’s it. Delete your old public repo named sunday_rainbow_land and put the new one in my_name_is_serious.

          That is the whole point of source control for me. You don’t spend too much time finding names. You refactor afterwards and you commit, and it is done. What’s the problem with that? Except that it’s not really well seen to name classes or functions myCoworkersAreJerks; you will still have to avoid that. Hard.

        2. Nix

          Quite. For those of us who use ‘git wip’, *every save* is a commit (in a normally-invisible branch so as not to mess up the history with commits with nearly-random content and no useful log). You know how much this dissuades me from saving? Not at all, that’s how much.

          And as for worrying about the name you give the project at the start — branch names are not baked into git history the way they are baked into Mercurial history, so you can use one of the Seven Deadly Words and nobody will ever know or care unless you choose to keep that name when pushing to a public repo somewhere. And, as others have said, history is rewritable (though rewriting history from the very first commit is actually a little tricky, it can be done and it’s one google away to find out how to do it).

  3. Janne

    Agree on testing for quick hacks and playing around. Don’t agree on VCS.

    You got a name for your script? Use that. For playing around it’s not as if you need to push the repo anywhere; make a directory with the name of your script, git init and just keep everything in there.

    To me, git helps with playing around, rather than hinders it. Branches are so easy to make and remove that they make it trivial to try half a dozen things, compare them and keep what works. I’ve had branches literally named things like “Argh” and “notheotherone” (‘no, the other one’ – I vaguely remember it was about what parameter to optimize). Doesn’t matter as none of it will be seen in public.

    Without easy branches I typically resorted to conditionals, preprocessor directives and copies of my code with similar (but confusing) names, all in an effort – in retrospect – to clumsily replicate real versioned branches.

    Skip testing. But embrace versioning to make hacking speedier and easier, not more difficult or slow.

  4. txwikinger

    Writing tests or using revision systems does not stop anybody from being creative or from experimenting. In fact, both encourage and facilitate more creativity and experimentation.

    With the revision system, you never have to be afraid of losing the solution that did work, because it was committed. So there is no need to comment out working code in case it is needed again later. No fear of making things worse; it can always be reset again.

    Tests help even more. With the bisect command in git and unit tests, it is very easy to find the moment when something stopped working. Hence it is possible to experiment without the consequence of breaking stuff. The working solution can always be found very easily. Testing does not mean everything must always work. Testing means it is easy to see if things work or not.

    It is not unit testing or the revision system that stops creativity and the courage to experiment, but solely the mindset of the person being fearful of it. In fact, testing and revision systems are capable of removing those fears if the mind is willing to let go of them!
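
The bisect idea mentioned in the comment above can be sketched in a few lines. This is only an illustration of the underlying binary search, not how git implements it (git bisect works on a commit graph, not a flat list). The names `first_bad_revision`, `run_tests` and the revision labels are made up for the example, and it assumes a linear history where the first revision passes, the last fails, and a failure persists once introduced.

```python
from typing import Callable, List, Sequence


def first_bad_revision(revisions: Sequence[str], passes: Callable[[str], bool]) -> str:
    """Return the earliest revision for which `passes` is False.

    Assumes revisions are ordered oldest to newest, the first one passes,
    the last one fails, and every revision after the first failure fails too.
    """
    good, bad = 0, len(revisions) - 1  # indices of a known-good and a known-bad revision
    while bad - good > 1:
        mid = (good + bad) // 2
        if passes(revisions[mid]):
            good = mid  # the breakage happened later
        else:
            bad = mid  # the breakage happened here or earlier
    return revisions[bad]


# Hypothetical linear history in which the bug was introduced in "r6".
history = [f"r{i}" for i in range(1, 11)]
checked: List[str] = []


def run_tests(revision: str) -> bool:
    """Stand-in for running the unit tests against one revision."""
    checked.append(revision)
    return int(revision[1:]) < 6


culprit = first_bad_revision(history, run_tests)
print(culprit, "found after", len(checked), "test runs")
```

The point of the sketch is that the number of test runs grows with the logarithm of the history length, which is why a messy experimental history is cheap to search as long as the tests can say "works" or "doesn't".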

  5. Dylan McCall

    The name thing really does get me all the time. It’s a huge relief to know I’m not the only one! :)

    (Of course, two of my favourite commands ever are bzr uncommit and bzr push --overwrite, even though they’re both unbelievably evil things to do.)

    So what we need is a tool that automatically generates filenames for early work. If you’re undecided you can run the tool and have your branch named bankrupt-banana-quota. Then we’re all on a level playing field, and you don’t need to worry about the name looking particularly silly.
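
A tool like the one wished for above is easy to sketch. The word lists and the function name `placeholder_name` below are invented for illustration; only the example name bankrupt-banana-quota comes from the comment, and any vocabulary would do.

```python
import random
from typing import Optional

# Hypothetical word lists; swap in any vocabulary you like.
ADJECTIVES = ["bankrupt", "sleepy", "gleeful", "rusty", "wobbly"]
NOUNS = ["banana", "lantern", "walrus", "teapot", "comet"]
SUFFIXES = ["quota", "parade", "engine", "sketch", "riddle"]


def placeholder_name(rng: Optional[random.Random] = None) -> str:
    """Return a throwaway three-word name such as 'bankrupt-banana-quota'."""
    rng = rng or random.Random()
    return "-".join(rng.choice(words) for words in (ADJECTIVES, NOUNS, SUFFIXES))


print(placeholder_name())
```

Passing a seeded `random.Random` makes the output repeatable, which is handy if the name needs to be regenerated in a script.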

  6. Grzegorz Gałęzowski

    Wow, this post is pretty amazing, since it describes what I usually _don’t_ do with TDD and version control. My two main points of amazement are:

    1. I don’t ever have this kind of attitude “I have to make it perfect” when I code or commit to VCS.
    2. TDD IS a way to play around. Unit Testing for testing sense is a waste of time IMHO (and it’s a demotivating waste of time).

    I think what you’re criticizing in this post are not techniques, paradigms and tools but the attitude that you talk about. Thinking about putting anything into VCS or doing TDD when one doesn’t even know what the top directory is going to be named (i.e. one doesn’t know what they write) IS the wrong thing to do. Thinking “I’m going to make public I-don’t-know-what-yet” IS the wrong thing to do.

    P.S. By the way, about the synchronous-asynchronous approach in network communication – TDD encourages deferring such decisions. I know this because many times I went with TDD and got to a point where I encapsulated the synchronous vs asynchronous approach and first implemented it as synchronous (because why not? If I don’t have a reason to complicate things, why would I?). Then, if performance tests or some other kind of testing later proved this was not scalable or something, I’d just pull out that brick and put in another (asynchronous) one, which turned out to be really simple.
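
The "pull out a brick and put in another" approach from the P.S. above might look roughly like this. It is a sketch under assumptions, not the commenter's code: the `Fetcher` interface, both implementations, and the fake loaders are hypothetical, and no real network library is involved.

```python
import asyncio
from typing import Awaitable, Callable, List, Protocol, Sequence


class Fetcher(Protocol):
    """Hypothetical seam: callers only know that fetch_all() returns text."""

    def fetch_all(self, resources: Sequence[str]) -> List[str]: ...


class SequentialFetcher:
    """First brick: the simplest thing that works, one blocking call at a time."""

    def __init__(self, load: Callable[[str], str]) -> None:
        self._load = load

    def fetch_all(self, resources: Sequence[str]) -> List[str]:
        return [self._load(resource) for resource in resources]


class ConcurrentFetcher:
    """Replacement brick: same interface, but the loads run concurrently."""

    def __init__(self, load_async: Callable[[str], Awaitable[str]]) -> None:
        self._load_async = load_async

    def fetch_all(self, resources: Sequence[str]) -> List[str]:
        async def gather_all() -> List[str]:
            # gather() returns results in the same order as its arguments.
            return list(await asyncio.gather(*(self._load_async(r) for r in resources)))

        return asyncio.run(gather_all())


def summarize(fetcher: Fetcher, resources: Sequence[str]) -> str:
    """Calling code: unaware of which brick is plugged in."""
    return ", ".join(fetcher.fetch_all(resources))


def fake_load(resource: str) -> str:
    """Stand-in for a blocking network call."""
    return resource.upper()


async def fake_load_async(resource: str) -> str:
    """Stand-in for a non-blocking network call."""
    await asyncio.sleep(0)
    return resource.upper()


print(summarize(SequentialFetcher(fake_load), ["a", "b"]))
print(summarize(ConcurrentFetcher(fake_load_async), ["a", "b"]))
```

Because `summarize` depends only on the interface, the tests written against the sequential version keep working when the concurrent one is swapped in.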

  7. Pingback: Scott Remnant enumera i limiti dei DVCS considerando, ad esempio, Git | Indipedia – Indipendenti nella rete

  8. Mackenzie

    Shepherd Book’s “special hell”? The one for child molesters and people who talk at the theatre?

  9. Anonymous Coward

    You just made the perfect case for git which allows one to rewrite history.

  10. Steven Wittens

    Couldn’t disagree more, and I’ve been in your exact situation for the past month: writing something entirely new, being unsure of how it’s supposed to be structured, and figuring things out as I go along. I use branches to fuck around, am not afraid to break anything, and commit liberally. Often I throw stuff away, never having pushed it to github. But just as often I don’t.

    You know what I realized? Nobody ever looks at revision histories long gone. Revision control is about the present, not the past. It’s about how what you have today relates to what came before… not about what commit #1 and commit #2 looked like. And if someone looked anyway, and tried to shame you for your ‘shitty’ experimentation and early gaffes… who cares? The only programmer who writes perfect code on the first go is a programmer who is rewriting something he did before already, without any requirements changes. And in my experience, that never happens.

  11. Otus

    When using git I don’t care about the project directory name at all. I can always rename it and the old one won’t show up anywhere (AFAIK).

    In the initial phase of a project, I’m mostly prototyping. I do frequent commits, but once I have something that actually works I rewrite the history into a single commit. That prototyping history can be left in another branch for some time, but before a project is published to someone else rebasing and rewriting commits is not a problem.

  12. Horst H. von Brand

    You might hate git’s guts, but one of its principal selling points for me is that it encourages local experimentation (in an unpublished repo, or on an unpublished branch). Plus it has tools (granted, they aren’t exactly easy to find or intuitive to use) to rewrite your private history to your heart’s content before publishing anything. To be able to experiment, and save partially working stuff away to go chase some pretty butterfly, is liberating. Not at all the straitjacket you describe.
    Use your tools, don’t let the tools take over your life.
