Archive for February, 2011

Git Rebasing (cont.)

Saturday, February 26th, 2011

As I said already, git’s interactive rebase tool is seriously useful for preparing a nice, cleanly split up series of patches. And, despite some people’s dire warnings, there’s no reason not to share an in-progress patch series using git so long as you take care to warn others against relying on your tree not rebasing.

Why would a patch series not be complete? One reason might be that a patch introduces a regression. As they say, you often have to break some eggs to make an omelette but, if you value the power of git’s bisection tool, you’ll want each individual patch to be regression free.

Okay, say you’re porting an application from one database framework to another. You might do a bunch of hacking to demonstrate the concept and then send that work out for comment. Only at this point will you go about figuring out polishing the work off and, finally, cleaning the changes up into a nice patch series.

This approach implies that the work will only stop rebasing quite late in the day. Which leaves a problem – how can you possibly collaborate with others if your tree is rebasing? How can you take patches to fix regressions? How can others help you clean up the series?

Here’s one suggestion, based on an approach Stephen Tweedie came up with when we were working together on a series of patches:

  1. Say your branch is called fluffy-piglet. You’ve pushed it out and asked for comments. Don’t rebase this branch again.
  2. Create another branch called fluffy-piglet-rebasing, basing it initially on fluffy-piglet.
  3. Tag both branches with e.g. a -v1 suffix, check that the trees in both tags are identical:
    $> git diff fluffy-piglet fluffy-piglet-rebasing
    $> git show -s --format='%t' fluffy-piglet fluffy-piglet-rebasing
    
  4. Push the rebasing branch and the v1 tags to your repo.
  5. If you wish to rebase and do some cleanup work on the patches, do so and tag and push the result to the fluffy-piglet-rebasing branch in your repo as in (3) and (4), but using a new suffix.
  6. If you receive some patches, pull them into your fluffy-piglet branch, tag the result and rebase the patches onto the rebasing branch e.g.
    $> git tag fluffy-piglet-v3
    $> git rebase --onto fluffy-piglet-rebasing-v2 fluffy-piglet-v2
    $> git tag fluffy-piglet-rebasing-v3
    
  7. If you wish to rebase unto latest upstream, you could first enable git’s “reuse recorded resolution” feature:
    $> git config --global rerere.enabled true
    

    Then you rebase the rebasing branch:

    $> git checkout fluffy-piglet-rebasing
    $> git rebase upstream/master
    $> git tag fluffy-piglet-rebasing-v4
    

    And then, you merge upstream into the non-rebasing branch:

    $> git checkout fluffy-piglet
    $> git merge upstream/master
    $> git tag fluffy-piglet-v4
    

    As in (3), you should be able to verify that the two resulting trees are identical.

    If any conflicts needed to be resolved during rebasing, there’s a good chance that having rerere enabled will mean the conflict will be automatically resolved when merging.

  8. Finally, if anyone wants to help you with any of the series cleanup work, just ‘pass the baton’. You basically say, ‘No more rebasing from me after v4, go ahed’ and the other person can work away on the rebasing branch until they are ready to pass control back again.

This certainly isn’t a straightforward workflow, but it gives you:

  • The ability to work with others since folks have a non-rebasing branch to work against
  • The ability to clean up a series using rebase while still having confidence that nothing is being screwed up because you have the pair of tags with identical tree contents
  • The ability to allow others to help clean up the series too

The fact that this workflow is so awkward has its advantages too – it encourages you to clean up the series early and stop rebasing it. This is not a workflow you’d like to use for an extended period of time.

Git Rebasing

Wednesday, February 23rd, 2011

For me, ‘git rebase -i’ is perhaps git’s killer feature. I’m a big fan of small, self-contained commits both for ease of patch review and for the sake of useful commit history later. I used to do this on CVS using quilt but git takes a huge amount of the pain out of it.

Ever since I discovered the feature a few years ago, I’ve also been vaguely aware of kernel developers advice to people on rebase … often simplified to OMG no. Never rebase..

When pushed to elaborate, I guess most would say:

Once you share a commit with someone, never rebase it. They may base their work on your commit and by rebasing it, you’re screwing everything up.

One memorable comment from Linus on the subject was “Have the f*cking back-bone to be able to stand behind what you did!”.

In context, this all makes sense. If a kernel developer sends a pull request and it gets merged into one tree, then rebases and that gets merged into another tree and both get merged into Linus’s tree … then yes, you have a bit of a disaster on your hands.

However, I think the rules above are too simplistic for most git newbies. Such newbies are unlikely to see their trees pulled into the whirling vortex of kernel trees so there’s no need to terrify them about using rebase.

My advice is:

  1. If you’re learning git, take the time to understand the rebase command and, especially, the interactive option.
  2. If you’re working on a series of patches, it’s perfectly fine for you to share that series with others even if it’s not finished. That means later rebasing a commit you’ve shared with others.
  3. If you’re worried people might base their work on a commit you plan to rebase later, then you warn people by putting e.g. “v1″, “rebases” or “rebasing” in the repository or branch name.
  4. If someone does base their work on a commit you have rebased, then point them at the Recovering From Upstream Rebase part of git-rebase(1). It’s really not the end of the world, especially in the simpler cases.

Our Bizarre Posting Conventions

Thursday, February 17th, 2011

I’ve been around open source mailing lists for so long now that I tend to forget that our conventions around sending plain text emails, quoting replies, threading, etc. aren’t necessarily obvious to everyone out there.

That’s especially true for some of the folks working on RHEV-M, many of whom have come from a Windows development background. I eventually realized that saying “just do what everyone else does” wasn’t really good enough and spent some time documenting what all those little conventions are.

I’ve posted some guidelines to the rhevm-api wiki. What am I missing?

Error Handling

Tuesday, February 15th, 2011

I came across a fairly typical error handling mistake today and thought I’d share it.

The code was something like this:

class Factory {
    static Foo getFoo() {
        try {
            // load a .properties file, read a key from it
            ...
            return new Foo(propValue);
        } catch (Exception ex) {
            log.error("Failed to find key in properties file: " + ex.toString());
            return null;
        }
    }
}

and the caller did e.g.

    Factory.getFoo().doSomething();

The thing to note here is that the caller is assuming that getFoo() method never returns null.

That’s probably a good assumption on the part of the caller. The factory method should always succeed unless there is a programmer error (e.g. the key does not exist in the .properties file) or a system error (e.g. an I/O error accessing the file). Neither of which the caller should be expected to handle gracefully. That’s what exceptions are for!

In the event of an error, we’d get a NullPointerException from the caller site and an error printed to the logfile. If, for example, you are running this code in a debugger, you’ll be looking at a NullPointerException with no clue as to what caused the exception.

The moral of the story here is to think clearly about when errors should be handled gracefully and when you should just let exceptions be passed back up the stack.

In the end, I added a new exception type:

class FactoryException extends RuntimeException {
    
    private static final String ERROR_MSG = "Failed to lookup key {0} in file {1}";

    private String propsFile;
    private String propsKey;

    FactoryException(String propsFile, String key, Throwable cause) {
        super(MessageFormat.format(ERROR_MSG, propsFile, key), cause);
        this.propsFile = propsFile;
        this.propsKey = propsKey;
    }

    String getPropsFile() {
        return propsFile;
    }

    String getPropsKey() {
        return propsKey;
    }
}

and then re-factored the original code:

class Factory {
    static Foo getFoo() {
        try {
            // load a .properties file, read a key from it
            ...
            return new Foo(propValue);
        } catch (Exception ex) {
            throw new FactoryException(propsFile, propsKey, ex);
        }
    }
}

so now if there is an error, rather than getting a fairly useless NullPointerException you get:

FactoryException: Failed to lookup key foo in file bar.properties
    ...
Caused by: java.io.FileNotFoundException: ...