Ride on a White Horse

Filed under the category: “useful programming tips for problems that bit my ass”

The wise men from old said: Thou shalt not check if a directory or file existeth before creating it.

Well, or something like that.

I often see code that looks like this:

if (!g_file_test (some_directory, G_FILE_TEST_EXISTS))
  {
    if (g_mkdir (some_directory, 0700) != 0)
      {
        g_warning ("Unable to create `%s'", some_directory);
        return;
      }
  }
else
  do_something_with_some_directory (some_directory);

or, worse, code that looks like this:

if (!g_file_test (some_file, G_FILE_TEST_EXISTS))
  {
    symlink (some_other_file, some_file);
  }

Despite a BFW on the g_file_test() API reference page.

These code snippes are wrong, because they contains an implicit race: what if the directory/file you are checking suddenly disappears (or is created)? Also, blindly invoking functions like g_mkdir() or symlink without checking the returned value (“hey, I just checked so how can it fail?”) is looking for troubles; finally: when a system call fails, you should always report the error message the OS sets: too many things can go wrong, and if you can get a meaningful message then you’ll be able to debug faster.

The solution for these problems is: shoot first, then ask the questions. Or, in other words, execute the function inside the if block and act according to the results. For this to happen, you’ll often need to rely on errno.h and g_strerror():

#include <errno.h>

...

if (g_mkdir (some_directory, 0700) == -1)
  {
    if (errno != EEXIST)
      {
        g_warning ("Unable to create `%s': %s",
                   some_directory,
                   g_strerror (errno));
        return;
      }
  }

do_something_with_some_directory (some_directory);

and:

#include <errno.h>

...

if (symlink (some_other_file, some_file) == -1)
  {
    if (errno != EEXIST)
      {
        g_warning ("Unable to create a link from`%s' to `%s': %s",
                   some_other_file,
                   some_file,
                   g_strerror (errno));
        return;
      }
  }

At the end of the if block we are guaranteed that the file or the directory does indeed exist either because we successfully created it or because the system call failed and errno was set to EEXIST; inside the if, instead, we can deal with failures, using various error trapping branches based on the value of errno (just remember to store it if you’re going to use it multiple times).

6 Replies to “Ride on a White Horse”

  1. Your solution to what you call an “implicit race condition” is wrong.

    While it is true that someone can come in during that split second and create the directory, they can also delete that directory immediately after you create it. Thus you have not fixed anything.

    The correct solution is to create a private directory system that your program controls. Sure there is no guarantee that a rogue program isn’t going to damage your program’s bit of the file system, but this gets you a lot closer than trying to perform atomic directory creation.

    Also, what is the cost of trying and failing to create a directory versus just checking to see if it exists? In order to show your solution doesn’t cause more harm than good, you need to investigate that aspect.

  2. jonathan: while it’s true that there can be a race inverse to the one that I exposed, the code is nevertheless more robust: the check-then-do moves the burden to userspace and the programmer, while the do-then-deal-with-error moves the burden to the kernel.

    and trying to create a directory and failing is, at most, going to cost you the same as the stat() that g_file_test() performs, only that – if the underlying C library is doing thinks right – you get one less indirection (mkdir -> return EEXIST vs. g_file_test -> stat -> check against flags -> return).

  3. sorry – I must be on crack. my code does not have the race condition: if the syscall fails, then we are inside the error trap and we can gracefully exit; if the syscall succeds then we are safe.

    the race condition of the test-then-do code is that there’s a gap between the test and the actual syscall; trapping the error when the test fails is much more harder and makes the code uglier (you have to check twice: one for the test and one for the syscall).

  4. At the end of the if block we are guaranteed that the file or the directory does indeed exist […]

    It would be more precise to say “did indeed exist at the moment of the syscall”. Someone might have deleted it since then.

    Nevertheless your code is more reliable than the original one.

  5. It would be more precise to say “did indeed exist at the moment of the syscall”. Someone might have deleted it since then.

    It is harder to exploit a race condition requiring the deletion and re-creation of a directory.

    If the parent directory has the “sticky bit” set (like for /tmp), then it will not be possible for other users to delete or rename that sub-directory, so the operation is safe.

  6. I am not arguing that your code isn’t more robust. Rather, I am suggesting that it is solving the wrong problem.

    While your solution does help somewhat, it doesn’t solve the underlying problem of other programs messing with your directories. Solve that, and your solution becomes moot.

Comments are closed.