error handling

Jeff talks about wether apps should be robust against malloc failures. Which reminded me of the reason why I would object to malloc failure robustness. It has more to do with how people would implement it. It would end up messing up the readability of my code, because I’d have to do error handling everywhere. I’ll give an example. How would you read the contents of a file? Right, you’d open it, get its size, allocate a buffer large enough to hold the contents, read the contents into the buffer and close the file. So your code should look something like this:

void
file_get_contents (const char *filename, char **data_out, size_t *len_out)
{
  File *f = file_open (filename);
  size_t len = file_get_length (f);
  char *data = malloc (len);
  file_read (f, &data, &len);
  file_close (f);
  *data_out = data;
  *len_out = len;
}

Of course, this isn’t robust against malloc failures. And there’s also no magic file object that always succeeds. So let’s make sure we end up with a correct version:

boolean
file_get_contents (const char *filename, char **data, size_t *len)
{
  ssize_t len;
  char *data;
  int fd;

  fd = open (filename, O_RDONLY);
  if (fd < 0)
    return false;
  len = get_length (fd); // magic function
  if (len < 0)
    goto error1;
  data = malloc (len);
  if (data == NULL)
    goto error2;
  if (read (fd, data, len) != len)
    goto error2;
  if (close (fd) < 0) {
    free (data);
    return FALSE;
  }
  *data_out = data;
  *len_out = len;
  return TRUE;
error2:
  free (data);
error1:
  close (fd);
  return FALSE;
}

Now, we’ve just blown up the code to rughly three times its previous size. And that’s just because we use GOTO. If we had coded it “properly”, like the g_file_get_contents function, we’d have taken 170 lines. With a helper library like GIO, g_file_load_contents is only 55 lines. But they both don’t handle out-of-memory errors. And it’s both roughly 10x as much code as the first example.

This size growth has multiple problems. First of all, the code is a lot less readable. I’m pretty sure, everyone can easily follow what my example above does. I’m not so sure about g_file_get_contents. Second, most of the code will likely never be executed. Probably it hasn’t ever been executed either, because the people that wrote the code were not able to replicate the error condition that would trigger this (see the comments in the glib code). Third, it’s hard to write this code. It requires a lot more thinking and typing to come up with it. And last but not least, it’s hard to miss error cases. Are you sure you always check the return value for the close function?

Luckily I’ve found a solution for this problem. And that solution is present in cairo. If you’ve used cairo, you’ll have noticed that you can draw all the time and never need to care about errors, neither memory allocation errors nor any other other errors. That’s because the cairo object take does two things: It moves itself into an error state should it encounter an error and it defines a “correct” behavior of its exported API in such an event. Of course the objects provide a function to check for an error so you are free to check for errors after every call. So with this method we could make our code look like this:

boolean
file_get_contents (const char *filename, char **data_out, size_t *len_out)
{
  File *f = file_open (filename); // returns a preallocated "error" object on OOM
  size_t len = file_get_length (f);
  char *data = malloc (len);
  if (data == NULL) {
    file_destroy (file);
    return FALSE;
  }
  file_read (f, &data, &len);
  file_close (f);
  if (!file_in_error (f)) {
    *data_out = data;
    *len_out = len;
    file_destroy (f);
    return TRUE;
  } else {
    file_destroy (f);
    return FALSE;
  }
}

This includes OOM and other error checking and is still pretty well readable. However, it requires design and lots of work by the library designer that provide these objects. And a lot of libraries don’t provide this design. So unless people start to spend a lot of time on API design, I’m not seeing any advantage in making libraries memory allocation failure resistant. And currently this does not seem to be happening.

7 comments ↓

#1 mieo on 02.04.08 at 19:17

Hi,

I’m sorry but this is wrong. You should *always* check for errors. Always. If the file is larger that 2GB? If the code is used in a device with small (non swap) memory?

Also reusing objects for error status, makes them even harder to use.

Just make all your functions return a mandatory int which would be the last error and stick to it.

Yes it is more code (use c++, java if you want exceptions), but it is
a) readable
b) dependable
c) stateless

It might be good idea to use the cairo way, but this is rather specific (high performance) scenario.

#2 buz on 02.04.08 at 19:17

Or we could just use stack based exceptions ;)

#3 Abstract nonsense » Blog Archive » About error handling on 02.04.08 at 20:40

[…] cases embedded in them. Typical examples are functions which return a NULL pointer on error, or how cairo does things, with a more general error handling (basically, the result object has some kind of […]

#4 Bjarne on 02.04.08 at 21:38

Dude, learn a language with exception handling!
You really should check for errors. Just try running your code with zzuf (http://sam.zoy.org/zzuf/) to see what can go wrong.

#5 Havoc Pennington: Out-of-memory Handling - D-Bus Experience on 02.05.08 at 01:34

[…] started a blog thread about handling […]

#6 dvk on 02.05.08 at 10:16

That’s why there are exceptions and more general ways to handle errors like lisp’s conditions/restarts. They all allow to fail fast and loud and not to have resources leaked.

#7 http://tyoc.nonlogic.org/ on 02.09.08 at 18:08

I recommend also this discussion: http://ubuntuforums.org/showthread.php?t=631745 I wrote that the name for that is “coding in possitive and to the point”.

See this:

boolean file_get_contents (const char *filename, char **data, size_t *len)
{
  ssize_t len;
  char *data;
  int fd;

  fd = open (filename, O_RDONLY);
  if (fd > 0){
    len = get_length (fd); // magic function
    if(len > 0){
      data = malloc (len);
      if(data){
        if(read(fd, data, len) == len){
          file_close (f); // you dont need to check "f", becaus already pass
          *data_out = data;
          *len_out = len;
          return TRUE;
        }
      }
    }
  }
  *data_out = NULL;
  *len_out = 0;
  return FALSE;
}

Simple and to the point, no extra hided jumps (at opcode or processor level ->yea the check for errors and return or jump to else clauses).

I consider a very bad practice check for errors in the way that all people do, and redirect us to post/discussions like this one, and the one linked in the ubuntuforums.

The code being possitive (check for success not failure) and to the point (continue until you know you are right, is easy, because you dont need to ask each time “I have failed?” if you know you have not fail, thus when you fail you can start thinking why you have fail).

The good about this is that you can code in this way whatever the problem is, for your second example:

boolean
file_get_contents (const char *filename, char **data_out, size_t *len_out)
{
  File *f = file_open (filename); // returns a preallocated "error" object on OOM
  // I dont know why you didnt check f, thus I dont check it
  size_t len = file_get_length (f);
  char *data = malloc (len);
  if (data) {
    file_read (f, &data, &len);
    file_close (f);
    if (!file_in_error (f)) {
      *data_out = data;
      *len_out = len;
      file_destroy (file);
      return TRUE;
    }
    free(data); // You havent liberated the memory on fail!!! if your file i in error!
  }
  *data_out = NULL; // See the commend above
  *len_out = 0;
  file_destroy (file);
  return FALSE;
}

Also