When Wednesday was quiet

I had a doctor’s appointment on Tuesday; I told Fin and Alex about it on the Monday and asked them to drop me off there. On Tuesday they went to see this cat that I mentioned earlier, and came to pick me up afterwards, and as I also mentioned we decided to go out to eat. I’d thought that meant after the appointment, and I was so sleepy I fell right asleep. When I woke up, we were coming up to the restaurant, and I was faced with the dilemma of whether to say, “But we’re missing my appointment” (and go back, have missed the appointment anyway, and have ruined the meal), or to keep quiet for now. I decided to keep quiet. Either way I’d have to pay for the missed appointment.

The next day I phoned the doctor to reschedule and they were very apologetic: they said, “Oh, we were trying to contact you: the doctor was off sick yesterday. We’ll phone in a refill for you.”

I didn’t go to see the cat after all on Wednesday, because she behaves nothing like our cat did and doesn’t respond to her name, and people were afraid I’d give a false positive because I miss our cat so.

I spent much of the evening writing about GMarkup instead of fixing bugs.

I also had a very interesting email, which I might tell you all about later.

This morning, the company has given us all bags of love heart sweets and teddy bears. Thank you to the people who left comments for me on valentinr and places like that! If anyone sent one of those mutual things where you have to send another of the same kind in order to see them– well, I didn’t send any like that, so I won’t see them.

Fin, Alex: I am going to get you trees. But it is too cold to plant trees, so later in the year I will get them.

Link soup:

XML, GMarkup, and all that jazz

I was asked to talk about how to use GMarkup. This is a brief introduction; there are many people more qualified to talk about it than I am. These are my opinions and not those of the project or my employer. If you want to suggest a change or report a mistake, suggest away.

Firstly, why you shouldn’t use GMarkup.
Don’t use GMarkup if all you want is to store a simple list of settings. Instead, either use gconf, or if what you want is a file on disk, use GKeyFile, which lets you write things like:

[favourites]
icecream=chocolate
film=Better Than Chocolate
poem=Jenny kiss'd me when we met

in the style of .ini files. These are much more user-friendly.

Don’t use GMarkup if you want to parse actual arbitrary XML files. Instead, use libxml, which is beautiful and wonderful and fast and accurate. GMarkup is made to be easy to use.

Do use GMarkup if you want a reasonably complicated way to store files on disk, in a new format you’re making up.

Why GMarkup files are not XML.
XML is big and scary and complicated and spiky. People pretend it is simple. It isn’t. GMarkup files differ in many ways from XML, which makes them easier to use but also less flexible. Here are some ways in which a file can be XML but not GMarkup:

  • There is no character code but Unicode, and UTF-8 is its encoding. GMarkup does not attempt to screw around with UTF-16, ASCII, ISO646, or, heaven help us, EBCDIC. That way madness lies.
  • There are five predefined entities: &amp; for &, &lt; for <, &gt; for >, &quot; for ", and &apos; for '. You cannot define any new ones, but you can use character references (giving the code point explicitly, like &#9731; or &#x2603; for a snowman, ☃).
  • Processing instructions (including doctypes and comments) aren’t specially treated, and there is no validation.

There are also a few subtle ways in which a file can be parsable by GMarkup but not be valid XML. However, these are officially invalid GMarkup even though they work fine, if you can follow that. Many people don’t care, but they should.

Okay, so how do we get going?
There are two ways people deal with XML: either as a tree, or as a series of events. GMarkup always sees them as a series of events. There are five kinds of event which can happen:

  • The start of an element
  • The end of an element
  • Some text (inside an element)
  • Some other stuff (processing instructions, mainly, including comments and doctypes)
  • An error

Let’s imagine we have this file, called simple.xml:

<zoo>
  <animal noise="roar">lion</animal>
  <animal noise="sniffle">bunny</animal>
  <animal noise="lol">cat</animal>
  <keeper/>
</zoo>

This will be seen by the parser as a series of events, as follows:

  • Start of “zoo”.
  • Start of “animal”, with a “noise” attribute of “roar”.
  • The text “lion”.
  • End of “animal”.
  • Start of “animal”, with a “noise” attribute of “sniffle”.
  • The text “bunny”.
  • End of “animal”.
  • Start of “animal”, with a “noise” attribute of “lol”.
  • The text “cat”.
  • End of “animal”.
  • Start of “keeper”.
  • End of “keeper”.
  • End of “zoo”.

(Actually there’ll be some extra text which is just whitespace, but let’s ignore that for now.)

There are two kinds of objects to deal with.
One is a GMarkupParser: it lists what to do in each of the five cases given above. In each case we give a function which knows how to handle opening elements, or closing elements, or whatever. If we don’t care about that case, we can say NULL. The signatures needed for each of these functions are given in the API documentation.

The second kind of object is a GMarkupParseContext. You construct this, feed it text, which it will parse, and then eventually destroy it. It would be nice if there was a function which would just read in a file and deal with it, but there isn’t. Fortunately, we have g_file_get_contents(), which is almost as good, if we can assume there’s memory available to store the whole file at once.

So let’s say we want to print the animals’ noises from the file above.

  1. Decide which kinds of events we need to know about. We need to know when elements open so that we can pick up the animal noise, and when text comes past giving the animal name, so we can print it. It would be possible to free the noise when we need to get the next noise, but it would be easier to free it when we see </animal>, so let’s do it like that. Processing instructions and errors we can ignore for the sake of example.
  2. Write functions to handle each one.
  3. Write a GMarkupParser listing the name of each function.
  4. Write something to load the file into memory and parse it.

Here’s some less-than-beautiful example code to do that.

#include <glib.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

gchar *current_animal_noise = NULL;

/* The handler functions. */

void start_element (GMarkupParseContext *context,
    const gchar         *element_name,
    const gchar        **attribute_names,
    const gchar        **attribute_values,
    gpointer             user_data,
    GError             **error) {

  const gchar **name_cursor = attribute_names;
  const gchar **value_cursor = attribute_values;

  while (*name_cursor) {
    if (strcmp (*name_cursor, "noise") == 0)
      current_animal_noise = g_strdup (*value_cursor);

    name_cursor++;
    value_cursor++;
  }
}

void text(GMarkupParseContext *context,
    const gchar         *text,
    gsize                text_len,
    gpointer             user_data,
    GError             **error)
{
  /* Note that "text" is not a regular C string: it is
   * not null-terminated. This is the reason for the
   * unusual %.*s format below, which takes the number of
   * bytes to print as an int argument before the string.
   */
  if (current_animal_noise)
    printf("I am a %.*s and I go %s. Can you do it?\n",
        (int) text_len, text, current_animal_noise);
}

void end_element (GMarkupParseContext *context,
    const gchar         *element_name,
    gpointer             user_data,
    GError             **error)
{
  if (current_animal_noise)
    { 
      g_free (current_animal_noise);
      current_animal_noise = NULL;
    }
}

/* The list of what handler does what. */
static GMarkupParser parser = {
  start_element,
  end_element,
  text,
  NULL,
  NULL
};

/* Code to grab the file into memory and parse it. */
int main() {
  char *text;
  gsize length;
  GMarkupParseContext *context = g_markup_parse_context_new (
      &parser,
      0,
      NULL,
      NULL);

  /* seriously crummy error checking */

  if (g_file_get_contents ("simple.xml", &text, &length, NULL) == FALSE) {
    printf("Couldn't load XML\n");
    exit(255);
  }

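  /* end_parse tells the parser there is no more text to come,
   * so that it can complain about elements left open. */
  if (g_markup_parse_context_parse (context, text, length, NULL) == FALSE ||
      g_markup_parse_context_end_parse (context, NULL) == FALSE) {
    printf("Parse failed\n");
    exit(255);
  }

  g_free(text);
  g_markup_parse_context_free (context);
  return 0;
}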
/* EOF */

Save that as simple.c. If you have the GNOME libraries properly installed, then typing

gcc simple.c $(pkg-config glib-2.0 --cflags --libs) -o simple

will compile the program, and running it with ./simple will give you

I am a lion and I go roar. Can you do it?
I am a bunny and I go sniffle. Can you do it?
I am a cat and I go lol. Can you do it?

I think that was enough to whet your appetite, but there’s a whole lot more to know. You can read more here. If you want to see a real-life example, Metacity uses exactly this sort of arrangement for its theme files. (Later: Julien Puydt shares memories of how schema handling in gconf was written using GMarkup.) Any questions?

Photo: Day-old chick, GFDL, from here, by Fir0002, modified by Dcoetzee, Editor at Large, and tthurman.

Zaratbee

I have had a moderately successful day. I branched metacity for 2.22. (I wish I could write more about work here; it feels silly not writing about the thing I spend most of my time doing.) I made a list of things which need dealing with now we’ve branched, and as a diversion fixed up some map SVGs for Wikipedia (and another project). Someone came around with bagels.

Fin phoned and told me the SPCA had come by and told them they’d found a cat who looked just like our cat Zarate, who ran away shortly before Kirsten came to visit; Fin and Alex and Rio went to see, but the new cat (let’s say her name is Zaratbee) didn’t respond to her name or the feeding song, despite being about the right size and shape and colour. Fin thought it was her; Alex didn’t. I will go along tomorrow and see.

It snowed a lot. I was told I gave the best hugs. I came home and it snowed a lot and Fin in her awesomeness said “Why not let’s go to Los Aztecas?” So we did and the food was even better than it usually is, and almost nobody was there because of the snow. We talked about politics, and came home. I worked on some simple breadboard electronics on the dining-room table with Rio before she went to bed; she wrote up the findings in her lab notebook, which has a picture of Strawberry Shortcake on the front. Did I mention she has dyed her hair grape-soda purple? It really suits her.

I learned, or re-learned, about the __import__ function in Python (it is a built-in function rather than a statement) for a project I’m doing with the Metacity Journal (more will be revealed later). I asked jdub to move my public planet feed to blogs.gnome.org and I will automagically duplicate all content to LJ and blogo. I have not decided what I will do about comments. I also need to put some time into copying the stylesheets and formatting from marnanel.org to blogs.gnome.org.

Ande: I have started work on the river story. It is progressing nicely.

Things I wanted to share (because they were an interesting read):

Now I will make more tea and then go and get out of these clothes and sit in bed and fix bugs, which I can do because Alex has fixed wireless networking, so hallelujah, say I.

an update

Sharon came over and made us lasagne as a belated birthday tea for me.

I have done a good amount of hacking; it was fun and got stuff done. Yay.

Later, we went to the park and walked by the river, and Riordon and I discussed high-voltage electricity transmission.

Metacity’s translation into Welsh has not been touched in almost two years. It is a shame. If one of you Welsh speakers who reads this would like to give it a shot (there are about 500 phrases, mostly short, about 80% of them already filled in, and most of them guessed by the computer from the others and you only have to make minor adjustments), I will show you how to do so and you will have your name all over the credits. If not, I am happy to try to do the work myself, but I would like it if someone would check my work since I’m hardly confident in my own fluency.

Oh, and I learned how to say “fish and chips” in Welsh. It’s “sgod a sglod”. Makes sense.

I think this does not sound like a very exciting entry. Sorry :(