HTTP resource watcher

I’ve finished most of the features of the HTTP resource watching code I was working on for GWeather. The main benefits over the existing gnome-vfs based code are:

  • Simpler API. Just connect to the updated signal on the resource object, and you get notified when the resource changes.
  • Supports gzip and deflate content encodings, to reduce bandwidth usage.
  • Keeps track of Last-Modified date and Etag value for the resource so that it can do conditional GETs of the resource for simple client side caching.
  • Supports the Expires header. If the update interval is set at 30 minutes but the web server says that the resource won’t be updated for an hour, then the longer timeout is used until the next check.
  • If a permanent redirect is received, then the new URI is used for future checks.
  • If a 410 Gone response is received, then future checks are not queued (they can be restarted with a refresh() call).
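As a rough illustration of the Expires handling described above, here is a small Python sketch (not the actual C code in the archive). The function name next_check_delay and its parameters are made up for this example; the only rule it implements is "use the longer of the update interval and the time until Expires".

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def next_check_delay(update_interval, expires_header, now):
    """Return the number of seconds to wait before the next check.

    update_interval: configured polling interval in seconds.
    expires_header: value of the Expires response header, or None.
    now: timezone-aware datetime for the current time.
    """
    if not expires_header:
        return update_interval
    try:
        expires = parsedate_to_datetime(expires_header)
    except (TypeError, ValueError):
        # Unparseable header: fall back to the configured interval.
        return update_interval
    if expires.tzinfo is None:
        # Assumption: treat a zone-less date as UTC.
        expires = expires.replace(tzinfo=timezone.utc)
    remaining = (expires - now).total_seconds()
    # Never check sooner than the configured interval.
    return max(update_interval, remaining)
```

With a 30 minute interval and an Expires time one hour away, this returns 3600 seconds; if Expires is already in the past or missing, the configured interval wins.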

I’ve also got some code to watch the HTTP proxy settings in GConf, but that seems to trigger a hang in libsoup (bug 309867).

While I wrote the code for use in GWeather, it could be quite useful for other tasks that require watching an HTTP resource such as:

  • HTTP calendar backend of evolution-data-server.
  • A stock ticker applet like gtik.
  • Possibly an RSS reader.

The code is available in my Bazaar archive:

baz get http://www.gnome.org/~jamesh/arch/james@jamesh.id.au/http-resource--devel--0

Going to Brazil

On Sunday I will be travelling to São Carlos, Brazil for two weeks for the Launchpad sprint. It will be my first time travelling to either Brazil or South America, so it should be fun. That leaves North America as the only major continent I haven’t visited.

Bryan’s Bazaar Tutorial

Bryan: there are a number of steps you can skip in your little tutorial:

  1. You don’t need to set my-default-archive. If you often work with multiple archives, you can treat working copies for all archives pretty much the same. If you are currently inside a working copy, any branch names you use will be relative to your current one, so you can still use short branch names in almost all cases (this is similar to the reason I don’t set $CVSROOT when working with CVS).
  2. If you have a directory which contains only the files you want to import into your Bazaar archive, the following command will add them all, and convert the directory into a Bazaar working copy:
    cd background-channels
    baz import -a bclark@redhat.com--gnomearchive/background-channels--dev--0.1

    No need for init-tree, add or commit.

  3. Running archive-mirror in your working copy will mirror that archive, so doesn’t need my-default-archive set.
  4. Other people probably don’t want to set your archive as their default. Also, they can omit the register-archive call entirely:
    baz get http://gnome.org/~clarkbw/arch/background-channels--dev--0.1

    This checks out the branch, and registers the archive as a side effect.

  5. If you want to find out what is inside an archive, the following command is quite convenient:
    baz abrowse http://gnome.org/~clarkbw/arch

Some things you might want to do:

  1. If you have a PGP key, create a signed archive. This will cryptographically sign all revisions. When people checkout your branches, the signatures get checked automatically (this is useful if the server hosting your mirror gets broken into and you need to verify that nothing has been tampered with). If you have already created the archive, you can turn on signing with baz change-archive (remember to update the mirror archive too).
  2. If you turn on signing, consider using a PGP agent like gnome-gpg. You can configure it in ~/.arch-params/archives/defaults.
  3. It is customary to name the archive directory the same as the archive name. This has the benefit that the branch name matches the last portion of the URL.
  4. If you haven’t set up a revision library, you should do so:
    mkdir ~/.arch-revlib
    baz my-revision-library ~/.arch-revlib
    baz library-config --greedy --sparse ~/.arch-revlib

pkg-config vs. Cross Compile and Multi-arch

One of the areas where pkg-config can cause some problems is when trying to cross compile some code, or when working with multi-arch systems (such as bi-arch AMD64 Linux distros). While it is possible to use pkg-config in such systems by manipulating $PKG_CONFIG_PATH and/or $PKG_CONFIG_LIBDIR, users can’t just follow the instructions given for the single-arch case.

After some discussion with Wolfgang Wieser, we came up with a proposal for better supporting cross-compile and multi-arch uses. The main changes would be:

  • Add a new --host option to pkg-config. This would allow pkg-config to use different default search paths based on the host type, and search for .pc files in host type specific subdirs on the search path.
  • If an unknown host type is given, then the default search path is disabled altogether.
  • The autoconf macro would pass this argument whenever it detected that pkg-config supported it.

For the common case, this should allow most packages to be built for the non default architecture on a bi-arch system, or cross compiled, by just passing --host=foo to configure (you might still need to set $CC or $CFLAGS, depending on the compiler setup).
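To make the proposed lookup a bit more concrete, here is a toy Python model of how the search directories might be derived from the host type. Everything specific in it is an assumption for illustration: the host names, the default directories, the search_dirs() function, and the choice to try the host specific subdir before the plain directory are not part of the proposal.

```python
import posixpath

# Hypothetical default search directories, keyed by host type.
DEFAULT_DIRS = {
    "x86_64-linux-gnu": ["/usr/lib64/pkgconfig", "/usr/share/pkgconfig"],
    "i386-linux-gnu": ["/usr/lib/pkgconfig", "/usr/share/pkgconfig"],
}


def search_dirs(host, extra_dirs=()):
    """Return the directories to search for .pc files for a host type.

    extra_dirs models user supplied directories such as $PKG_CONFIG_PATH.
    An unknown host type gets no default directories at all.
    """
    base = list(extra_dirs) + DEFAULT_DIRS.get(host, [])
    result = []
    for directory in base:
        # Assumed order: host specific subdir first, then the plain dir.
        result.append(posixpath.join(directory, host))
        result.append(directory)
    return result
```

In this model an unknown host such as a cross toolchain target only finds packages in directories the user adds explicitly, so .pc files for the build machine are never picked up by accident.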

For packages that install .pc files, they should continue to work. However it will be worth updating them to install their .pc file into a host type specific sub directory (the autoconf macros will make this easy to do).

If this code is likely to affect you, send comments to the pkg-config mailing list (or leave comments here).

HTTP code in GWeather

One of the things that pisses me off about gweather is that it occasionally hangs and stops updating. It is a bit easier to tell when this has occurred these days, since it is quite obvious something’s wrong if gweather thinks it is night time when it clearly isn’t.

The current code uses gnome-vfs, which isn’t the best choice for this sort of thing. The code is the usual mess you get when turning an algorithm inside out to work through callbacks in C:

  1. One function opens the URL with gnome_vfs_async_open().
  2. The callback that gets triggered on completion of the open calls gnome_vfs_async_read().
  3. The callback that gets triggered on the end of the read checks the status. If it is at the end of the stream, then process the data and close the stream. Otherwise, perform another read (which will loop back to this step).

This logic is repeated 5 times for the different weather data sources. To clean this up, I started looking at libsoup which doesn’t try to be a full file system abstraction, but provides a better API for the kind of things gweather does.

I put together a simple HttpResource class that wraps the relevant parts of libsoup for apps like gweather. It can be used like so:

  1. Create an HttpResource instance for the given URI.
  2. Connect a handler to the resource’s updated signal.
  3. Call the _set_update_interval() method to say how often the resource should be checked.
  4. Call the _refresh() method to kick off periodic freshness checks.
  5. When new data arrives, the updated signal is emitted.

Since the code is designed for periodic updates, I added some simple caching behaviour. If the server reports that the resource hasn’t been modified, we don’t need to emit the updated signal.
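The usage pattern and caching behaviour can be sketched in Python as follows. This is a toy stand-in, not the real HttpResource API (which is C and built on libsoup): the ResourceWatcher class, its method names and the injected fetch callable are all invented for the example, and it performs a single check per refresh() call rather than scheduling periodic ones.

```python
class ResourceWatcher:
    """Toy model of a watched HTTP resource with conditional GET caching."""

    def __init__(self, uri, fetch):
        self.uri = uri
        # fetch is a callable(uri, request_headers) returning
        # (status, response_headers, body); injected so no network is needed.
        self._fetch = fetch
        self._handlers = []
        self._etag = None
        self._last_modified = None
        self.update_interval = 30 * 60  # seconds

    def connect_updated(self, handler):
        """Register handler(resource, body) to call when the data changes."""
        self._handlers.append(handler)

    def set_update_interval(self, seconds):
        self.update_interval = seconds

    def refresh(self):
        """Run one freshness check. Returns True if handlers were notified."""
        request_headers = {}
        if self._etag:
            request_headers["If-None-Match"] = self._etag
        if self._last_modified:
            request_headers["If-Modified-Since"] = self._last_modified
        status, headers, body = self._fetch(self.uri, request_headers)
        if status != 200:
            # 304 Not Modified (or an error): nothing new to report.
            return False
        # Remember the validators for the next conditional GET.
        self._etag = headers.get("ETag")
        self._last_modified = headers.get("Last-Modified")
        for handler in self._handlers:
            handler(self, body)
        return True
```

The first refresh() sends no validators and notifies the handlers; later calls send If-None-Match and If-Modified-Since, and stay silent when the server answers 304.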

There are a few things that still need doing:

  • Some code to keep a SoupSession instance up to date with the proxy configuration settings in GConf.
  • Correct handling of the Expires: response header. If we are checking for updates every 30 minutes, but the server says the current weather report is valid for the next hour, then we shouldn’t check again until then.
  • Support gzip and/or deflate content transfer encoding to reduce bandwidth.

This code should be pretty trivial to integrate into gweather when it is done, and should simplify the logic. I guess it would be useful for other applets too, such as gtik. The current code is available in my Bazaar archive:

baz get http://www.gnome.org/~jamesh/arch/james@jamesh.id.au/http-resource--devel--0

Overriding Class Methods in Python

One of the features added back in Python 2.2 was class methods. These differ from traditional methods in the following ways:

  1. They can be called on both the class itself and instances of the class.
  2. Rather than binding to an instance, they bind to the class. This means that the first argument passed to the method is a class object rather than an instance.

For most intents and purposes, class methods are written the same way as normal instance methods. One place that things differ is overriding a class method in a subclass. The following simple example demonstrates the problem:

class SubClass(ParentClass):
    @classmethod
    def create(cls, arg):
        ret = ParentClass.create(cls, arg)
        ret.dosomethingelse()
        return ret

This code is broken because ParentClass.create() is already bound to ParentClass, rather than being an unbound method as it would be for a normal instance method. The call therefore passes cls as an extra argument, and the most likely outcome is a TypeError due to the method receiving too many arguments.

So how do you chain up to the parent class implementation? You use the super() object, which was also added in Python 2.2 as an alternative way to chain to the parent implementation of a method. The above code can be rewritten as follows:

class SubClass(ParentClass):
    @classmethod
    def create(cls, arg):
        ret = super(SubClass, cls).create(arg)
        ret.dosomethingelse()
        return ret

If you haven’t ever used the super() object, this is what it is doing in the above example:

  1. SubClass is looked up in the list cls.__mro__ (a linearised list of ancestor classes in the order used for method resolution).
  2. The class dict for each ancestor class coming after SubClass in cls.__mro__ is checked to see if it contains “create”.
  3. The super() object returns a version of “create” in the context of cls using the __get__(cls) “descriptor get” method.
  4. When this bound method gets called, cls will be passed in instead of the parent class.
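To see both behaviours side by side, here is a self-contained version you can run. The body of ParentClass, the dosomethingelse() method and the BrokenSubClass name are made up for the demonstration, since the snippets above leave them undefined.

```python
class ParentClass:
    def __init__(self, arg):
        self.arg = arg
        self.extra_done = False

    @classmethod
    def create(cls, arg):
        # Builds an instance of whichever class the method is bound to.
        return cls(arg)

    def dosomethingelse(self):
        self.extra_done = True


class BrokenSubClass(ParentClass):
    @classmethod
    def create(cls, arg):
        # ParentClass.create is already bound to ParentClass, so cls is
        # passed as an extra positional argument and this raises TypeError.
        ret = ParentClass.create(cls, arg)
        ret.dosomethingelse()
        return ret


class SubClass(ParentClass):
    @classmethod
    def create(cls, arg):
        # super() finds the parent's create() and binds it to cls.
        ret = super(SubClass, cls).create(arg)
        ret.dosomethingelse()
        return ret
```

Calling BrokenSubClass.create(1) fails with a TypeError, while SubClass.create(1) returns a SubClass instance. Because the parent implementation receives cls, a further subclass of SubClass gets an instance of its own type back without overriding anything.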

Previously I’d ignored super() for the most part, since I could use the old chaining syntax. This shows a place where the old-style syntax can’t be applied.

pkg-config patches

I uploaded a few patches to the pkg-config bugzilla recently, which will hopefully make their way into the next release.

The first is related to bug 3097, which has to do with the broken dependent library elimination code added in 0.17.

The patch adds a Requires.private field to .pc files that contains a list of required packages like Requires currently does, which has the following properties:

  • When verifying that a particular package name is available with “pkg-config --exists”, dependencies in both Requires and Requires.private are checked.
  • When running “pkg-config --cflags”, flags from dependencies in Requires are included.
  • When running “pkg-config --libs”, flags from dependencies in Requires are included.
  • When running “pkg-config --static --libs”, flags from dependencies in both Requires and Requires.private are included.
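The four rules above can be modelled with a few lines of Python. This is only a model of the rules as listed, not pkg-config’s implementation: the dependency_closure() function, the mode names and the sample package table are invented for the example, and following private dependencies transitively through public ones is my assumption.

```python
# Modes in which Requires.private is followed as well as Requires.
PRIVATE_MODES = {"exists", "static-libs"}


def dependency_closure(name, pc_files, mode):
    """Return the packages consulted for `name`, in depth-first order.

    pc_files maps a package name to a dict with optional "Requires" and
    "Requires.private" lists. mode is one of "exists", "cflags", "libs"
    or "static-libs". A missing package raises LookupError.
    """
    follow_private = mode in PRIVATE_MODES
    seen = []

    def visit(pkg):
        if pkg in seen:
            return
        if pkg not in pc_files:
            raise LookupError("package %r not found" % pkg)
        seen.append(pkg)
        fields = pc_files[pkg]
        deps = list(fields.get("Requires", []))
        if follow_private:
            deps += fields.get("Requires.private", [])
        for dep in deps:
            visit(dep)

    visit(name)
    return seen


# Made-up package table for illustration only.
SAMPLE = {
    "cairo": {"Requires.private": ["libpng"]},
    "libpng": {},
    "gtk+": {"Requires": ["pango", "cairo"]},
    "pango": {},
}
```

With this table, dependency_closure("cairo", SAMPLE, "libs") only consults cairo itself, while the "static-libs" mode also pulls in libpng.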

The purpose of this is to list dependencies that are not exposed in the API of the library in question while not making users of the library link directly to those dependencies. This means that private dependencies can be upgraded to new incompatible versions without breaking applications that only depend on them indirectly.

This is intended for cases like Cairo, which links to libpng, but doesn’t expose any of the libpng API itself. It is not intended for dependencies like gtk+ depending on pango. Of course, this new field will cause the .pc file to be incompatible with pkg-config versions prior to 0.16, because those versions don’t tolerate unknown fields.

The other changes are related to the associated autoconf macros:

  • Add a PKG_CHECK_EXISTS() macro. This would be similar to PKG_CHECK_MODULES(), except that no variables would be set or substituted: it would simply run the ACTION-IF-FOUND or ACTION-IF-NOT-FOUND arguments. It is basically a lighter weight macro for cases where you just want to see if a set of modules is available (bug 3530).
  • Get rid of the caching behaviour in PKG_CHECK_MODULES(). Since 0.16, this macro has cached the result of the check based on the variable prefix passed as the first argument. Since pkg-config is quite fast and configure doesn’t store its cache between runs by default, this doesn’t result in any noticeable speed improvement and causes build problems for configure scripts that call PKG_CHECK_MODULES multiple times with the same variable name prefix but different package lists (e.g. Eye of Gnome). It seems simplest to just remove the caching, resulting in a simpler and more reliable macro (bug 3550, patch not yet uploaded).

With these changes, hopefully 0.18 will fix up the last few small incompatibilities in the recent releases.

Clipboard Handling

Phillip: your idea about direct client to client clipboard transfers is doable with the current X11 clipboard model:

  1. Clipboard owner advertises that it can convert selection to some special target type such as “client-to-client-transfer” or similar.
  2. If the pasting client supports client to client transfer, it can check the list of supported targets for the “client-to-client-transfer” target type and request conversion to that target.
  3. The clipboard owner returns a string containing details of how to request the data (e.g. hostname/port, or some other scheme that only works for the local host).
  4. Pasting application contacts the owner out of band and receives the data.

Yes, this requires modifications to applications in order to work correctly, but so would switching to a new clipboard architecture.

With respect to your no-transfer cut/paste of a movie example, that’s more of a component architecture problem than a clipboard issue. In the context of Bonobo, it can be done provided that the clipboard owner can provide the data as a Bonobo Embeddable, and the pasting application can embed Bonobo Embeddables in its documents:

  1. Clipboard owner advertises that it can convert the selection to the target “BONOBO_EMBEDDABLE” (or some other agreed upon target name).
  2. Pasting application requests that the selection be converted to “BONOBO_EMBEDDABLE”, and receives an IOR for the component. Pasting application owns a reference on the component due to the clipboard transfer.
  3. Pasting application calls queryInterface() on the component to get the Bonobo::ControlFactory interface, and calls the createControl() method to create a control to embed in the document.
  4. When it comes time to save the data, the component can be converted to one of the Bonobo::Persist interfaces, and written out.

Of course, there are reasons why people don’t do this (apart from not liking Bonobo), including:

  • With the classic X selection model, you don’t need to special case local or remote transfer cases.
  • Works in cases where the two applications can only communicate via the X connection (e.g. in the presence of transparent X proxies such as ssh).
  • It delegates all the permissions/authentication issues to the X server.

Anonymous voting

I put up a proposal for implementing anonymous voting for the foundation elections on the wiki. This is based in part on David’s earlier proposal, but simplifies some things based on the discussion on the list and fleshes out the implementation a bit more.

It doesn’t really add to the security of the elections process (doing so would require a stronger form of authentication than “can read a particular email account”), but does anonymise the election results and lets us do things like tell the voter that their completed ballot was malformed on submission.