<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>James Henstridge &#187; Python</title>
	<atom:link href="http://blogs.gnome.org/jamesh/tag/python/feed/" rel="self" type="application/rss+xml" />
	<link>http://blogs.gnome.org/jamesh</link>
	<description>Random stuff</description>
	<lastBuildDate>Tue, 27 Oct 2009 08:48:18 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.6</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Watching iView with Rygel</title>
		<link>http://blogs.gnome.org/jamesh/2009/07/06/watching-iview-with-rygel/</link>
		<comments>http://blogs.gnome.org/jamesh/2009/07/06/watching-iview-with-rygel/#comments</comments>
		<pubDate>Mon, 06 Jul 2009 08:50:45 +0000</pubDate>
		<dc:creator>James Henstridge</dc:creator>
				<category><![CDATA[Uncategorised]]></category>
		<category><![CDATA[Gnome]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[UPnP]]></category>

		<guid isPermaLink="false">http://blogs.gnome.org/jamesh/?p=439</guid>
		<description><![CDATA[One of the features of Rygel that I found most interesting was the external media server support.  It looked like an easy way to publish information on the network without implementing a full UPnP/DLNA media server (i.e. handling the UPnP multicast traffic, transcoding to a format that the remote system can handle, etc).
As a small [...]]]></description>
			<content:encoded><![CDATA[<p>One of the features of <a href="http://live.gnome.org/Rygel">Rygel</a> that I found most interesting was the <a href="http://live.gnome.org/Rygel/MediaServerSpec">external media server support</a>.  It looked like an easy way to publish information on the network without implementing a full UPnP/DLNA media server (i.e. handling the UPnP multicast traffic, transcoding to a format that the remote system can handle, etc).</p>
<p>As a small test, I put together a server that exposes the <a href="http://www.abc.net.au/">ABC</a>&#8217;s <a href="http://www.abc.net.au/iview/">iView</a> service to UPnP media renderers.  The result is a bit rough around the edges, but the basic functionality works.  The source can be grabbed using Bazaar:</p>
<blockquote>
<pre>bzr branch lp:~jamesh/+junk/rygel-iview</pre>
</blockquote>
<p>It needs Python, <a href="http://twistedmatrix.com/">Twisted</a>, the <a href="http://www.freedesktop.org/wiki/Software/DBusBindings">Python bindings for D-Bus</a> and <a href="http://lkcl.net/rtmp/">rtmpdump</a> to run.  The program exports the guide via D-Bus, and uses rtmpdump to stream the shows via HTTP.  Rygel then publishes the guide via the UPnP media server protocol and provides MPEG2 versions of the streams if clients need them.</p>
<p>There are still a few rough edges though.  The video from iView comes as 640&#215;480 with a 16:9 aspect ratio so has a 4:3 pixel aspect ratio, but there is nothing in the video file to indicate this (I am not sure if flash video supports this metadata).</p>
<p><strong>Getting Twisted and D-Bus to cooperate</strong></p>
<p>Since I&#8217;d decided to use Twisted, I needed to get it to cooperate with the D-Bus bindings for Python.  The first step here was to get both libraries using the same event loop.  This can be achieved by setting Twisted to use the glib2 reactor, and enabling the glib mainloop integration in the D-Bus bindings.</p>
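<p>As a rough sketch of that setup (the function name <tt>install_shared_main_loop</tt> is only illustrative, and it assumes Twisted and the D-Bus Python bindings are installed), the two calls involved look something like this.  The glib2 reactor has to be installed before anything imports <tt>twisted.internet.reactor</tt>:</p>

```python
def install_shared_main_loop():
    """Make Twisted and dbus-python share the GLib main loop.

    Call this once at startup, before any other module imports
    twisted.internet.reactor: a reactor can only be installed once.
    """
    # Imports are local so that the installation order is explicit.
    from twisted.internet import glib2reactor
    glib2reactor.install()

    # Tell the D-Bus bindings to dispatch messages from the GLib main loop.
    from dbus.mainloop.glib import DBusGMainLoop
    DBusGMainLoop(set_as_default=True)

    # Safe to import now: this is the glib2 reactor installed above.
    from twisted.internet import reactor
    return reactor


# Typical use at program startup (not executed here):
#     reactor = install_shared_main_loop()
#     reactor.run()
```

<p>Keeping both calls in one function makes it harder to import the default reactor by accident before the glib2 one is in place.</p>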
<p>Next was enabling asynchronous D-Bus method implementations.  There is support for this in the D-Bus bindings, but it has quite a different (and less convenient) API compared to Twisted.  A small decorator was enough to overcome this impedance mismatch:</p>
<blockquote>
<pre><strong>from</strong> functools <strong>import</strong> wraps

<strong>import</strong> dbus.service
<strong>from</strong> twisted.internet <strong>import</strong> defer

<strong>def</strong> dbus_deferred_method(*args, **kwargs):
    <strong>def</strong> decorator(function):
        function = dbus.service.method(*args, **kwargs)(function)
        @wraps(function)
        <strong>def</strong> wrapper(*args, **kwargs):
            dbus_callback = kwargs.pop('_dbus_callback')
            dbus_errback = kwargs.pop('_dbus_errback')
            d = defer.maybeDeferred(function, *args, **kwargs)
            d.addCallbacks(
                dbus_callback, <strong>lambda</strong> failure: dbus_errback(failure.value))
        wrapper._dbus_async_callbacks = ('_dbus_callback', '_dbus_errback')
        <strong>return</strong> wrapper
    <strong>return</strong> decorator</pre>
</blockquote>
<p>This decorator could then be applied to methods in the same way as the <tt>@dbus.service.method</tt> decorator, but it would correctly handle the case where the method returns a Deferred.  Unfortunately it can&#8217;t be used in conjunction with <tt>@defer.inlineCallbacks</tt>, since the D-Bus bindings don&#8217;t handle varargs functions properly.  You can of course call another function or method that uses <tt>@defer.inlineCallbacks</tt> though.</p>
<p><strong>The iView Guide</strong></p>
<p>After coding this, it became pretty obvious why it takes so long to load up the iView flash player: it splits the guide data over almost 300 XML files.  This might make sense if it relied on most of these files remaining unchanged and stored in cache, however it also uses a cache-busting technique when requesting them (adding a random query component to the URL).</p>
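<p>For illustration, cache busting of this kind can be sketched in a few lines of Python.  The parameter name <tt>r</tt> and the example URL are assumptions, not what the iView player actually sends:</p>

```python
import random
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_cache_buster(url, param="r"):
    """Return url with a random query parameter appended.

    Because the query string differs on every request, HTTP caches treat
    each request as a new resource and never serve a stored copy.
    """
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append((param, str(random.randrange(10**9))))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical guide URL, used only to show the effect.
first = add_cache_buster("http://example.com/guide/series.xml")
second = add_cache_buster("http://example.com/guide/series.xml")
print(first)
print(second)
```

<p>The two printed URLs point at the same file but will almost always differ in their query string, which is exactly what defeats the cache.</p>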
<p>Most of these files are series description files (some for finished series with no published programs).  These files contain a title, a short description, the URL for a thumbnail image and the IDs for the programs belonging to the series.  To find out about those programs, you need to load all the channel guide XML files until you find which one contains the program.  Going in the other direction, if you&#8217;ve got a program description from the channel guide and want to know about the series it belongs to (e.g. to get the thumbnail), you need to load each series description XML file until you find the one that contains the program.  So there aren&#8217;t many opportunities to delay loading of parts of the guide.</p>
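<p>Since every file has to be loaded anyway, one way to cope is to build a reverse index in a single pass.  The sketch below works on already-parsed records with made-up field names; it does not reflect the real iView XML layout:</p>

```python
# Hypothetical, already-parsed series records: the field names are
# illustrative and do not reflect the real iView XML layout.
SERIES = [
    {"id": "s1", "title": "Series One",
     "thumbnail": "http://example.com/s1.jpg", "programs": ["p1", "p2"]},
    {"id": "s2", "title": "Series Two",
     "thumbnail": "http://example.com/s2.jpg", "programs": ["p3"]},
    # A finished series with no published programs.
    {"id": "s3", "title": "Finished Series",
     "thumbnail": "http://example.com/s3.jpg", "programs": []},
]


def build_program_index(series_list):
    """Map each program ID to the series record that lists it."""
    index = {}
    for series in series_list:
        for program_id in series["programs"]:
            index[program_id] = series
    return index


program_index = build_program_index(SERIES)
# Going from a program to its series is now a dictionary lookup.
thumbnail = program_index["p3"]["thumbnail"]
print(thumbnail)
```

<p>This does not reduce the number of files that need fetching, but it avoids rescanning them for each lookup.</p>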
<p>The startup time would be a lot shorter if this information was collapsed down to a smaller number of larger XML files.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.gnome.org/jamesh/2009/07/06/watching-iview-with-rygel/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>django-openid-auth</title>
		<link>http://blogs.gnome.org/jamesh/2009/04/14/django-openid-auth/</link>
		<comments>http://blogs.gnome.org/jamesh/2009/04/14/django-openid-auth/#comments</comments>
		<pubDate>Tue, 14 Apr 2009 08:25:56 +0000</pubDate>
		<dc:creator>James Henstridge</dc:creator>
				<category><![CDATA[Uncategorised]]></category>
		<category><![CDATA[Django]]></category>
		<category><![CDATA[Launchpad]]></category>
		<category><![CDATA[OpenID]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[Ubuntu]]></category>

		<guid isPermaLink="false">http://blogs.gnome.org/jamesh/?p=426</guid>
		<description><![CDATA[Last week, we released the source code to django-openid-auth.  This is a small library that can add OpenID based authentication to Django applications.  It has been used for a number of internal Canonical projects, including the sprint scheduler Scott wrote for the last Ubuntu Developer Summit, so it is possible you&#8217;ve already used the code.
Rather [...]]]></description>
			<content:encoded><![CDATA[<p>Last week, we released the source code to <a href="https://launchpad.net/django-openid-auth">django-openid-auth</a>.  This is a small library that can add <a href="http://openid.net/">OpenID</a> based authentication to <a href="http://www.djangoproject.com/">Django</a> applications.  It has been used for a number of internal Canonical projects, including the sprint scheduler <a title="Scott James Remnant" href="http://www.netsplit.com/">Scott</a> wrote for the last Ubuntu Developer Summit, so it is possible you&#8217;ve already used the code.</p>
<p>Rather than trying to cover all possible use cases of OpenID, it focuses on providing OpenID Relying Party support to applications using Django&#8217;s <a title="User authentication in Django" href="http://docs.djangoproject.com/en/dev/topics/auth/">django.contrib.auth</a> authentication system.  As such, it is usually enough to edit just two files in an existing application to enable OpenID login.</p>
<p>The library has a number of useful features:</p>
<ul>
<li>As well as the standard method of prompting the user for an identity URL, you can configure a fixed OpenID server URL.  This is useful for deployments where OpenID is being used for single sign on, and you always want users to log in using a particular OpenID provider.  Rather than asking the user for their identity URL, they are sent directly to the provider.</li>
<li>It can be configured to automatically create accounts when new identity URLs are seen.</li>
<li>User names, full names and email addresses can be set on accounts based on data sent via the <a href="http://openid.net/specs/openid-simple-registration-extension-1_1-01.html">OpenID Simple Registration</a> extension.</li>
<li>Support for <a href="https://launchpad.net/">Launchpad</a>&#8217;s Teams OpenID extension, which lets you query membership of Launchpad teams when authenticating against Launchpad&#8217;s OpenID provider.  Team memberships are mapped to Django group membership.</li>
</ul>
<p>While the code can be used for generic OpenID login, we&#8217;ve mostly been using it for single sign on.  The hope is that it will help members of the Ubuntu and Launchpad communities reuse our authentication system in a secure fashion.</p>
<p>The source code can be downloaded using the following <a href="http://bazaar-vcs.org/">Bazaar</a> command:</p>
<blockquote>
<pre>bzr branch lp:django-openid-auth</pre>
</blockquote>
<p>Documentation on how to integrate the library is available in the <tt>README.txt</tt> file.  The library includes some code written by <a href="http://simonwillison.net/">Simon Willison</a> for <a href="http://code.google.com/p/django-openid/">django-openid</a>, and uses the same licensing terms (2 clause BSD) as that project.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.gnome.org/jamesh/2009/04/14/django-openid-auth/feed/</wfw:commentRss>
		<slash:comments>11</slash:comments>
		</item>
		<item>
		<title>Getting &#8220;bzr send&#8221; to work with GMail</title>
		<link>http://blogs.gnome.org/jamesh/2009/01/16/bzr-send-gmail/</link>
		<comments>http://blogs.gnome.org/jamesh/2009/01/16/bzr-send-gmail/#comments</comments>
		<pubDate>Fri, 16 Jan 2009 09:19:31 +0000</pubDate>
		<dc:creator>James Henstridge</dc:creator>
				<category><![CDATA[Uncategorised]]></category>
		<category><![CDATA[Bazaar]]></category>
		<category><![CDATA[Python]]></category>

		<guid isPermaLink="false">http://blogs.gnome.org/jamesh/?p=406</guid>
		<description><![CDATA[One of the nice features of Bazaar is the ability to send a bundle of changes to someone via email.  If you use a supported mail client, it will even open the composer with the changes attached.  If your client isn&#8217;t supported, then it&#8217;ll let you compose a message in your editor and then send [...]]]></description>
			<content:encoded><![CDATA[<p>One of the nice features of <a href="http://bazaar-vcs.org/">Bazaar</a> is the ability to send a bundle of changes to someone via email.  If you use a supported mail client, it will even open the composer with the changes attached.  If your client isn&#8217;t supported, then it&#8217;ll let you compose a message in your editor and then send it to an SMTP server.</p>
<p>GMail is not a supported mail client, but there are a few workarounds <a href="http://bazaar-vcs.org/BzrSendWithGmail">listed on the wiki</a>.  Those really come down to using an alternative mail client (either the editor or Mutt) and sending the mails through the GMail SMTP server.  Neither solution really appealed to me.  There doesn&#8217;t seem to be a programmatic way of opening up GMail&#8217;s compose window and adding an attachment (not too surprising for a web app).</p>
<p>What is possible though is connecting via IMAP and adding messages to the drafts folder (assuming IMAP support is enabled).  So I wrote a small plugin to do just that.  It can be installed with the following command:</p>
<blockquote>
<pre>bzr branch lp:~jamesh/+junk/bzr-imapclient ~/.bazaar/plugins/imapclient</pre>
</blockquote>
<p>Then configure the IMAP server, username and mailbox according to the instructions in the README file.  You can then use &#8220;bzr send&#8221; as normal, and complete and send the draft at your leisure.</p>
<p>One nice thing about the plugin implementation is that it didn&#8217;t need any GMail specific features: it should be useful for anyone who has their drafts folder stored on an IMAP server and uses an unsupported mail client.</p>
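<p>The general technique can be sketched with nothing more than Python&#8217;s standard <tt>imaplib</tt> and <tt>email</tt> modules.  This is not the plugin&#8217;s code: the function names, addresses and the default mailbox name <tt>Drafts</tt> are assumptions, and the right mailbox name depends on the server:</p>

```python
import imaplib
import time
from email.message import EmailMessage


def build_draft(sender, recipient, subject, body, bundle, filename):
    """Build a message with the change bundle (bytes) attached."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    msg.add_attachment(bundle, maintype="application",
                       subtype="octet-stream", filename=filename)
    return msg


def save_draft(msg, host, username, password, mailbox="Drafts"):
    """Append msg to the given mailbox, flagged as a draft."""
    conn = imaplib.IMAP4_SSL(host)
    try:
        conn.login(username, password)
        conn.append(mailbox, "(\\Draft)",
                    imaplib.Time2Internaldate(time.time()), msg.as_bytes())
    finally:
        conn.logout()


# Building the message needs no network access; save_draft() does.
draft = build_draft("me@example.com", "you@example.com",
                    "[MERGE] fix typo", "Please review the attached bundle.",
                    b"# Bazaar merge directive ...\n", "changes.patch")
print(draft["Subject"])
```

<p>Nothing here is GMail specific: <tt>save_draft()</tt> only relies on the standard IMAP APPEND command.</p>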
<p>The main area where this could be improved would be to open up the compose screen in the web browser.  However, this would require knowing the internal message ID for the new message, which I can&#8217;t see how to access via IMAP.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.gnome.org/jamesh/2009/01/16/bzr-send-gmail/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Using Twisted Deferred objects with gio</title>
		<link>http://blogs.gnome.org/jamesh/2009/01/06/twisted-gio/</link>
		<comments>http://blogs.gnome.org/jamesh/2009/01/06/twisted-gio/#comments</comments>
		<pubDate>Tue, 06 Jan 2009 01:18:53 +0000</pubDate>
		<dc:creator>James Henstridge</dc:creator>
				<category><![CDATA[Uncategorised]]></category>
		<category><![CDATA[Gnome]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[Twisted]]></category>

		<guid isPermaLink="false">http://blogs.gnome.org/jamesh/?p=396</guid>
		<description><![CDATA[The gio library provides both synchronous and asynchronous interfaces for performing IO.  Unfortunately, the two APIs require quite different programming styles, making it difficult to convert code written to the simpler synchronous API to the asynchronous one.
For C programs this is unavoidable, but for Python we should be able to do better.  And if you&#8217;re [...]]]></description>
			<content:encoded><![CDATA[<p>The gio library provides both synchronous and asynchronous interfaces for performing IO.  Unfortunately, the two APIs require quite different programming styles, making it difficult to convert code written to the simpler synchronous API to the asynchronous one.</p>
<p>For C programs this is unavoidable, but for Python we should be able to do better.  And if you&#8217;re doing asynchronous event driven code in Python, it makes sense to look at <a href="http://twistedmatrix.com/">Twisted</a>.  In particular, Twisted&#8217;s Deferred objects can be quite helpful.</p>
<p><strong>Deferred</strong></p>
<p>The <a href="http://twistedmatrix.com/documents/8.2.0/api/twisted.internet.defer.Deferred.html">Twisted documentation</a> describes deferred objects as &#8220;a callback which will be put off until later&#8221;.  The deferred will eventually be passed the result of some operation, or information about how it failed.</p>
<p>From the consumer side, you can register one or more callbacks that will be run:</p>
<blockquote>
<pre><strong>def</strong> callback(result):
    # do stuff
    <strong>return</strong> result

deferred.addCallback(callback)</pre>
</blockquote>
<p>The first callback will be called with the original result, while subsequent callbacks will be passed the return value of the previous callback (this is why the above example returns its argument).  If the operation fails, one or more errbacks (error callbacks) will be called:</p>
<blockquote>
<pre><strong>def</strong> errback(failure):
    # do stuff
    <strong>return</strong> failure

deferred.addErrback(errback)</pre>
</blockquote>
<p>If the operation associated with the deferred has already been completed (or already failed) when the callback/errback is added, then it will be called immediately.  So there is no need to check if the operation is complete beforehand.</p>
<p><strong>Using Deferred objects with gio</strong></p>
<p>We can easily use gio&#8217;s asynchronous API to implement a new API based on deferred objects.  For example:</p>
<blockquote>
<pre><strong>import</strong> gio
<strong>from</strong> twisted.internet <strong>import</strong> defer

<strong>def</strong> file_read_deferred(file, io_priority=0, cancellable=None):
    d = defer.Deferred()
    <strong>def</strong> callback(file, async_result):
        <strong>try</strong>:
            in_stream = file.read_finish(async_result)
        <strong>except</strong> gio.Error:
            d.errback()
        <strong>else</strong>:
            d.callback(in_stream)
    file.read_async(callback, io_priority, cancellable)
    <strong>return</strong> d

<strong>def</strong> input_stream_read_deferred(in_stream, count, io_priority=0,
                               cancellable=None):
    d = defer.Deferred()
    <strong>def</strong> callback(in_stream, async_result):
        <strong>try</strong>:
            bytes = in_stream.read_finish(async_result)
        <strong>except</strong> gio.Error:
            d.errback()
        <strong>else</strong>:
            d.callback(bytes)
    # the argument order seems a bit weird here ...
    in_stream.read_async(count, callback, io_priority, cancellable)
    <strong>return</strong> d</pre>
</blockquote>
<p>This is a fairly simple transformation, so you might ask what this buys us.  We&#8217;ve gone from an interface where you pass a callback to the method to one where you pass a callback to the result of the method.  The answer is in the tools that Twisted provides for working with deferred objects.</p>
<p><strong>The inlineCallbacks decorator</strong></p>
<p>You&#8217;ve probably seen code examples that use Python&#8217;s generators to implement simple co-routines.  Twisted&#8217;s <tt>inlineCallbacks</tt> decorator basically implements this for generators that yield deferred objects.  It uses the enhanced generators feature from Python 2.5 (<a href="http://www.python.org/dev/peps/pep-0342/">PEP 342</a>) to pass the deferred result or failure back to the generator.  Using it, we can write code like this:</p>
<blockquote>
<pre><strong>import</strong> sys

@defer.inlineCallbacks
<strong>def</strong> print_contents(file, cancellable=None):
    in_stream = <strong>yield</strong> file_read_deferred(file, cancellable=cancellable)
    bytes = <strong>yield</strong> input_stream_read_deferred(
        in_stream, 4096, cancellable=cancellable)
    <strong>while</strong> bytes:
        # Do something with the data.  For this example, just print to stdout.
        sys.stdout.write(bytes)
        bytes = <strong>yield</strong> input_stream_read_deferred(
            in_stream, 4096, cancellable=cancellable)</pre>
</blockquote>
<p>Other than the use of the yield keyword, the above code looks quite similar to the equivalent synchronous implementation.  The only thing that would improve matters would be if these were real methods rather than helper functions.</p>
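<p>To show the generator mechanism this relies on, here is a deliberately simplified, synchronous driver.  It is not Twisted&#8217;s implementation: instead of deferred objects the generator yields plain zero-argument callables, and the names <tt>run_coroutine</tt>, <tt>make_reader</tt> and <tt>collect</tt> are made up for the example:</p>

```python
def run_coroutine(generator):
    """Drive a generator that yields zero-argument callables.

    Each yielded callable is invoked.  Its return value is sent back into
    the generator, and any exception it raises is thrown in at the yield
    point.  The generator's own return value is returned to the caller.
    """
    try:
        step = next(generator)
        while True:
            try:
                result = step()
            except Exception as exc:
                step = generator.throw(exc)
            else:
                step = generator.send(result)
    except StopIteration as stop:
        return stop.value


def make_reader(chunks):
    """Pretend stream: each call returns the next chunk, then ''."""
    iterator = iter(chunks)
    return lambda: next(iterator, "")


def collect(chunks):
    """Same shape as print_contents(), but gathers the data instead."""
    read = make_reader(chunks)
    parts = []
    data = yield read
    while data:
        parts.append(data)
        data = yield read
    return "".join(parts)


result = run_coroutine(collect(["spam, ", "eggs"]))
print(result)
```

<p>The real decorator does the same dance asynchronously: rather than calling the yielded object straight away, it registers a callback and errback on the yielded deferred and resumes the generator when one of them fires.</p>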
<p>Furthermore, the <tt>inlineCallbacks</tt> decorator causes the function to return a deferred that will fire when the function body finally completes or fails.  This makes it possible to use the function from within other asynchronous code in a similar fashion.  And once you&#8217;re using deferred results, you can mix in the gio calls with other Twisted asynchronous calls where it makes sense.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.gnome.org/jamesh/2009/01/06/twisted-gio/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Thoughts on OAuth</title>
		<link>http://blogs.gnome.org/jamesh/2008/10/23/thoughts-on-oauth/</link>
		<comments>http://blogs.gnome.org/jamesh/2008/10/23/thoughts-on-oauth/#comments</comments>
		<pubDate>Thu, 23 Oct 2008 03:46:53 +0000</pubDate>
		<dc:creator>James Henstridge</dc:creator>
				<category><![CDATA[Uncategorised]]></category>
		<category><![CDATA[OAuth]]></category>
		<category><![CDATA[OpenID]]></category>
		<category><![CDATA[Python]]></category>

		<guid isPermaLink="false">http://blogs.gnome.org/jamesh/?p=372</guid>
		<description><![CDATA[I&#8217;ve been playing with OAuth a bit lately.  The OAuth specification fulfills a role that some people saw as a failing of OpenID: programmatic access to websites and authenticated web services.  The expectation that OpenID would handle these cases seems a bit misguided since the two use cases are quite different:

OpenID is designed [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve been playing with <a href="http://oauth.net/">OAuth</a> a bit lately.  The OAuth specification fulfills a role that some people saw as a failing of <a href="http://openid.net/">OpenID</a>: programmatic access to websites and authenticated web services.  The expectation that OpenID would handle these cases seems a bit misguided since the two use cases are quite different:</p>
<ul>
<li>OpenID is designed on the principle of letting arbitrary OpenID providers talk to arbitrary relying parties and vice versa.</li>
<li>OpenID is intentionally vague about how the provider authenticates the user.  The only restriction is that the authentication must be able to fit into a web browsing session between the user and provider.</li>
</ul>
<p>While these are quite useful features for a decentralised user authentication scheme, the requirements for web service authentication are quite different:</p>
<ul>
<li>There is a tighter coupling between the service provider and client.  A client designed to talk to a photo sharing service won&#8217;t have much luck if you point it at a micro-blogging service.</li>
<li>Involving a web browser session in the authentication process for individual web service requests is not a workable solution: the client might be designed to run offline for instance.</li>
</ul>
<p>While the idea of a universal web services client is not achievable, there are areas of commonality between the different services: gaining authorisation from the user and authenticating individual requests.  This is the area that OAuth targets.</p>
<p>While it has different applications, it is possible to compare some of the choices made in the protocol:</p>
<ol>
<li>The secrets for request and access tokens are sent to the client in the clear.  So at a minimum, a service provider&#8217;s request token URL and access token URL should be served over SSL.  OpenID nominally avoids this by using <span class="info"><a href="http://en.wikipedia.org/wiki/Diffie-Hellman_key_exchange">Diffie-Hellman Key Exchange</a> to avoid eavesdropping, but ended up needing SSL anyway to avoid man in the middle attacks.  So sending them in the clear is probably a more honest approach.</span></li>
<li>Actual web service methods can be authenticated over plain HTTP in a fairly secure manner using the HMAC-SHA1 or RSA-SHA1 signature methods.  Although if you&#8217;re using SSL anyway, the PLAINTEXT authentication method is probably not any worse than HMAC-SHA1.</li>
<li>The authentication protocol supports both web applications and desktop applications.  Though any security gained through consumer secrets is invalidated for desktop applications, since anyone with a copy of the application will necessarily have access to the secrets.  A few other points follow on from this:
<ul>
<li><span class="info">The RSA-SHA1 signature method is not appropriate for use by desktop applications. The signature is based only on information available in the web service request and the RSA key associated with the consumer, and the private key will need to be distributed as part of the application.  So if an attacker discovers an access token (not access token secret), they can authenticate.</span></li>
<li>The other two authentication methods — HMAC-SHA1 and PLAINTEXT — depend on an access token secret.  Along with the access token, this is essentially a proxy for the user name and password, so should be protected as such (e.g. via the <a href="http://live.gnome.org/GnomeKeyring">GNOME keyring</a>).  It still sounds better than storing passwords directly, since the token won&#8217;t give access to unrelated sites the user happened to use the same password on, and can be revoked independently of changing the password.</li>
</ul>
</li>
<li>While the OpenID folks found a need for a formal extension mechanism for version 2.0 of that protocol, nothing like that seems to have been added to OAuth.  There are now a number of proposed extensions for OAuth, so it probably would have been a good idea.  Perhaps it isn&#8217;t as big a deal, due to tighter coupling of service providers and consumers, but I could imagine it being useful as the two parties evolve over time.</li>
</ol>
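<p>To make the HMAC-SHA1 method from the list above a little more concrete, here is a minimal sketch of how an OAuth 1.0 request signature is put together using only the standard library.  The function names, URL, keys and secrets are all made up, the URL is assumed to be normalised already, and repeated parameter names are not handled:</p>

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def encode(value):
    """Percent-encode everything except the unreserved characters."""
    return quote(str(value), safe="-._~")


def signature_base_string(method, url, params):
    """Join the method, URL and sorted, encoded parameters."""
    pairs = sorted((encode(k), encode(v)) for k, v in params.items())
    normalized = "&".join(k + "=" + v for k, v in pairs)
    return "&".join([method.upper(), encode(url), encode(normalized)])


def sign_hmac_sha1(method, url, params, consumer_secret, token_secret=""):
    """Sign the base string with the two secrets as the HMAC key."""
    key = encode(consumer_secret) + "&" + encode(token_secret)
    base = signature_base_string(method, url, params)
    digest = hmac.new(key.encode("ascii"), base.encode("ascii"),
                      hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


# Illustrative request: none of these values come from a real service.
params = {
    "oauth_consumer_key": "example-consumer",
    "oauth_token": "example-token",
    "oauth_signature_method": "HMAC-SHA1",
    "oauth_timestamp": "1224733613",
    "oauth_nonce": "4572616e48616d6d",
    "file": "vacation jpg",
}
signature = sign_hmac_sha1("GET", "http://example.com/photos", params,
                           "example-consumer-secret", "example-token-secret")
print(signature)
```

<p>Note that the access token secret never appears in the request itself, only in the key, which is why this method holds up over plain HTTP while PLAINTEXT does not.</p>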
<p>So the standard seems decent enough, and better than trying to design such a system yourself.  Like OpenID, it&#8217;ll probably take until the second release of the specification for some of the ambiguities to be taken care of and for wider adoption.</p>
<p>From the Python programmer point of view, things could be better.  The library available from the OAuth site seems quite immature and lacks support for a few aspects of the protocol.  It looks okay for simpler uses, but may be difficult to extend for use in more complicated projects.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.gnome.org/jamesh/2008/10/23/thoughts-on-oauth/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Django support landed in Storm</title>
		<link>http://blogs.gnome.org/jamesh/2008/09/19/django-support-landed-in-storm/</link>
		<comments>http://blogs.gnome.org/jamesh/2008/09/19/django-support-landed-in-storm/#comments</comments>
		<pubDate>Fri, 19 Sep 2008 06:23:51 +0000</pubDate>
		<dc:creator>James Henstridge</dc:creator>
				<category><![CDATA[Uncategorised]]></category>
		<category><![CDATA[Django]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[Storm]]></category>

		<guid isPermaLink="false">http://blogs.gnome.org/jamesh/2008/09/19/django-support-landed-in-storm/</guid>
		<description><![CDATA[Since my last article on integrating Storm with Django, I&#8217;ve merged my changes to Storm&#8217;s trunk.  This missed the 0.13 release, so you&#8217;ll need to use Bazaar to get the latest trunk or wait for 0.14.
The focus since the last post was to get Storm to cooperate with Django&#8217;s built in ORM.  One of the [...]]]></description>
			<content:encoded><![CDATA[<p>Since <a href="http://blogs.gnome.org/jamesh/2008/08/01/using-storm-with-django/">my last article</a> on integrating <a href="http://storm.canonical.com/">Storm</a> with <a href="http://www.djangoproject.com/">Django</a>, I&#8217;ve merged my changes to Storm&#8217;s trunk.  This missed the 0.13 release, so you&#8217;ll need to use Bazaar to get the latest trunk or wait for 0.14.</p>
<p>The focus since the last post was to get Storm to cooperate with Django&#8217;s built in ORM.  One of the reasons people use Django is the existing components that can be used to build a site.  This ranges from the included user management and administration code to full <a href="http://www.satchmoproject.com/">web shop implementations</a>.  So even if you plan to use Storm for your Django application, your application will most likely use Django&#8217;s ORM for some things.</p>
<p>When I last posted about this code, it was possible to use both ORMs in a single app, but they would use separate database connections.  This had a number of disadvantages:</p>
<ul>
<li>The two connections would be running separate transactions in parallel, so changes made by one connection would not be visible to the other connection until after the transaction was complete.  This is a problem when updating records in one table that reference rows that are being updated on the other connection.</li>
<li>When you have more than one connection, you introduce a new failure mode where one transaction may successfully commit but the other fail, leaving you with only half the changes being recorded.  This can be fixed by using two phase commit, but that is not supported by either Django or Storm at this point in time.</li>
</ul>
<p>So it is desirable to have the two ORMs sharing a single connection.  The way I&#8217;ve implemented this is as a Django database engine backend that uses the connection for a particular named per-thread store and passes transaction commit or rollback requests through to the global transaction manager.  Configuration is as simple as:</p>
<blockquote>
<pre>DATABASE_ENGINE = 'storm.django.backend'
DATABASE_NAME = 'store-name'
STORM_STORES = {'store-name': 'database-uri'}</pre>
</blockquote>
<p>This will work for PostgreSQL or MySQL connections: Django requires some additional set up for SQLite connections that Storm doesn&#8217;t do.</p>
<p>Once this is configured, things mostly just work.  As Django and Storm both maintain caches of data retrieved from the database though, accessing the same table with both ORMs could give unpredictable results.  My code doesn&#8217;t attempt to solve this problem so it is probably best to access tables with only one ORM or the other.</p>
<p>I suppose the next step here would be to implement something similar to Storm&#8217;s <tt>Reference</tt> class to represent links between objects managed by Storm and objects managed by Django and vice versa.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.gnome.org/jamesh/2008/09/19/django-support-landed-in-storm/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Transaction Management in Django</title>
		<link>http://blogs.gnome.org/jamesh/2008/09/01/transaction-management-in-django/</link>
		<comments>http://blogs.gnome.org/jamesh/2008/09/01/transaction-management-in-django/#comments</comments>
		<pubDate>Mon, 01 Sep 2008 07:42:39 +0000</pubDate>
		<dc:creator>James Henstridge</dc:creator>
				<category><![CDATA[Uncategorised]]></category>
		<category><![CDATA[Django]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[Storm]]></category>

		<guid isPermaLink="false">http://blogs.gnome.org/jamesh/2008/09/01/transaction-management-in-django/</guid>
		<description><![CDATA[In my previous post about Django, I mentioned that I found the transaction handling strategy in Django to be a bit surprising.
Like most object relational mappers, it caches information retrieved from the database, since you don&#8217;t want to be constantly issuing SELECT queries for every attribute access.  However, it defaults to committing after saving [...]]]></description>
			<content:encoded><![CDATA[<p>In <a href="http://blogs.gnome.org/jamesh/2008/08/01/using-storm-with-django/">my previous post about Django</a>, I mentioned that I found the transaction handling strategy in <a href="http://www.djangoproject.com/">Django</a> to be a bit surprising.</p>
<p>Like most <a href="http://en.wikipedia.org/wiki/Object-relational_mapping">object relational mappers</a>, it caches information retrieved from the database, since you don&#8217;t want to be constantly issuing SELECT queries for every attribute access.  However, it defaults to committing after saving changes to each object.  So a single web request might end up issuing many transactions:</p>
<table valign="middle" align="center" border="0" cellpadding="4" cellspacing="0">
<tr>
<td bgcolor="#dddddd">Change object 1</td>
<td bgcolor="#dddddd">Transaction 1</td>
</tr>
<tr>
<td>Change object 2</td>
<td>Transaction 2</td>
</tr>
<tr>
<td bgcolor="#dddddd">Change object 3</td>
<td bgcolor="#dddddd">Transaction 3</td>
</tr>
<tr>
<td>Change object 4</td>
<td>Transaction 4</td>
</tr>
<tr>
<td bgcolor="#dddddd">Change object 5</td>
<td bgcolor="#dddddd">Transaction 5</td>
</tr>
</table>
<p>Unless no one else is accessing the database, there is a chance that other users could modify objects that the ORM has cached over the transaction boundaries.  This also makes it difficult to test your application in any meaningful way, since it is hard to predict what changes will occur at those points.  Django does offer a few ways to get better transactional behaviour.</p>
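<p>To make the difference between the two commit strategies concrete, here is a minimal sketch using the standard library <tt>sqlite3</tt> module rather than Django itself.  The <tt>apply_changes()</tt> helper and the <tt>obj</tt> table are made up for illustration: a "request" tries to change five objects and fails on the fourth, once committing after every change and once using a single transaction.</p>

```python
import sqlite3


def apply_changes(commit_each):
    """Simulate one request changing five objects and failing on the fourth.

    Hypothetical helper for illustration only; it does not use Django.
    Returns the number of changes that survived in the database.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE obj (id INTEGER PRIMARY KEY, value TEXT)")
    conn.commit()
    try:
        for i in range(1, 6):
            if i == 4:
                raise RuntimeError("request failed part-way through")
            conn.execute("INSERT INTO obj VALUES (?, ?)", (i, "changed"))
            if commit_each:
                # One transaction per change, as in the first table above.
                conn.commit()
        conn.commit()
    except RuntimeError:
        # Only uncommitted work can be undone here.
        conn.rollback()
    surviving = conn.execute("SELECT COUNT(*) FROM obj").fetchone()[0]
    conn.close()
    return surviving


partial = apply_changes(commit_each=True)       # first three changes persist
all_or_nothing = apply_changes(commit_each=False)  # nothing persists
```

<p>With a commit after each change the failed request leaves three of its changes behind, while the single transaction leaves none.</p>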
<p><strong>The @commit_on_success Decorator</strong></p>
<p>The first is a decorator that turns on manual transaction management for the duration of the function and does a commit or rollback when it completes depending on whether an exception was raised. In the above example, if the middle three operations were made inside a <tt>@commit_on_success</tt> function, it would look something like this:</p>
<table valign="middle" align="center" border="0" cellpadding="4" cellspacing="0">
<tr>
<td bgcolor="#dddddd">Change object 1</td>
<td bgcolor="#dddddd">Transaction 1</td>
</tr>
<tr>
<td>Change object 2</td>
<td rowspan="3">Transaction 2</td>
</tr>
<tr>
<td>Change object 3</td>
</tr>
<tr>
<td>Change object 4</td>
</tr>
<tr>
<td bgcolor="#dddddd">Change object 5</td>
<td bgcolor="#dddddd">Transaction 3</td>
</tr>
</table>
<p>Note that the decorator is usually used on view functions, so it will cover most of the request.  That said, there are a number of cases where extra work might be done outside of the function.  Some examples include work done in middleware classes and views that call other view functions.</p>
<p><strong>The TransactionMiddleware class</strong></p>
<p>Another alternative is to install the <tt>TransactionMiddleware</tt> middleware class for the site.  This turns on transaction management for the duration of each request, similar to what you&#8217;d see with other frameworks, giving results something like this:</p>
<table valign="middle" align="center" border="0" cellpadding="4" cellspacing="0">
<tr>
<td bgcolor="#dddddd">Change object 1</td>
<td rowspan="5" bgcolor="#dddddd">Transaction 1</td>
</tr>
<tr>
<td bgcolor="#dddddd">Change object 2</td>
</tr>
<tr>
<td bgcolor="#dddddd">Change object 3</td>
</tr>
<tr>
<td bgcolor="#dddddd">Change object 4</td>
</tr>
<tr>
<td bgcolor="#dddddd">Change object 5</td>
</tr>
</table>
<p><strong>Combining @commit_on_success and TransactionMiddleware</strong></p>
<p>At first, it would appear that these two approaches cover pretty much everything you&#8217;d want.  But there are problems when you combine the two.  If we use the <tt>@commit_on_success</tt> decorator as before and <tt>TransactionMiddleware</tt>, we get the following set of transactions:</p>
<table valign="middle" align="center" border="0" cellpadding="4" cellspacing="0">
<tr>
<td bgcolor="#dddddd">Change object 1</td>
<td rowspan="4" bgcolor="#dddddd">Transaction 1</td>
</tr>
<tr>
<td bgcolor="#dddddd">Change object 2</td>
</tr>
<tr>
<td bgcolor="#dddddd">Change object 3</td>
</tr>
<tr>
<td bgcolor="#dddddd">Change object 4</td>
</tr>
<tr>
<td>Change object 5</td>
<td>Transaction 2</td>
</tr>
</table>
<p>The transaction for the <tt>@commit_on_success</tt> function has extended to cover the operations made beforehand.  This also means that operations #1 and #5 are now in separate transactions despite the use of <tt>TransactionMiddleware</tt>.  The problem also occurs with nested use of <tt>@commit_on_success</tt>, as reported in <a href="http://code.djangoproject.com/ticket/2227">Django bug 2227</a>.</p>
<p>A better behaviour for nested transaction management would be something like this:</p>
<ol>
<li>On success, do nothing.  The changes will be committed by the outside caller.</li>
<li>On failure, do not abort the transaction, but instead mark it as uncommittable.  This would have similar semantics to the Zope <tt>transaction.doom()</tt> function.</li>
</ol>
<p>It is important that the nested call does not abort the transaction because that would cause a new transaction to be started by subsequent code: that should be left to the code that began the transaction.</p>
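<p>The two rules above could be sketched roughly as follows.  This is a toy model, not Django or Zope code: the <tt>Transaction</tt> class only records commits and rollbacks, and the <tt>managed()</tt> decorator is a hypothetical name for a nesting-aware version of <tt>@commit_on_success</tt>.</p>

```python
import functools


class Transaction:
    """Toy stand-in for a real transaction; it only records what happened."""

    def __init__(self):
        self.depth = 0       # number of managed calls currently active
        self.doomed = False  # set when a nested call fails
        self.log = []        # "commit" / "rollback" events, for inspection

    def commit(self):
        self.log.append("commit")

    def rollback(self):
        self.log.append("rollback")


def managed(txn):
    """Hypothetical decorator: only the outermost call ends the transaction."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            outermost = txn.depth == 0
            txn.depth += 1
            try:
                return func(*args, **kwargs)
            except Exception:
                # Rule 2: mark the transaction uncommittable, do not abort it.
                txn.doomed = True
                raise
            finally:
                txn.depth -= 1
                if outermost:
                    # Rule 1: nested successes do nothing; the outermost
                    # caller decides between commit and rollback.
                    if txn.doomed:
                        txn.rollback()
                    else:
                        txn.commit()
                    txn.doomed = False
        return wrapper
    return decorator


txn = Transaction()


@managed(txn)
def inner_ok():
    return "ok"


@managed(txn)
def inner_fail():
    raise ValueError("nested operation failed")


@managed(txn)
def outer_ok():
    inner_ok()
    inner_ok()


@managed(txn)
def outer_swallows_error():
    try:
        inner_fail()
    except ValueError:
        pass  # the outer code carries on, but the transaction is doomed
```

<p>Calling <tt>outer_ok()</tt> records a single commit even though it makes two nested calls, and <tt>outer_swallows_error()</tt> records a single rollback at the outermost level even though the outer function caught the exception.</p>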
<p><strong>The @autocommit decorator</strong></p>
<p>While the above interaction looks like a simple bug, the <tt>@autocommit</tt> decorator is another matter.  It turns autocommit on for the duration of a function call, no matter what the transaction mode for the caller was.  If we took the original example and wrapped the middle three operations with <tt>@autocommit</tt> and used <tt>TransactionMiddleware</tt>, we&#8217;d get 4 transactions: one for the first two operations, then one for each of the remaining operations.</p>
<p>I can&#8217;t think of a situation where it would make sense to use it, and I wonder if it was just added for completeness.</p>
<p><strong>Conclusion</strong></p>
<p>While the nesting bugs remain, my recommendation would be to go for the <tt>TransactionMiddleware</tt> and avoid use of the decorators (both in your own code and third party components).  If you are writing reusable code that requires transactions, it is probably better to assert that <tt>django.db.transaction.is_managed()</tt> is true so that you get a failure for improperly configured systems while not introducing unwanted transaction boundaries.</p>
<p>For the <a href="http://storm.canonical.com/">Storm</a> integration work I&#8217;m doing, I&#8217;ve set it to use managed transaction mode to avoid most of the unwanted commits, but it still falls prey to the extra commits when using the decorators.  So I guess inspecting the code is still necessary.  If anyone has other tips, I&#8217;d be glad to hear them.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.gnome.org/jamesh/2008/09/01/transaction-management-in-django/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Storm 0.13</title>
		<link>http://blogs.gnome.org/jamesh/2008/08/29/storm-013/</link>
		<comments>http://blogs.gnome.org/jamesh/2008/08/29/storm-013/#comments</comments>
		<pubDate>Fri, 29 Aug 2008 08:21:20 +0000</pubDate>
		<dc:creator>James Henstridge</dc:creator>
				<category><![CDATA[Uncategorised]]></category>
		<category><![CDATA[Launchpad]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[Storm]]></category>
		<category><![CDATA[Zope]]></category>

		<guid isPermaLink="false">http://blogs.gnome.org/jamesh/2008/08/29/storm-013/</guid>
		<description><![CDATA[Yesterday, Thomas rolled the 0.13 release of Storm, which can be downloaded from Launchpad.  Storm is the object relational mapper for Python used by Launchpad and Landscape, so it is capable of supporting quite large scale applications.  It has been seven months since the last release, so there are a lot of improvements.  Here are a [...]]]></description>
			<content:encoded><![CDATA[<p>Yesterday, Thomas rolled the 0.13 release of <a href="http://storm.canonical.com/">Storm</a>, which can be <a href="https://launchpad.net/storm/trunk/0.13">downloaded from Launchpad</a>.  Storm is the object relational mapper for <a href="http://www.python.org/">Python</a> used by <a href="https://launchpad.net/">Launchpad</a> and <a href="http://www.canonical.com/projects/landscape">Landscape</a>, so it is capable of supporting quite large scale applications.  It has been seven months since the last release, so there are a lot of improvements.  Here are a few simple statistics:</p>
<table border="0" cellpadding="3">
<tr>
<td></td>
<th align="right">0.12</th>
<th align="right">0.13</th>
<th align="right">Change</th>
</tr>
<tr>
<td>Tarball size (KB)</td>
<td align="right">117</td>
<td align="right">155</td>
<td align="right">38</td>
</tr>
<tr>
<td>Mainline revisions</td>
<td align="right">213</td>
<td align="right">262</td>
<td align="right">49</td>
</tr>
<tr>
<td>Revisions in ancestry</td>
<td align="right">552</td>
<td align="right">875</td>
<td align="right">323</td>
</tr>
</table>
<p>So it is a fairly significant update by any of these metrics.  Among the new features are:</p>
<ul>
<li>Infrastructure for tracing the SQL statements issued by Storm.  Sample tracer implementations are provided to implement bounded statement run times and for logging statements (both features used for QA of Launchpad).</li>
<li>A validation framework.  The property constructors take a validator keyword argument, which should be a function taking arguments (object, attr_name, value) and returning the value to set.  If the function raises an exception, it can prevent a value from being set.  By returning something different to its third argument it can transform values.</li>
<li>The <tt>find()</tt> and <tt>ResultSet</tt> API has been extended to make it possible to generate queries that use <tt>GROUP BY</tt> and <tt>HAVING</tt>.  The primary use case is result sets that contain an object plus some aggregates associated with that object.</li>
<li>Some core parts of Storm have been accelerated through a C extension.  This code is turned off by default, but can be enabled by setting the <tt>STORM_CEXTENSIONS</tt> environment variable to 1.  Although it is not yet on by default, it is pretty stable.  Barring any serious problems reported over the next release cycle, I&#8217;d expect it to be enabled by default for the next release.</li>
<li>The minimum dependencies of the <tt>storm.zope.zstorm</tt> module have been reduced to just the <tt>zope.interface</tt> and <tt>transaction</tt> modules.  This makes it easier to use the per-thread store management code and global transaction management outside of Zope apps (e.g. for <a href="http://blogs.gnome.org/jamesh/2008/08/01/using-storm-with-django/">integrating with Django</a>).</li>
</ul>
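<p>As a rough illustration of the validator signature described in the list above, here is a small function that both rejects and transforms values.  The name <tt>normalise_name</tt> and the <tt>Person</tt> class in the comment are made up for this example; the function itself is plain Python and does not need Storm to run.</p>

```python
def normalise_name(obj, attr_name, value):
    """Validator taking (object, attr_name, value) and returning the value to set."""
    if value is None:
        return None
    cleaned = value.strip()
    if not cleaned:
        # Raising an exception prevents the value from being set.
        raise ValueError("%s must not be empty" % attr_name)
    # Returning something other than the original value transforms it.
    return cleaned


# Assumed usage with a Storm property (not executed here):
#
#     from storm.locals import Int, Unicode
#
#     class Person(object):
#         __storm_table__ = "person"
#         id = Int(primary=True)
#         name = Unicode(validator=normalise_name)
```

<p>With a property declared like the one in the comment, assigning a padded string would store the stripped text, and assigning a blank string would raise <tt>ValueError</tt> instead of changing the attribute.</p>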
<p>It doesn&#8217;t include my Django integration code though, since that isn&#8217;t fully baked.  I&#8217;ll post some more about that later.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.gnome.org/jamesh/2008/08/29/storm-013/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Using Storm with Django</title>
		<link>http://blogs.gnome.org/jamesh/2008/08/01/using-storm-with-django/</link>
		<comments>http://blogs.gnome.org/jamesh/2008/08/01/using-storm-with-django/#comments</comments>
		<pubDate>Fri, 01 Aug 2008 09:23:16 +0000</pubDate>
		<dc:creator>James Henstridge</dc:creator>
				<category><![CDATA[Uncategorised]]></category>
		<category><![CDATA[Django]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[Storm]]></category>
		<category><![CDATA[Zope]]></category>

		<guid isPermaLink="false">http://blogs.gnome.org/jamesh/2008/08/01/using-storm-with-django/</guid>
		<description><![CDATA[I&#8217;ve been playing around with Django a bit for work recently, and it has been interesting to see what choices they&#8217;ve made differently to Zope 3.  There were a few things that surprised me:

The ORM and database layer defaults to autocommit mode rather than using transactions.  This seems like an odd choice given that all the [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve been playing around with <a href="http://www.djangoproject.com/">Django</a> a bit for work recently, and it has been interesting to see what choices they&#8217;ve made differently to <a href="http://wiki.zope.org/zope3/">Zope 3</a>.  There were a few things that surprised me:</p>
<ul>
<li>The ORM and database layer defaults to autocommit mode rather than using transactions.  This seems like an odd choice given that all the major free databases support transactions these days.  While autocommit might work fine when a web application is under light use, it is a recipe for problems at higher loads.  By using transactions that last for the duration of the request, the testing you do is more likely to help with the high load situations.</li>
<li>While there is a middleware class to enable request-duration transactions, it only covers the database connection.  There is no global transaction manager to coordinate multiple DB connections or other resources.</li>
<li>The ORM appears to only support a single connection for a request.  While this is the most common case and should be easy to code with, allowing an application to expand past this limit seems prudent.</li>
<li>The tutorial promotes schema generation from Python models, which I feel is the wrong choice for any application that is likely to evolve over time (i.e. pretty much every application).  I&#8217;ve <a href="http://blogs.gnome.org/jamesh/2007/09/28/orm-schema-generation/">written about this previously</a> and believe that migration based schema management is a more workable solution.</li>
<li>It poorly <a href="http://blogs.gnome.org/jamesh/2008/06/11/tls-python/">reinvents thread local storage</a> in a few places.  This isn&#8217;t too surprising for things that existed prior to Python 2.4, and probably isn&#8217;t a problem for its default mode of operation.</li>
</ul>
<p>Other than these things I&#8217;ve noticed so far, it looks like a nice framework.</p>
<p><strong>Integrating Storm</strong></p>
<p>I&#8217;ve been doing a bit of work to make it easy to use <a href="http://storm.canonical.com/">Storm</a> with Django.  I posted some initial details <a href="http://thread.gmane.org/gmane.comp.python.storm/673">on the mailing list</a>.  The initial code has been <a href="https://code.launchpad.net/~jamesh/storm/django-support">published on Launchpad</a> but is not yet ready to merge. Some of the main details include:</p>
<ul>
<li>A middleware class that integrates the Zope global transaction manager (which requires just the zope.interface and transaction packages).  There doesn&#8217;t appear to be any equivalent functionality in Django, and this made it possible to reuse the existing integration code (an approach that has been taken to use Storm with <a href="http://pylonshq.com/">Pylons</a>).  It will also make it easier to take advantage of other future improvements (e.g. only committing stores that are used in a transaction, two phase commit).</li>
<li>Stores can be configured through the application&#8217;s Django settings file, and are managed as long lived per-thread connections.</li>
<li>A simple get_store(name) function is provided for accessing per-thread stores within view code.</li>
</ul>
<p>What this doesn&#8217;t do yet is provide much integration with existing Django functionality (e.g. django.contrib.admin).  I plan to try and get some of these bits working in the near future.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.gnome.org/jamesh/2008/08/01/using-storm-with-django/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>How not to do thread local storage with Python</title>
		<link>http://blogs.gnome.org/jamesh/2008/06/11/tls-python/</link>
		<comments>http://blogs.gnome.org/jamesh/2008/06/11/tls-python/#comments</comments>
		<pubDate>Wed, 11 Jun 2008 10:00:21 +0000</pubDate>
		<dc:creator>James Henstridge</dc:creator>
				<category><![CDATA[Uncategorised]]></category>
		<category><![CDATA[Python]]></category>

		<guid isPermaLink="false">http://blogs.gnome.org/jamesh/2008/06/11/tls-python/</guid>
		<description><![CDATA[The Python standard library contains a function called thread.get_ident().  It will return an integer that uniquely identifies the current thread at that point in time.  On most UNIX systems, this will be the pthread_t value returned by pthread_self().  At first look, this might seem like a good value to key a thread local storage [...]]]></description>
			<content:encoded><![CDATA[<p>The <a href="http://www.python.org/">Python</a> standard library contains a function called <tt>thread.get_ident()</tt>.  It will return an integer that uniquely identifies the current thread at that point in time.  On most UNIX systems, this will be the <tt>pthread_t</tt> value returned by <tt>pthread_self()</tt>.  At first look, this might seem like a good value to key a thread local storage dictionary with.  <em>Please</em> don&#8217;t do that.</p>
<p>The value uniquely identifies the thread only as long as it is running.  The value can be reused after the thread exits.  On my system, this happens quite reliably with the following sample program printing the same ID ten times:</p>
<blockquote>
<pre><strong>import</strong> thread, threading

<strong>def</strong> foo():
    <strong>print</strong> 'Thread ID:', thread.get_ident()

<strong>for</strong> i <strong>in</strong> range(10):
    t = threading.Thread(target=foo)
    t.start()
    t.join()</pre>
</blockquote>
<p>If the return value of <tt>thread.get_ident()</tt> was used to key thread local storage, all ten threads would share the same storage.  This is not generally considered to be desirable behaviour.</p>
<p>Assuming that you can depend on Python 2.4 (released 3.5 years ago), then just use a <a href="http://www.python.org/doc/current/lib/module-threading.html"><tt>threading.local</tt></a> object.  It will result in simpler code, correctly handle serially created threads, and you won&#8217;t hold onto TLS data past the exit of a thread.</p>
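<p>For comparison, here is a minimal sketch of the same serial-thread test using <tt>threading.local</tt>, written in Python 3 syntax.  The <tt>tls</tt>, <tt>seen</tt> and <tt>worker</tt> names are just for illustration.  Each thread checks whether a value left by an earlier thread is visible before storing its own.</p>

```python
import threading

tls = threading.local()   # one independent namespace per thread
seen = []                 # (thread number, saw stale data?, value stored)


def worker(n):
    # A fresh thread should never see an attribute set by an earlier
    # thread, even if the operating system reuses the thread ID.
    had_stale_value = hasattr(tls, "value")
    tls.value = n
    seen.append((n, had_stale_value, tls.value))


for i in range(10):
    t = threading.Thread(target=worker, args=(i,))
    t.start()
    t.join()

for n, had_stale_value, value in seen:
    print("Thread %d: stale data visible: %s, stored: %d"
          % (n, had_stale_value, value))
```

<p>None of the ten threads sees data from its predecessors, and the main thread never sees a <tt>value</tt> attribute at all.</p>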
<p>You will save yourself (or another developer) a lot of time at some point in the future.  Debugging these problems is not fun when you combine code doing proper TLS with other code doing broken TLS.</p>
]]></content:encoded>
			<wfw:commentRss>http://blogs.gnome.org/jamesh/2008/06/11/tls-python/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
	</channel>
</rss>
