django-openid-auth

Last week, we released the source code to django-openid-auth.  This is a small library that can add OpenID based authentication to Django applications.  It has been used for a number of internal Canonical projects, including the sprint scheduler Scott wrote for the last Ubuntu Developer Summit, so it is possible you’ve already used the code.

Rather than trying to cover all possible use cases of OpenID, it focuses on providing OpenID Relying Party support to applications using Django’s django.contrib.auth authentication system.  As such, it is usually enough to edit just two files in an existing application to enable OpenID login.

The library has a number of useful features:

  • As well as the standard method of prompting the user for an identity URL, you can configure a fixed OpenID server URL.  This is useful for deployments where OpenID is being used for single sign on, and you always want users to log in using a particular OpenID provider.  Rather than asking the user for their identity URL, they are sent directly to the provider.
  • It can be configured to automatically create accounts when new identity URLs are seen.
  • User names, full names and email addresses can be set on accounts based on data sent via the OpenID Simple Registration extension.
  • Support for Launchpad‘s Teams OpenID extension, which lets you query membership of Launchpad teams when authenticating against Launchpad’s OpenID provider.  Team memberships are mapped to Django group membership.

While the code can be used for generic OpenID login, we’ve mostly been using it for single sign on.  The hope is that it will help members of the Ubuntu and Launchpad communities reuse our authentication system in a secure fashion.

The source code can be downloaded using the following Bazaar command:

bzr branch lp:django-openid-auth

Documentation on how to integrate the library is available in the README.txt file.  The library includes some code written by Simon Willison for django-openid, and uses the same licensing terms (2 clause BSD) as that project.

Re: Continuing to Not Quite Get It at Google…

David: taking a quick look at Google’s documentation, it sure looks like OpenID to me.  The main items of note are:

  1. It documents the use of OpenID 2.0’s directed identity mode.  Yes this is “a departure from the process outlined in OpenID 1.0”, but that could be considered true of all new features found in 2.0.  Google certainly isn’t the first to implement this feature:
    • Yahoo’s OpenID page recommends users enter “yahoo.com” in the identity box on web sites, which will initiate a directed identity authentication request.
    • We’ve been using directed identity with Launchpad to implement single sign on for various Canonical/Ubuntu sites.

    Given that Google account holders identify themselves by email address, users aren’t likely to know a URL to enter, so this kind of makes sense.

  2. The identity URLs returned by the OpenID provider do not directly reveal information about the user, containing a long random string to differentiate between users.  If the relying party wants any user details, they must request them via the standard OpenID Attribute Exchange protocol.
  3. They are performing access control based on the OpenID realm of the relying party.  I can understand doing this in the short term, as it gives them a way to handle a migration should they make an incompatible change during the beta.  If they continue to restrict access after the beta, you might have a valid concern.

It looks like there would be no problem talking to their provider using existing off the shelf OpenID libraries (like the ones from JanRain).

If you have an existing site using OpenID for login, chances are that after registering the realm with Google you’d be able to log in by entering Google’s OP server URL.  At that point, it’d be fairly trivial to add another button to the login page – sites seem pretty happy to plaster provider-specific radio buttons and entry boxes all over the page already …

Thoughts on OAuth

I’ve been playing with OAuth a bit lately. The OAuth specification fulfills a role that some people saw as a failing of OpenID: programmatic access to websites and authenticated web services. The expectation that OpenID would handle these cases seems a bit misguided since the two uses cases are quite different:

  • OpenID is designed on the principle of letting arbitrary OpenID providers talk to arbitrary relying parties and vice versa.
  • OpenID is intentionally vague about how the provider authenticates the user. The only restriction is that the authentication must be able to fit into a web browsing session between the user and provider.

While these are quite useful features for a decentralised user authentication scheme, the requirements for web service authentication are quite different:

  • There is a tighter coupling between the service provider and client. A client designed to talk to a photo sharing service won’t have much luck if you point it at a micro-blogging service.
  • Involving a web browser session in the authentication process for individual web service request is not a workable solution: the client might be designed to run offline for instance.

While the idea of a universal web services client is not achievable, there are areas of commonality between different the services: gaining authorisation from the user and authenticating individual requests. This is the area that OAuth targets.

While it has different applications, it is possible to compare some of the choices made in the protocol:

  1. The secrets for request and access tokens are sent to the client in the clear. So at a minimum, a service provider’s request token URL and access token URL should be served over SSL. OpenID nominally avoids this by using Diffie-Hellman Key Exchange to avoid evesdropping, but ended up needing it to avoid man in the middle attacks. So sending them in the clear is probably a more honest approach.
  2. Actual web service methods can be authenticated over plain HTTP in a fairly secure means using the HMAC-SHA1 or RSA-SHA1 signature methods. Although if you’re using SSL anyway, the PLAINTEXT authentication method is probably not any worse than HMAC-SHA1.
  3. The authentication protocol supports both web applications and desktop applications. Though any security gained through consumer secrets is invalidated for desktop applications, since anyone with a copy of the application will necessarily have access to the secrets. A few other points follow on from this:
    • The RSA-SHA1 signature method is not appropriate for use by desktop applications. The signature is based only on information available in the web service request and the RSA key associated with the consumer, and the private key will need to be distributed as part of the application. So if an attacker discovers an access token (not access token secret), they can authenticate.
    • The other two authentication methods — HMAC-SHA1 and PLAINTEXT — depend on an access token secret. Along with the access token, this is essentially a proxy for the user name and password, so should be protected as such (e.g. via the GNOME keyring).  It still sounds better than storing passwords directly, since the token won’t give access to unrelated sites the user happened to use the same password on, and can be revoked independently of changing the password.
  4. While the OpenID folks found a need for a formal extension mechanism for version 2.0 of that protocol, nothing like that seems to have been added to OAuth.  There are now a number of proposed extensions for OAuth, so it probably would have been a good idea.  Perhaps it isn’t as big a deal, due to tigher coupling of service providers and consumers, but I could imagine it being useful as the two parties evolve over time.

So the standard seems decent enough, and better than trying to design such a system yourself.  Like OpenID, it’ll probably take until the second release of the specification for some of the ambiguities to be taken care of and for wider adoption.

From the Python programmer point of view, things could be better.  The library available from the OAuth site seems quite immature and lacks support for a few aspects of the protocol.  It looks okay for simpler uses, but may be difficult to extend for use in more complicated projects.

Using email addresses as OpenID identities (almost)

On the OpenID specs mailing list, there was another discussion about using email addresses as OpenID identifiers. So far it has mostly covered existing ground, but there was one comment that interested me: a report that you can log in to many OpenID RPs by entering a Yahoo email address.

Now there certainly isn’t any Yahoo-specific code in the standard OpenID libraries, so you might wonder what is going on here. We can get some idea by using the python-openid library:

>>> from openid.consumer.discover import discover
>>> claimed_id, services = discover('example@yahoo.com')
>>> claimed_id
'http://www.yahoo.com/'
>>> services[0].type_uris
['http://specs.openid.net/auth/2.0/server',
 'http://specs.openid.net/extensions/pape/1.0']
>>> services[0].server_url
'https://open.login.yahooapis.com/openid/op/auth'
>>> services[0].isOPIdentifier()
True

So we can see that running the discovery algorithm on the email address has resulted in Yahoo’s standard identifier select endpoint. What we’ve actually seen here is the effect of Section 7.2 at work:

3. Otherwise, the input SHOULD be treated as an http URL; if it does not include a “http” or “https” scheme, the Identifier MUST be prefixed with the string “http://”.

So the email address is normalised to the URL http://example@yahoo.com (which is treated the same as http://yahoo.com/), which is then used for discovery. As shown above, this results in an identifier select request so works for all Yahoo users.

I wonder if the Yahoo developers realised that this would happen and set things up accordingly? If not, then this is a happy accident. It isn’t quite the same as having support for email addresses in OpenID since the user may end up having to enter their email address a second time in the OP if they don’t already have a session cookie.

It is certainly better than the RP presenting an error if the user accidentally enters an email address into the identity field. It seems like something that any OP offering email addresses to its users should implement.

Client Side OpenID

The following article discusses ideas that I wouldn’t even class as vapourware, as I am not proposing to implement them myself. That said, the ideas should still be implementable if anyone is interested.

One well known security weakness in OpenID is its weakness to phishing attacks. An OpenID authentication request is initiated by the user entering their identifier into the Relying Party, which then hands control to the user’s OpenID Provider through an HTTP redirect or form post. A malicious RP may instead forward the user to a site that looks like the user’s OP and record any information they enter. As the user provided their identifier, the RP knows exactly what site to forge.

Out Of Band Authorisation

One way around this is for the OP to authenticate the user and get authorisation out of band — just because the authentication message begins and ends with HTTP requests does not mean that the actual authentication/authorisation need be done through the web browser.

Possibilities include performing the authorisation via a Jabber message or SMS, or some special purpose protocol. Once authorisation is granted, the OP would need to send the OpenID response. Two ways for the web browser to detect this would be polling via AJAX, or using a server-push technique like Comet.

Using a Browser Extension

While the above method adds security it takes the user outside of their web browser, which could be disconcerting. We should be able to provide an improved user experience by using a web browser extension. So what is the best way for the extension to know when to do its thing?

One answer is whenever the user visits the server URL of their OP. Reading through the specification there are no other times when the user is required to visit that URL. So if the web browser extension can intercept GET and POST requests to a particular URL, it should be able to reliably detect when an authentication request is being initiated.

At this point, the extension can take over up to the point where it redirects the user back to the RP. It will need to communicate with the OP in some way to get the response signed, but we have the option of using some previously established back channel.

Moving the OP Client Side

Using the browser extension from the previous section as a starting point, we’ve moved some of the processing to the client side. We might now ask how much work can be moved to the client, and how much work needs to remain on the server?

From the specification, there are three points at which the RP needs to make a direct connection to the OP (or a related server):

  1. When performing discover, the RP needs to be able to read an HTML or XRDS file off some server.
  2. The associate request, used to generate an association that lets the RP verify authentication responses.
  3. The check_authentication request, used to verify a response in the case where an association was not provided in the request (or the OP said the association was invalid).

In all other cases, communication is mediated through the user’s browser (so are being intercepted by the browser extension). Furthermore, these three cases should only occur after the user initiates an OpenID authentication request. This means that the browser extension should be active and talking to the server.

So one option would be to radically simplify the server side so that it simply proxies the associate and check_authentication requests to the browser extension via a secure channel. This way pretty much the entire OP implementation resides in the browser extension with no state being handled by the server.

Conclusion

So it certainly looks like it is possible to migrate almost everything to the client side. That still leaves open the question of whether you’d actually want to do this, since it effectively makes your identity unavailable when away from a computer with the extension installed (a similar problem to use of self asserted infocards with Microsoft’s CardSpace).

Perhaps the intermediate form that still performs most of the OP processing on the server is more useful, providing a level of phishing resistance that would be difficult to fake (not only does it prevent rogue RPs from capturing credentials, the “proxied OP” attack will fail to activate the extension all together).