Archive for June, 2013

June 27th OpenStack Foundation Board Meeting

Saturday, June 29th, 2013

On June 27, the OpenStack Foundation Board of Directors met for two hours via conference call. As usual, the agenda was published in advance and the meeting was open to anyone who wanted to observe the proceedings.

These notes are my perspective of the meeting. Jonathan Bryce published an official summary and official minutes will be posted in due course.

Roll Call and Meeting Confidentiality Policy

As you can imagine, a conference call with 20-30 attendees takes a while to get going. We began with a roll call and, after perhaps 15 minutes, were ready to get started.

First item on the agenda was a review of our meeting confidentiality policy. Directors agree to refrain from commenting on the meeting proceedings until Jonathan posts his summary. The only official record of the meeting is Jonathan’s summary and the official minutes. Anything discussed during executive session is confidential. Nothing new here.

Amendment to our Certificate of Incorporation

Next up was a motion to approve a relatively minor amendment to the foundation’s certificate of incorporation.

The details of the amendment are fairly obscure, but essentially the Foundation is applying for U.S. 501(c) status which means it will be a tax-exempt, non-profit organization. There are various organization types, but the two most relevant are 501(c)(3) and 501(c)(6). OpenStack is filing for 501(c)(6) status.

Jonathan explained that, while preparing for this filing, it was noted that the original certificate of incorporation only allows (on winding up of the foundation) the foundation’s assets to be transferred to a 501(c)(3) organization. This amendment simply allows for the possibility of transferring assets to 501(c)(6) organizations.

There was some brief discussion clarifying exactly which status we were filing for and the motion was passed unanimously.

Transparency Policy

Next up, Lauren Sell explained the context for a formal transparency policy document which had been circulated to the board members and would require further discussion before being approved.

Lauren reminded us that at our meeting in April, the Transparency Working Group presented the principles for a transparency policy. Lauren had since worked with legal counsel to draft a more formal policy based on those original principles.

The main question Lauren felt needed to be cleared up was the issue of document management. The board has (and will continue to have) documents which must remain confidential. At our April meeting we had agreed that the OpenStack Infrastructure team would investigate hosting an OwnCloud instance which would act as our document store. While this was still on the team’s to-do list, it had not been prioritized over other work.

I suggested that, in the meantime, we create a new mailing list for the board (e.g. foundation-directors?) which would be open to everyone for read-only subscription and we would use the current mailing list to share confidential documents. Once the document management system is in place, we could then shut down the private foundation-board mailing list.

Rather than discuss this in any great detail, it was agreed that Lauren would start a discussion on the foundation mailing list.

Summits & Marketing

Next up, Mark Collier and Lauren gave us an update on the Marketing front and how summit planning is progressing.

Lauren first talked through some excellent slides with a tonne of details about the Icehouse Design Summit in Hong Kong from Nov 5-8, 2013. This is the first time that the summit will be held at an “international venue” (an amusing term if you’re not U.S. based :) and we again expect record attendance.

Included in Lauren’s slides were some really helpful maps and aerial shots showing the venue, the geography of Hong Kong and the location of the recommended hotels. The venue is located near the airport which is a 25 minute train journey from downtown Hong Kong. There are a couple of hotels adjacent to the venue and most of the other recommended hotels are downtown. The foundation staff have worked hard to come up with a good range of hotel options, including hotels with a rate of under $150 per night.

In terms of travel advice, it was noted that visitors must have a passport valid for at least one month after their planned stay and that flights from SFO to HKG are currently averaging between $1000 and $1400. Jonathan recommends that people book their flights early, because fares will increase very significantly closer to the event. Lauren also pointed out that it’s sensible to make hotel bookings now, since the hotels closest to the venue are already selling out.

Lauren then talked through the planned format for the summit, which has been heavily influenced by feedback received through the survey results from the previous summit.

This time around there will be two types of passes. A more affordable limited access pass will give access to the expo hall, general sessions and a single track of breakout sessions on Tuesday and Wednesday. The hope is that this will help control the numbers at the breakout sessions, but also make the event more accessible to folks who just want to come along for the first time and learn about OpenStack.

The primary language of the event will be English, but there will be simultaneous translation into Mandarin in the main hall.

The call for sponsors is already open and we have 21 sponsors to date. The headline sponsorship sold out in an astonishing 7 minutes.

For the first time, there will be a travel support program designed to ensure that lack of funding won’t prevent key technical contributors from attending the event. Details of this will be announced very soon. We had a brief discussion about how this program should be run and it was pointed out that we could learn from similar programs for PyCon and UDS.

In terms of learnings from the previous summit, some of the things the team will be working hard to improve are the quality of network connectivity, the size of the breakout rooms and the variety of beverages and snacks.

It was noted that feedback from ATCs who completed the survey was 2.5:1 in favour of keeping the design summit collocated with the conference. In Hong Kong, the design summit rooms will be well separated from the breakout session rooms, ATC status will be properly indicated on name badges and it will be much clearer on the schedule which sessions are part of the design summit and which are breakout sessions.

After some interesting discussion about the plans for Hong Kong, Lauren gave a brief overview of how plans are proceeding for the 2014 summits. The spring summit is planned for the week of May 14 with Atlanta and Chicago under consideration. The autumn/fall summit will be one of the first two weeks of November with Paris and Berlin currently under consideration. Decisions on the venues for both these summits are expected to be made soon.

Finally, Lauren ran through some updates on the progress of the marketing community more generally. Version 1 of the OpenStack marketing portal has been made available. The mailing list is gaining new subscribers all the time. The monthly meetings are also seeing growing numbers attending.

Patent Cooperation Proposal

Next on the agenda was a presentation from several members of the Legal Affairs Committee on the three options they recommend the foundation should consider for increased cooperation on patent sharing or cross-licensing between foundation members.

Frankly, I don’t really have the energy to try and summarise these proposals in great detail, so this is short … however, this is certainly a complex and important topic.

The Apache License v2.0 has a patent provision which means you grant a license to any of your patents which are infringed upon by any contributions you make. If any licensee of the software claims that the software infringes on their patent, then they lose any patent rights granted to them under the license.

Two options were presented to the board for how we might encourage further sharing of patents related to OpenStack between the companies involved. The idea is that we could put OpenStack in a better defensive position by sharing a wider range of applicable patents.

The first option proposed was to closely copy Google’s Open Patent Non-Assertion Pledge. The idea is that companies involved in OpenStack would pledge to not assert specific sets of relevant patents against OpenStack users.

The alternative option proposed was the adoption of an OIN-style patent cross-licensing scheme. The primary difference of this scheme is that an actual patent license is granted to users, rather than just a non-assertion pledge.

The slides outlining these options will be posted to the foundation wiki page. It is hoped the board will be in a position to come to a decision on this in November.

Closing Topics

Alice King is stepping down from her role on the Legal Affairs Committee, so the board voted to approve a motion to appoint Van Lindberg to the committee.

Rob Hirschfeld gave an update on his work to bring about a productive discussion on the question of “what is core?”. Rob has held a couple of meetings with other board members and drafted six position statements which he hopes will drive the discussion towards a consensus-based decision. Rob wishes the board to come to a good level of consensus on these position statements before opening the discussion up to the rest of the community.

Finally, there was a brief discussion about training and how members of the User Committee are actively working on training materials.

May 30th OpenStack Foundation Board Meeting

Wednesday, June 5th, 2013

Last Thursday, on May 30, the OpenStack Foundation Board met over the phone for two hours to discuss a number of topics. The date for the meeting was set well in advance, the call was open to anyone who cared to listen in and the agenda was posted in advance to our wiki page.

Below is my summary of our discussions. The official summary was posted earlier by Jonathan Bryce.

For this meeting, Lew Tucker acted as chairman in place of Alan Clark.

Training

The first big topic on the agenda was an update from Mark Collier on the Foundation’s work-in-progress plans for official OpenStack training and certification programs.

The idea is that, with the huge interest globally in OpenStack, we’re hearing consistently that there is a shortage of people with OpenStack expertise and training. The Foundation would like to address this in a way that adds new OpenStack experts, grows our community and establishes a base knowledge set that everyone is united around.

The OpenStack ecosystem is already ramping up its training offerings and classes are available today from a number of companies all over the world. The Foundation wants to encourage this and help accelerate the availability of training.

Crucially, the Foundation proposes to introduce a new trademark program which would require all OpenStack training and credentialling providers to include a base set of training material and tests which would be developed by the Foundation itself. The hope is that we would protect the OpenStack brand by ensuring that all official training courses would have the same basic content and quality levels.

This proposal of a new trademark program triggered a significant amount of debate which continued on over email after the call. On the call, Jim Curry and Boris Renski kicked off the discussion by expressing similar concerns about whether the trademark program would actually hinder the growth of training offerings in the ecosystem. Jonathan clarified that the intent isn’t to prevent training programs from competing with additional content, but rather to ensure all programs have a common baseline. Others like Nick Barcet and Todd Moore chimed in with the view that OpenStack really needs this and Nick drew an analogy with Linux Professional Institute certification.

After much discussion, the conclusion was that the topic needed further discussion before any concrete steps could be taken. Mark Collier closed out the topic by making the point that the “Built for OpenStack” trademark was currently being used for OpenStack training and this would continue until an alternative plan was put in place.

Expect to see plenty more discussion about this soon on the foundation mailing list!

Next Board Meeting

We had some discussion about when our next meeting should be held and, based on the availability of board members, it looks like it will be held on Thursday, June 27 between 9am and 11am Pacific time.

Gold Member Committee

Next up, Simon Anderson discussed a new committee that we agreed to set up at our previous meeting – the Prospective Gold Member Committee.

The idea with this committee is that when companies approach the Foundation and express an interest in becoming a Gold Member, the Committee will work with the Foundation staff and the prospective member to ensure their application is properly prepared before it comes before the board.

The committee will essentially act as a mentor for prospective new members and help them understand what is expected of Gold Members. The hope is that this will result in applicants being better prepared than we’ve seen previously.

One concern expressed by Lauren Sell and others is that the committee shouldn’t become a vetting committee. The committee doesn’t have the mandate to turn away unsuitable candidates. If a candidate chooses to ignore the committee’s advice and mentorship, they should still be able to have their application heard by the board even if this ultimately means the application is likely to be received unfavourably by the board. This point was accepted and everyone was in agreement on the mandate for the committee.

The members of the committee will be Simon Anderson, Devin Carlen, Rob Hirschfeld, Joseph George, Sean Roberts and Nick Barcet.

Update from the Executive Director

Next, Jonathan gave the board a quick update.

Jonathan talked about his attending a number of OpenStack events internationally recently and how there is still tremendous opportunity to grow the OpenStack community and engage new people. He also talked about how he met with a number of significant OpenStack users and hopes to be able to use their stories to illustrate the truly global nature of the OpenStack user community.

He also talked about the success of the OpenStack Summit in Portland and how we had 2600 attendees compared to the 1300 attendees six months previously. Feedback on the summit has been overwhelmingly positive with the most common negative comments related to the Wi-Fi network and how some of the breakout sessions were massively over-subscribed. The Foundation will continue to invest significantly in networking infrastructure at the conference and there is work underway to restructure some of the room layouts at the upcoming Summit in Hong Kong based on the feedback from Portland.

Jonathan also talked about how our financials are in good shape. We had a surplus from the Portland Summit which will allow us to make the Hong Kong Summit a kick-ass event. The Foundation is also well advanced in completing its first audit and we expect to see fully audited financials for 2012 published in August. Apparently this will be a good milestone in our progress towards non-profit status.

Role of Core

After Jonathan’s update, board members had an opportunity to briefly raise topics of interest.

First of those was Rob Hirschfeld who wants to bring together a group of board members to discuss what “Core” project status means and should mean in future.

Rob talked about how he feels that the issue of whether core projects need a plugin architecture will be the key to unlocking the discussion and making progress.

Working With Standards Bodies

Randy Bias mentioned that he had been approached by someone from the IEEE who talked about the possibility of the IEEE working together with OpenStack on interoperability and standardization issues. Randy was mostly just passing on the message but also made the point that we can probably use the experience of other bodies to help us ensure interoperability between OpenStack clouds.

Joshua McKenty quickly took a firm and contrary view to Randy’s – that standardization bodies are typically pretty ineffective and would actually slow down our progress on OpenStack interoperability.

The discussion concluded with general agreement that individuals are welcome to talk to and learn from whoever they wish as part of their efforts to help make progress on OpenStack interoperability. Josh also agreed to provide the board with an update on his work on refstack.

Meeting Summaries

The final brief topic was a joy indeed. As had already been discussed on the mailing list, Josh felt that my previous meeting summary breached a policy agreed by the board (before my joining) that directors would make no public comment on board meetings until after Jonathan had published an official summary of the meeting. Josh also felt that my reference to the agenda of the executive session had breached the confidentiality of the session.

Jonathan and I repeated the point that Jonathan had not gotten around to doing an official summary in a reasonable time and had explicitly given me the go-ahead to post my summary. In some follow-up emails, I also made the point that – in this case – we actually made no effort to keep the agenda of the executive session private during the public part of the meeting so it was completely appropriate to mention it in my summary.

Async I/O and Python

Tuesday, June 4th, 2013

When you’re working on OpenStack, you’ll probably hear a lot of references to ‘async I/O’ and how eventlet is the library we use for this in OpenStack.

But, well … what exactly is this mysterious ‘asynchronous I/O’ thing?

The first thing to think about is what happens when a process calls a system call like write(). If there’s room in the write buffer, then the data gets copied into kernel space and the system call returns immediately.

But if there isn’t room in the write buffer, what happens then? The default behaviour is that the kernel will put the process to sleep until there is room available. In the case of sockets and pipes, space in the buffer usually becomes available when the other side reads the data you’ve sent.

The trouble with this is that we usually would prefer the process to be doing something useful while waiting for space to become available, rather than just sleeping. Maybe this is an API server and there are new connections waiting to be accepted. How can we process those new connections rather than sleeping?

One answer is to use multiple threads or processes – maybe it doesn’t matter if a single thread or process is blocked on some I/O if you have lots of other threads or processes doing work in parallel.

But, actually, the most common answer is to use non-blocking I/O operations. The idea is that rather than having the kernel put the process to sleep when no space is available in the write buffer, the kernel should just return a “try again later” error. We then use the select() system call to find out when space has become available and the file is writable again.
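The “try again later” behaviour can be seen without any network at all. The following is a minimal Python 3 sketch (the examples in this post are otherwise Python 2), assuming a local socketpair() is an acceptable stand-in for a client and a slow server; the variable names are only illustrative:

```python
import errno
import select
import socket

# A connected pair of local sockets stands in for a client and a slow server.
a, b = socket.socketpair()
a.setblocking(False)

# Keep writing until the kernel buffer is full: send() then fails with
# EAGAIN (or EWOULDBLOCK) instead of putting the process to sleep.
sent = 0
try:
    while True:
        sent += a.send(b'x' * 65536)
except BlockingIOError as e:
    would_block = e.errno in (errno.EAGAIN, errno.EWOULDBLOCK)

# With a zero timeout, select() simply reports the current state.
_, writable_before, _ = select.select([], [a], [], 0)

# Once the other side has read the data, space becomes available again.
remaining = sent
while remaining:
    remaining -= len(b.recv(min(remaining, 65536)))
_, writable_after, _ = select.select([], [a], [], 1)

print(would_block, bool(writable_before), bool(writable_after))
a.close()
b.close()
```

The amount of data accepted before the error depends on the platform's socket buffer sizes, so the value of `sent` will vary from machine to machine.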

Below are a number of examples of how to implement a non-blocking write. For each example, you can run a simple socket server on a remote machine to test against:

$> ssh -L 1234:localhost:1234 some.remote.host 'ncat -l 1234 | dd of=/dev/null'

The way this works is that the client connects to port 1234 on the local machine, the connection is forwarded over SSH to port 1234 on some.remote.host where ncat reads the input, writes the output over a pipe to dd which, in turn, writes the output to /dev/null. I use dd to give us some information about how much data was received when the connection closes. Using a distant some.remote.host will help illustrate the blocking behaviour because data clearly can’t be transferred as quickly as the client can copy it into the kernel.
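If no remote host is to hand, a deliberately slow local reader can approximate the same effect. This is only a sketch under assumptions that are not part of the setup above: `slow_sink` and `demo` are hypothetical names, the delay is arbitrary, and an ephemeral port is used rather than 1234. It is written for Python 3:

```python
import socket
import threading
import time

def slow_sink(listener, chunk=65536, delay=0.001):
    """Accept one connection, read it slowly and return the byte count."""
    conn, _addr = listener.accept()
    total = 0
    with conn:
        while True:
            data = conn.recv(chunk)
            if not data:
                return total          # the client closed the connection
            total += len(data)
            time.sleep(delay)         # pretend to be a distant, slow peer

def demo(nbytes=1024 * 1024):
    listener = socket.socket()
    listener.bind(('127.0.0.1', 0))   # ephemeral port, an assumption for the demo
    listener.listen(1)
    result = []
    reader = threading.Thread(target=lambda: result.append(slow_sink(listener)))
    reader.start()
    # A plain blocking client, much like the first example in the next section.
    with socket.create_connection(listener.getsockname()) as client:
        client.sendall(b'x' * nbytes)
    reader.join()
    listener.close()
    return result[0]

total = demo()
print(total, "bytes received")
```

Like dd in the pipeline above, the reader reports how much data arrived once the connection closes.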

Blocking I/O

To start with, let’s look at the example of using straightforward blocking I/O:

import socket

sock = socket.socket()
sock.connect(('localhost', 1234))
sock.send('foo\n' * 10 * 1024 * 1024)

This is really nice and straightforward, but the point is that this process will spend a tonne of time sleeping while the send() method completes transferring all of the data.

Non-Blocking I/O

In order to avoid this blocking behaviour, we can set the socket to non-blocking and use select() to find out when the socket is writable:

import errno
import select
import socket

sock = socket.socket()
sock.connect(('localhost', 1234))
sock.setblocking(0)

buf = buffer('foo\n' * 10 * 1024 * 1024)
print "starting"
while len(buf):
    try:
        buf = buf[sock.send(buf):]
    except socket.error, e:
        if e.errno != errno.EAGAIN:
            raise e
        print "blocking with", len(buf), "remaining"
        select.select([], [sock], [])
        print "unblocked"
print "finished"

As you can see, when send() returns an EAGAIN error, we call select() and will sleep until the socket is writable. This is a basic example of an event loop. It’s obviously a loop, but the “event” part refers to our waiting on the “socket is writable” event.

This example doesn’t look terribly useful because we’re still spending the same amount of time sleeping but we could in fact be doing something useful rather than sleeping in select(). For example, if we had a listening socket, we could also pass it to select() and select() would tell us when a new connection is available. That way we could easily alternate between handling new connections and writing data to our socket.
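As a rough illustration of that idea, here is a Python 3 sketch in which one select() call watches both a listening socket and a socket we want to write to. The setup is an assumption made for the demo: the listener uses an ephemeral local port, a socketpair() stands in for the outgoing connection, and the client connects from the same process:

```python
import select
import socket

listener = socket.socket()
listener.bind(('127.0.0.1', 0))
listener.listen(5)
out, peer = socket.socketpair()
out.setblocking(False)

# A new client turns up while we still have data to write.
client = socket.create_connection(listener.getsockname())

events = []
watch_read, watch_write = [listener], [out]
while watch_read or watch_write:
    # Wait for whichever happens first: a new connection or a writable socket.
    readable, writable, _ = select.select(watch_read, watch_write, [], 5)
    if not (readable or writable):
        break                           # timed out, nothing happened
    if listener in readable:
        conn, _addr = listener.accept() # handle the new connection
        conn.close()
        events.append('accepted')
        watch_read = []
    if out in writable:
        out.send(b'foo\n')              # carry on writing
        events.append('wrote')
        watch_write = []

for s in (listener, out, peer, client):
    s.close()
print(sorted(events))
```

A real server would keep both sockets registered and go around the loop indefinitely; each one is dropped after its first event here only so the demo terminates.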

To prove this “do something useful while we’re waiting” idea, how about we add a little busy loop to the I/O loop:

        if e.errno != errno.EAGAIN:
            raise e

        i = 0
        while i < 5000000:
            i += 1

        print "blocking with", len(buf), "remaining"
        select.select([], [sock], [], 0)
        print "unblocked"

The difference is we’ve passed a timeout of zero to select() – this means select() never actually blocks – and any time send() would have blocked, we do a bunch of computation in user-space. If we run this using the ‘time’ command you’ll see something like:

$> time python ./test-nonblocking-write.py 
starting
blocking with 8028160 remaining
unblocked
blocking with 5259264 remaining
unblocked
blocking with 4456448 remaining
unblocked
blocking with 3915776 remaining
unblocked
blocking with 3768320 remaining
unblocked
blocking with 3768320 remaining
unblocked
blocking with 3670016 remaining
unblocked
blocking with 3670016 remaining
...
real    0m10.901s
user    0m10.465s
sys     0m0.016s

The fact that there’s very little difference between the ‘real’ and ‘user’ times means we spent very little time sleeping. We can also see that sometimes we get to run the busy loop multiple times while waiting for the socket to become writable.

Eventlet

Ok, so how about eventlet? Presumably eventlet makes it a lot easier to implement non-blocking I/O than the above example? Here’s what it looks like with eventlet:

from eventlet.green import socket

sock = socket.socket()
sock.connect(('localhost', 1234))
sock.send('foo\n' * 10 * 1024 * 1024)

Yes, that does look very like the first example. What has happened here is that by creating the socket using eventlet.green.socket.socket() we have put the socket into non-blocking mode and when the write to the socket blocks, eventlet will schedule any other work that might be pending. Hitting Ctrl-C while this is running is actually pretty instructive:

$> python test-eventlet-write.py 
^CTraceback (most recent call last):
  File "test-eventlet-write.py", line 6, in <module>
    sock.send('foo\n' * 10 * 1024 * 1024)
  File ".../eventlet/greenio.py", line 289, in send
    timeout_exc=socket.timeout("timed out"))
  File ".../eventlet/hubs/__init__.py", line 121, in trampoline
    return hub.switch()
  File ".../eventlet/hubs/hub.py", line 187, in switch
    return self.greenlet.switch()
  File ".../eventlet/hubs/hub.py", line 236, in run
    self.wait(sleep_time)
  File ".../eventlet/hubs/poll.py", line 84, in wait
    presult = self.do_poll(seconds)
  File ".../eventlet/hubs/epolls.py", line 61, in do_poll
    return self.poll.poll(seconds)
KeyboardInterrupt

Yes, indeed, there’s a whole lot going on behind that innocuous looking send() call. You see mention of a ‘hub’ which is eventlet’s name for an event loop. You also see this trampoline() call which means “put the current code to sleep until the socket is writable”. And, there at the very end, we’re still sleeping in a call to poll() which is basically the same thing as select().

To show the example of doing some “useful” work rather than sleeping all the time we run a busy loop greenthread:

import eventlet
from eventlet.green import socket

def busy_loop():
    while True:
        i = 0
        while i < 5000000:
            i += 1
        print "yielding"
        eventlet.sleep()
eventlet.spawn(busy_loop)

sock = socket.socket()
sock.connect(('localhost', 1234))
sock.send('foo\n' * 10 * 1024 * 1024)

Now every time the socket isn’t writable, we switch to the busy_loop() greenthread and do some work. Greenthreads must cooperatively yield to one another so we call eventlet.sleep() in busy_loop() to once again poll the socket to see if it’s writable. Again, if we use the ‘time’ command to run this:

$> time python ./test-eventlet-write.py 
yielding
yielding
yielding
...
real    0m5.386s
user    0m5.081s
sys     0m0.088s

you can see we’re spending very little time sleeping.

(As an aside, I was going to take a look at gevent, but it doesn’t seem fundamentally different from eventlet. Am I wrong?)

Twisted

Long, long ago, in times of old, Nova switched from twisted to eventlet so it makes sense to take a quick look at twisted:

from twisted.internet import protocol
from twisted.internet import reactor

class Test(protocol.Protocol):
    def connectionMade(self):
        self.transport.write('foo\n' * 2 * 1024 * 1024)

class TestClientFactory(protocol.ClientFactory):
    def buildProtocol(self, addr):
        return Test()

reactor.connectTCP('localhost', 1234, TestClientFactory())
reactor.run()

What complicates the example most is twisted’s protocol abstraction which we need to use simply to write to the socket. The ‘reactor’ abstraction is simply twisted’s name for an event loop. So, we create a non-blocking socket, block in the event loop (using e.g. select()) until the connection completes and then write to the socket. The transport.write() call will actually queue a writer in the reactor, return immediately and whenever the socket is writable, the writer will continue its work.
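The “queue a writer and return immediately” behaviour can be sketched in a few lines of plain Python 3. This is not twisted’s implementation: `QueuedTransport` and `on_writable` are hypothetical names, and a socketpair() plus a hand-rolled select() loop stand in for the connection and the reactor:

```python
import select
import socket

class QueuedTransport:
    """Hypothetical stand-in for a transport: write() only queues data."""

    def __init__(self, sock):
        sock.setblocking(False)
        self.sock = sock
        self.pending = bytearray()

    def write(self, data):
        self.pending += data            # returns immediately, nothing sent yet

    def on_writable(self):
        # Called by the event loop whenever the socket is reported writable.
        try:
            sent = self.sock.send(self.pending)
        except BlockingIOError:
            return
        del self.pending[:sent]

a, b = socket.socketpair()
transport = QueuedTransport(a)
transport.write(b'foo\n' * 64 * 1024)   # 256 KiB, all of it queued
queued = len(transport.pending)

received = 0
while transport.pending:
    _, writable, _ = select.select([], [a], [], 0)
    if writable:
        transport.on_writable()
    else:
        received += len(b.recv(65536))  # the peer reads, freeing buffer space

# Close the sending side and collect whatever is still in flight.
a.close()
while True:
    chunk = b.recv(65536)
    if not chunk:
        break
    received += len(chunk)
b.close()
print(queued, received)
```

The caller of write() never sees EAGAIN; the would-block case is absorbed by the queue and retried the next time the loop finds the socket writable.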

To show how you can run something in parallel, here’s how to run some code in a deferred callback:

def busy_loop():
    i = 0
    while i < 5000000:
        i += 1
    reactor.callLater(0, busy_loop)

reactor.connectTCP(...)
reactor.callLater(0, busy_loop)
reactor.run()

I’m using a timeout of zero here and it shows up a weakness in both twisted and eventlet – we want this busy_loop() code to only run when the socket isn’t writable. In other words, we want the task to have a lower priority than the writer task. In both twisted and eventlet, the timed tasks are run before the I/O tasks and there is no way to add a task which is only run if there are no runnable I/O tasks.

GLib

My introduction to async I/O was back when I was working on GNOME (beginning with GNOME’s CORBA ORB, called ORBit) so I can’t help comparing the above abstractions to GLib’s main loop. Here’s some equivalent code:

/* build with gcc -g -O0 -Wall $(pkg-config --libs --cflags glib-2.0) test-glib-write.c -o test-glib-write */

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

#include <glib.h>

GMainLoop    *main_loop = NULL;
static gchar *strv[10 * 1024 * 1024];
static gchar *data = NULL;
int           remaining = -1;

static gboolean
socket_writable(GIOChannel   *source,
                GIOCondition  condition,
                gpointer      user_data)
{
  int fd, sent;

  fd = g_io_channel_unix_get_fd(source);
  do
    {
      sent = write(fd, data, remaining);
      if (sent == -1)
        {
          if (errno != EAGAIN)
            {
              fprintf(stderr, "Write error: %s\n", strerror(errno));
              goto finished;
            }
          return TRUE;
        }

      data = &data[sent];
      remaining -= sent;
    }
  while (sent > 0 && remaining > 0);

  if (remaining <= 0)
    goto finished;

  return TRUE;

 finished:
  g_main_loop_quit(main_loop);
  return FALSE;
}

static gboolean
busy_loop(gpointer data)
{
  int i = 0;
  while (i < 5000000)
    i += 1;
  return TRUE;
}

int
main(int argc, char **argv)
{
  GIOChannel         *io_channel;
  guint               io_watch;
  int                 fd;
  struct sockaddr_in  addr;
  int                 i;
  gchar              *to_free;

  for (i = 0; i < G_N_ELEMENTS(strv)-1; i++)
    strv[i] = "foo\n";
  strv[G_N_ELEMENTS(strv)-1] = NULL;

  data = to_free = g_strjoinv(NULL, strv);
  remaining = strlen(data);

  fd = socket(AF_INET, SOCK_STREAM, 0);

  memset(&addr, 0, sizeof(struct sockaddr_in));
  addr.sin_family      = AF_INET;
  addr.sin_port        = htons(1234);
  addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

  if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) == -1)
    {
      fprintf(stderr, "Error connecting to server: %s\n", strerror(errno));
      return 1;
    }

  fcntl(fd, F_SETFL, O_NONBLOCK);

  io_channel = g_io_channel_unix_new(fd);
  io_watch = g_io_add_watch(io_channel,
                            G_IO_OUT,
                            (GIOFunc)socket_writable,
                            GINT_TO_POINTER(fd));

  g_idle_add(busy_loop, NULL);

  main_loop = g_main_loop_new(NULL, FALSE);

  g_main_loop_run(main_loop);
  g_main_loop_unref(main_loop);

  g_source_remove(io_watch);
  g_io_channel_unref(io_channel);

  close(fd);

  g_free(to_free);

  return 0;
}

Here I create a non-blocking socket, set up an ‘I/O watch’ to tell me when the socket is writable and, when it is, I keep blasting data into the socket until I get an EAGAIN. This is the point at which write() would block if it was a blocking socket and I return TRUE from the callback to say “call me again when the socket is writable”. Only when I’ve finished writing all of the data do I return FALSE and quit the main loop causing the g_main_loop_run() call to return.

The point about task priorities is illustrated nicely here. GLib does have the concept of priorities and has an “idle callback” facility you can use to run some code when no higher priority task is waiting to run. In this case, the busy_loop() function will *only* run when the socket is not writable.
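To make the priority idea concrete outside of GLib, here is a toy Python 3 event loop in which the idle callback runs only when select() reports that no registered socket is writable. It is a sketch of the concept rather than of how GLib works internally: `run_loop`, `do_write` and `idle` are hypothetical names, and the idle callback doubles as the peer draining a local socketpair() so the demo can finish in a single process:

```python
import select
import socket

def run_loop(writers, idle):
    # writers maps socket -> callback; a callback returns False once it is done.
    while writers:
        _, ready, _ = select.select([], list(writers), [], 0)
        if ready:
            for sock in ready:          # I/O callbacks always go first
                if not writers[sock]():
                    del writers[sock]
        else:
            idle()                      # lowest priority: no socket is writable

a, b = socket.socketpair()
a.setblocking(False)
buf = memoryview(b'foo\n' * 256 * 1024)  # 1 MiB
received = 0
idle_runs = 0

def do_write():
    global buf
    try:
        while len(buf):
            buf = buf[a.send(buf):]
    except BlockingIOError:
        pass                            # buffer full, wait for the next event
    return len(buf) > 0

def idle():
    # Stands in for "useful work"; here it reads from the peer so that
    # the writer can eventually make progress.
    global received, idle_runs
    idle_runs += 1
    received += len(b.recv(65536))

run_loop({a: do_write}, idle)

# Everything has been written; collect what is still sitting in the buffer.
a.close()
while True:
    chunk = b.recv(65536)
    if not chunk:
        break
    received += len(chunk)
b.close()
print(idle_runs, received)
```

The number of idle runs depends on the platform's socket buffer sizes, so only the byte count is stable from run to run.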

Tulip

There’s a lot of talk lately about Guido’s Asynchronous IO Support Rebooted (PEP3156) efforts so, of course, we’ve got to have a look at that.

One interesting aspect of this effort is that it aims to support both the coroutine and callbacks style programming models. We’ll try out both models below.

Tulip, of course, has an event loop, time-based callbacks, I/O callbacks and I/O helper functions. We can build a simple variant of our non-blocking I/O example above using tulip’s event loop and I/O callback:

import errno
import select
import socket

import tulip

sock = socket.socket()
sock.connect(('localhost', 1234))
sock.setblocking(0)

buf = memoryview(str.encode('foo\n' * 2 * 1024 * 1024))
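def do_write():
    global buf
    while len(buf):
        try:
            buf = buf[sock.send(buf):]
        except socket.error as e:
            if e.errno != errno.EAGAIN:
                raise e
            return
    # All data has been sent: stop watching the socket and leave the
    # event loop (otherwise send() on an empty buffer would spin forever)
    event_loop.remove_writer(sock)
    event_loop.stop()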

def busy_loop():
    i = 0
    while i < 5000000:
        i += 1
    event_loop.call_soon(busy_loop)

event_loop = tulip.get_event_loop()
event_loop.add_writer(sock, do_write)
event_loop.call_soon(busy_loop)
event_loop.run_forever()

We can go a step further and use tulip’s Protocol abstraction and connection helper:

import tulip

class Protocol(tulip.Protocol):

    buf = b'foo\n' * 10 * 1024 * 1024

    def connection_made(self, transport):
        event_loop.call_soon(busy_loop)
        transport.write(self.buf)
        transport.close()

    def connection_lost(self, exc):
        event_loop.stop()
 
def busy_loop():
    i = 0
    while i < 5000000:
        i += 1
    event_loop.call_soon(busy_loop)

event_loop = tulip.get_event_loop()
tulip.Task(event_loop.create_connection(Protocol, 'localhost', 1234))
event_loop.run_forever()

This is pretty similar to the Twisted example and is yet another case where the lack of task prioritization is an issue. If we added the busy loop to the event loop before the connection completed, the scheduler would run the busy loop every time the connection task yields.

Coroutines, Generators and Subgenerators

Under the hood, tulip depends heavily on generators to implement coroutines. It’s worth digging into that concept a bit to understand what’s going on.

Firstly, remind yourself how a generator works:

def gen():
    i = 0
    while i < 2:
        print(i)
        yield
        i += 1

i = gen()
print("yo!")
next(i)
print("hello!")
next(i)
print("bye!")
try:
    next(i)
except StopIteration:
    print("stopped")

This will print:

yo!
0
hello!
1
bye!
stopped

Now imagine a generator function which writes to a non-blocking socket and calls yield every time the write would block. You have the beginnings of coroutine-based async I/O. To flesh out the idea, here’s our familiar example with some generator-based infrastructure around it:

import collections
import errno
import select
import socket

sock = socket.socket()
sock.connect(('localhost', 1234))
sock.setblocking(0)

def busy_loop():
    while True:
        i = 0
        while i < 5000000:
            i += 1
        yield

def write():
    buf = memoryview(b'foo\n' * 2 * 1024 * 1024)
    while len(buf):
        try:
            buf = buf[sock.send(buf):]
        except socket.error as e:
            if e.errno != errno.EAGAIN:
                raise e
            yield
    quit()

Task = collections.namedtuple('Task', ['generator', 'wfd', 'idle'])

tasks = [
    Task(busy_loop(), wfd=None, idle=True),
    Task(write(), wfd=sock, idle=False)
]

running = True

def quit():
    global running
    running = False

while running:
    finished = []
    for n, t in enumerate(tasks):
        try:
            next(t.generator)
        except StopIteration:
            finished.append(n)
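    # map() is lazy in Python 3, so pop explicitly; go in reverse so
    # earlier removals don't shift the later indices
    for n in reversed(finished):
        tasks.pop(n)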

    wfds = [t.wfd for t in tasks if t.wfd]
    timeout = 0 if [t for t in tasks if t.idle] else None

    select.select([], wfds, [], timeout)

You can see how the generator-based write() and busy_loop() coroutines cooperatively yield to one another, just as greenthreads in eventlet would. But there’s a pretty fundamental flaw here: if we wanted to refactor the code above to re-use that write() function, e.g. to call it multiple times with different input, we’d need to do something like:

def write_stuff():
    for i in write(b'foo' * 10 * 1024 * 1024):
        yield
    for i in write(b'bar' * 10 * 1024 * 1024):
        yield

but that’s pretty darn nasty! Well, fixing that is the whole idea behind Syntax for Delegating to a Subgenerator (PEP380). Since Python 3.3, a generator can delegate to another generator using the ‘yield from’ syntax. This allows us to do:

...
def write(data):
    buf = memoryview(data)
    while len(buf):
        try:
            buf = buf[sock.send(buf):]
        except socket.error as e:
            if e.errno != errno.EAGAIN:
                raise e
            yield

def write_stuff():
    yield from write(b'foo\n' * 2 * 1024 * 1024)
    yield from write(b'bar\n' * 2 * 1024 * 1024)
    quit()

Task = collections.namedtuple('Task', ['generator', 'wfd', 'idle'])

tasks = [
    Task(busy_loop(), wfd=None, idle=True),
    Task(write_stuff(), wfd=sock, idle=False)
]
...
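If you want to convince yourself that the two spellings behave the same without setting up a socket, here is a small self-contained sketch. The chunked write() below is a made-up stand-in that just records what it would have sent and yields after every chunk, as if each write had hit EAGAIN. The chunk_size and log parameters are my own additions for the demo.

```python
def write(data, chunk_size, log):
    """Pretend to write data in chunks, yielding where a real write might block."""
    buf = memoryview(data)
    while len(buf):
        log.append(bytes(buf[:chunk_size]))  # stand-in for sock.send()
        buf = buf[chunk_size:]
        yield  # stand-in for "socket not writable, let someone else run"


def write_stuff_loop(log):
    # The nasty version: re-yield by hand for every inner yield
    for _ in write(b'foofoo', 3, log):
        yield
    for _ in write(b'bar', 3, log):
        yield


def write_stuff_delegating(log):
    # The PEP 380 version: delegate to the subgenerator
    yield from write(b'foofoo', 3, log)
    yield from write(b'bar', 3, log)


def run(generator):
    """Drive a generator to completion and count how often it yielded."""
    return sum(1 for _ in generator)


loop_log, delegating_log = [], []
loop_yields = run(write_stuff_loop(loop_log))
delegating_yields = run(write_stuff_delegating(delegating_log))

print(loop_yields, delegating_yields)
print(loop_log == delegating_log)
```

Both versions yield the same number of times and "send" the same chunks in the same order. What the hand-written loop does not give you is the rest of PEP 380: ‘yield from’ also passes sent values and thrown exceptions through to the subgenerator and hands back its return value.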

Conclusions?

Yeah, this is the point where I’ve figured out what we should do in OpenStack. Or not.

I really like the explicit nature of Tulip’s model: for each async task, you explicitly decide whether to block the current coroutine on its completion (or, put another way, yield to another coroutine until the task has completed) or to register a callback to be notified of the task’s completion. I’d much prefer this to the rather cavalier “don’t worry your little head” approach of hiding the async nature of what’s going on.
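As a rough illustration of those two choices, here is a sketch written against asyncio, the standard library module that tulip was later merged into, rather than the tulip API used above. In asyncio, ‘await’ plays the role that ‘yield from’ plays in the examples in this post. The slow_task() coroutine and the on_done() callback are hypothetical names for the demo.

```python
import asyncio


async def slow_task(value):
    """Stand-in for some async operation, e.g. waiting on an RPC reply."""
    await asyncio.sleep(0.01)
    return value * 2


async def main():
    results = []
    loop = asyncio.get_running_loop()

    # Choice 1: block this coroutine until the task completes; other
    # coroutines are free to run in the meantime.
    results.append(("awaited", await slow_task(1)))

    # Choice 2: start the task, register a callback, and carry on.
    done = loop.create_future()

    def on_done(task):
        results.append(("callback", task.result()))
        done.set_result(None)

    task = asyncio.ensure_future(slow_task(2))
    task.add_done_callback(on_done)
    results.append(("scheduled", None))  # runs before the callback fires

    await done  # only so the demo doesn't exit before the callback runs
    return results


results = asyncio.run(main())
print(results)
```

Either way, the point at which control can pass to something else is visible in the code, which is exactly what I like about the model.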

However, the prospect of porting something like Nova to this model is more than a little daunting. Think about the call stack of a REST API request being handled and ultimately doing an rpc.cast(): the entire call stack would need to be ported to ‘yield from’ in order for us to yield and handle another API request while waiting for the result of rpc.cast(). As I said, daunting.

What I’m most interested in is how to design our new messaging API to be able to support any and all of these models in future. I haven’t quite figured that out either, but it feels pretty doable.