IRC Whiteboard

August 24, 2004

Let the record show…

I’d just like to state, for the record, that Owen Taylor has sullied his
fancy-pants GTK engineering self. Not content merely to perpetuate and even initiate nasty hacks on python internals,
his lust for evil not sated by working on an IRC bot. No! Owen had to go
and work on an X-Chat plugin. Is this really a man you’d trust your widgets
with?

Whiteboard!

So a number of us (owen, colin, jrb, bryan, blizzard, j5 and myself) hacked this weekend on an allegedly multiprotocol whiteboard that currently supports direct TCP connections and, most notably, IRC. Hopefully we’ll get jabber support and gossip integration too. There’s an X-Chat plugin for it. There’s also a plugin for SupyBot for keeping a whiteboard with persistent state sitting on a channel. It doesn’t look pretty atm, but it’s a pretty good technology foundation.

Despite my constant bitching and moaning about having to implement the whiteboard protocol in the model, it’s actually pretty cool. Clients broadcast actions to create/delete generic objects or modify their properties. Currently we only support text and stroke objects, but it should be pretty easy to add others to the system now that the base infrastructure is in place. The protocol looks something like this:

  WHITEBOARD [channelname] 0+ <create ><text requestId="[uuid]" x="0" y="0", text="Hello"/>

If you’re using the SupyBot, it serves as the authoritative “master client” and either echoes back actions it accepts or rejects them. The client-side model (this is the part I’m obsessed with because it’s where I spent most of my time) journals actions it initiates, and can snoop the channel when other clients broadcast (so you don’t have to wait for the server to echo, which reduces latency, and that matters w/ IRC rate limiting) but only commits the changes as authoritative when the master client confirms them (otherwise they are rolled back).
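Here’s a minimal sketch of that journal/confirm/rollback idea. This is not the code from CVS: the class name `WhiteboardModel`, the method names, and the `(request_id, object_id, props)` tuple layout are all made up for illustration. The only things taken from the description above are that actions are applied optimistically, committed when the master confirms them, and rolled back otherwise.

```python
class WhiteboardModel:
    """Optimistic client-side model (hypothetical names throughout)."""

    def __init__(self):
        self.committed = {}  # object_id -> properties confirmed by the master
        self.pending = []    # journal of (request_id, object_id, props), in order

    def apply(self, request_id, object_id, props):
        """Journal an action we initiated or snooped from the channel."""
        self.pending.append((request_id, object_id, dict(props)))

    def view(self):
        """What the user sees: committed state plus every pending action."""
        state = {oid: dict(p) for oid, p in self.committed.items()}
        for _, oid, props in self.pending:
            state.setdefault(oid, {}).update(props)
        return state

    def confirm(self, request_id):
        """Master accepted the action: fold it into the committed state."""
        for entry in self.pending:
            if entry[0] == request_id:
                _, oid, props = entry
                self.committed.setdefault(oid, {}).update(props)
                self.pending.remove(entry)
                return True
        return False

    def reject(self, request_id):
        """Master rejected the action: dropping it rolls the view back."""
        before = len(self.pending)
        self.pending = [e for e in self.pending if e[0] != request_id]
        return len(self.pending) != before
```

Because the visible state is always recomputed as "committed plus pending", a rollback is just deleting a journal entry; nothing has to be undone by hand.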

I’m particularly proud to be able to say that I’m doing transaction stream compression by smooshing sequential modifications together before committing them to the journal. “Look mommy, I’m Hans Reiser!”. OK, so it’s not really that hard, but it sounds 31337. Humor me, ok?
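To show how un-31337 the smooshing really is, here is a sketch under assumed names: the function `compress` and the `(kind, object_id, props)` action tuples are hypothetical, not the real whiteboard data structures. It merges runs of consecutive modifications to the same object into one journal entry, with later property values winning.

```python
def compress(actions):
    """Merge runs of consecutive 'modify' actions on the same object.

    Each action is a (kind, object_id, props) tuple. Later property values
    win, which matches what replaying the run one action at a time gives.
    """
    journal = []
    for kind, oid, props in actions:
        last = journal[-1] if journal else None
        if kind == "modify" and last and last[0] == "modify" and last[1] == oid:
            merged = dict(last[2])
            merged.update(props)
            journal[-1] = ("modify", oid, merged)
        else:
            journal.append((kind, oid, dict(props)))
    return journal
```

Only adjacent modifications are merged, so the relative order of actions on different objects is preserved.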

Code is in CVS module ‘whiteboard’. It’s all written in python with pygtk and shouldn’t need anything special to work except for Cairo and pycairo. Now that we’ve done a good pass at the base pieces I think the actual drawing bits will get some more love/features in the next few days. High on my list are: erasing (*cough*), variable line width, hand-drawn-shape smoothing, and a highlighter. I’ve also got most of the pieces done for adding graphics tablet support. All that should be pretty easy except maybe shape smoothing (just don’t know how hard the algorithms for doing this are).

The Unix Credo

July 21, 2004

The Unix Credo: We strive to never improve, hacks excluded.

Waking Up is Hard to Do

July 16, 2004

  • 2:00 AM: Set alarm for 10 am (physical switch)
  • 2:03 AM: Tape piece of paper over alarm with the text “Why are you ruining my life?”
  • 2:07 AM: Go to bed
  • 2:15 AM: Fall asleep
  • ??? AM: Alarm is switched off, and the piece of paper is retaped over the alarm by a mysterious force. Abducted by aliens? Gremlins? Cruel alter ego?
  • 12:17 PM: Wake up with no memory of the alarm being disabled. Paper is still taped over the alarm like the alarm was never turned off (?!?)
  • 12:18 PM: Perform thorough exam for signs of alien abduction: scars, incisions, chips in the back of my neck, probes in various orifices. Results, negative
  • 12:19 PM: Inspect apartment security fixtures. Deadbolt: in place. Physical chain slider thing: in place. Pole blocking sliding glass door: in place. Grill over fan vent in bathroom: in place. Gremlin trap: empty

Conclusion: I have a cruel alter ego who wakes up when the alarm goes off, disables it for who knows what reason, laughs mischievously, and then goes back to bed.

Solution: Tie myself up before going to bed.

Problem: How do I get out of bed when I’m back to my calm mild mannered normal self?

I get a lot of messages asking me to compare and contrast Storage,
WinFS, and sometimes Dashboard and Medusa. More recently, I’ve gotten a lot of
questions about Spotlight and Beagle. I’ve generally avoided commenting
(which usually means not answering the e-mail…) on these things both because
it’s impossible for me to do an unbiased comparison, and because the
goals seem to be quite different.

  • Medusa, Beagle & Spotlight are similar, though of course Spotlight is
    much more mature. I would call them metadata index systems.
  • Storage & WinFS are similar, though of course WinFS is much more mature.
    I would call them document stores.

Caveat: If indexing and search were the
primary goals, a document store would be a ridiculously overengineered
approach. The medusa/beagle/spotlight model is much more sane if
this is your only or primary goal. I’m not saying this to suggest
document stores are better or worse than metadata indexing systems,
only to point out that there’s an element of apple-orange
comparison at work here.

Metadata Index Systems

Medusa:

Medusa was originally written by Eazel, integrated tightly with Nautilus
1.0 and was slated for inclusion with the GNOME 1.4 release. It was
primarily written by Rebecca Schulman, but also had major contributions
from Maciej Stachowiak & some by myself. Medusa ran as root, which
worried some people (but of course, so does updatedb for slocate…),
but unfortunately had a major bug that caused it to be pulled from GNOME
1.4 at the last minute. Rebecca fixed the bug after the release, and
re-architected Medusa to run as a normal user. But unfortunately Eazel
collapsed before GNOME 2.0 and nobody promoted its inclusion. Curtis
Hovey & I ported it to GNOME 2.x platform later, and Curtis is currently
maintaining it and adding lots of new features / fixes. In particular he
seems to be working on a UI for it. Medusa allowed very fast searches
over large indexes. Indexes were built by scanning the disk every night
(like slocate, unlike Spotlight which does things better). It also
provided a search: URI scheme that allowed creation of dynamic “search
folders”. So you could have a “Spreadsheets” folder for example that
always contained any spreadsheets on your system. The biggest hurdle for
Medusa today is that the set of indexers is not very extensible, and so
it doesn’t know how to index very many different file types.

Spotlight:

Of course I haven’t looked at Spotlight’s code or used it, so what I
know about it is from what Apple has published and discussions with
friends at Apple. Spotlight appears to be a sophisticated well
implemented approach to building a metadata layer on top of an existing
file system. Changes to files appear to be noticed at the kernel layer,
and indexers are quickly run to update the metadata cache (with
information about filename, album name, size, file contents, keywords,
etc). I don’t know whether it is guaranteed that indexers will be run
before the data can be accessed, but it is supposed to happen very
quickly in any case so it appears instant to the user. Spotlight is the
work of (among others, there are probably more people I just don’t know)
Pavel Cisler (BeOS tracker & Eazel Nautilus) & Dominic Giampaolo (BeOS
BFS, which had a similar sophisticated metadata system). A lot of work
has also gone into Spotlight’s UI, for doing grouping, measuring
relevance, etc. It’s easy to underestimate how much work this is; in some
ways the “indexing” is the easy part. Spotlight appears to index a lot
more than just the filesystem, including things like calendar and mail,
but I don’t know the full extent of what it can do.

Beagle:

My knowledge of Beagle is based on playing with it and reading through
a fair bit of the code, but I could definitely be missing large aspects
because I haven’t talked with Jon. Beagle’s code appears to be fairly immature at the moment, but I would
expect it to grow. It uses a port of Apache Jakarta’s Lucene. Lucene
primarily provides a way to *store* indexed metadata and do fast
*searches* over lots of metadata (including full text, of course), but
it doesn’t provide the indexers for specific file types. In some sense,
Lucene is a specialized “database” for storing the results of indexers.
Currently Beagle has indexers for HTML, JPEG, MP3, OpenOffice.org (very
cool) and Text. Unlike Medusa (I have no idea about Spotlight for this)
Beagle is designed to index “byte streams” rather than files, so it can
index, e.g. “The current page you are looking at in Epiphany”. This
makes it very compatible w/ Dashboard, since Dashboard wants to index
any and all contextual data, not just things on the hard disk. At the
moment Beagle appears to contain only very simple UI, so it’s primarily a
document indexing system.

On the filesystem side, Beagle currently works
like Medusa and requires a “crawler” to update its metadata cache (say
nightly), vs. spotlight which updates instantly. Beagle also has
crawlers for Mail and IM logs. Beagle also includes a renderer system
for displaying the relevant metadata etc for different file type
results. AFAIK, Jon Trowbridge at Novell is the person mainly hacking on
Beagle atm, but I think the code was refactored out of Dashboard, and a
number of other contributors are listed.

Document Stores

Both WinFS & Storage are aimed at doing a lot more than document
indexing… in many ways document indexing is only a nice side effect of
their larger aims. Storage and (AFAICT) to a lesser extent WinFS both
intend to store the actual documents themselves inside the store. That
means that more than just metadata is inside the store. Both WinFS &
Storage provide a query system, though WinFS’ has developed a nice
object oriented language (which I think they compile to SQL) whereas
Storage currently uses straight SQL which is harder for other developers
to use.

Storage:

I know most about this so I’ll talk about it most of course 😉 Storage
is fairly immature, and the architecture has shifted a lot in the past
few months.

“storage-store” provides a DBus service that allows fetching objects
over the FreeDesktop DBus, getting their attributes, relating them to
each other, running queries
etc. “storage-store” uses postgresql to store the structured objects and
perform queries. Because objects are accessed “live” rather than as
“buffers”, changes are instantly propagated across the bus, so multiple
applications or users can work on the same document and instantly see
changes other people make.

I’m currently working on architecture to tie
storage-store into standard IM presence information so you will be able
to see buddy icons of other people and what part of the document they
are working on inside storage applications. I have a lot of user
experience goals for Storage (or more accurately, for applications and
desktop that use storage). You can find information about most of them
on my blog and at
the storage homepage. Though these goals are more
important to me than document indexing, and have a lot more
impact on Storage’s architecture as a result, I will focus on document
indexing in order to compare and contrast with the other systems.

libstorage-translators provides a framework for translators that can
take structured object data in the store (metadata and the actual data
itself) and translate it to and from byte streams (such as files). The
goal is not indexing files, but for providing a way to move files in and
out of the store. So for example, if your friend sent you a PDF file by
e-mail, you could drag that file into your local store and the
libstorage-translators will automatically decompose the information for
placing in the store (and of course extract lots of metadata like album
name, description, image width, etc etc in the process). Currently I
have only worked on the “importer” side of translators, not the
“exporter”, so they are effectively like indexers. There are currently
importers for: DocBook, HTML, any image format supported by gdk-pixbuf
(JPEG, PNG, BMP, GIF, and several more obscure formats), PDF, text, and
any format supported by gstreamer (MP3, OGG, AVI, MPEG2, etc). Importers
can also create thumbnails for the data for convenient display later.
Storage also includes a renderer system for displaying the relevant
metadata etc for different sorts of results to a query. A major drawback
is that I don’t have translators for common document formats like
Gnumeric or OO.o at the moment.

Queries can either be performed using an SQL-like format (slightly
higher level than SQL but not much, it gets translated to SQL) or using
natural language queries. A large chunk of storage code is currently in
its NL system which uses very sophisticated HPSG grammars and other
techniques to translate human language phrases into the SQL query
format.

A storage:/// VFS URI is provided which automatically invokes
translators when files are dragged into the store. That means you can,
e.g. open a nautilus window to storage:/// and drag files in to add them
to the store. It also provides query folders like Medusa. So for example
you can have a folder “spreadsheets” or “songs by John Lennon that don’t
have the word ‘love’ in them” that is live updated to contain objects
matching those criteria.
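A rough sketch of what a query folder boils down to: a saved query that gets re-run whenever the folder is listed. Everything here is an assumption for illustration. It uses Python’s built-in sqlite3 as a stand-in (storage-store uses postgresql), the `songs` table and the `QUERY_FOLDERS` / `list_folder` names are hypothetical, and “the word ‘love’” is checked against the song title only.

```python
import sqlite3

# In-memory stand-in for the store; the real schema is not shown in this post.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE songs (title TEXT, artist TEXT)")
conn.executemany("INSERT INTO songs VALUES (?, ?)", [
    ("Imagine", "John Lennon"),
    ("Love", "John Lennon"),
    ("Real Love", "John Lennon"),
    ("Yesterday", "The Beatles"),
])

# A query folder is just a name bound to a saved query and its parameters.
QUERY_FOLDERS = {
    "lennon-no-love": (
        "SELECT title FROM songs "
        "WHERE artist = ? AND lower(title) NOT LIKE ? ORDER BY title",
        ("John Lennon", "%love%"),
    ),
}

def list_folder(name):
    """Re-run the saved query so the listing reflects the current store."""
    sql, params = QUERY_FOLDERS[name]
    return [row[0] for row in conn.execute(sql, params)]

print(list_folder("lennon-no-love"))
conn.execute("INSERT INTO songs VALUES (?, ?)", ("Jealous Guy", "John Lennon"))
print(list_folder("lennon-no-love"))
```

The second listing picks up the newly added song without anyone touching the folder definition, which is the “live updated” part in miniature.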

WinFS:

I know the least about WinFS of any of the systems
discussed here. I need to read up on it more… but the last time I looked
at it heavily was more than a year ago when MS was still very elusive.
It looks like a lot of info is up on the web now, so what I’m saying
could be out of date. WinFS is backed by both NTFS & Microsoft’s SQL server.
It provides a very nice API for querying and working with objects.
Currently the set of object types it can use is fixed and predefined by
MS (but the list is long). In the future they will probably open this up
and allow anyone to define new object types. AFAICT, WinFS is currently
targeting primarily the storage of metadata, though it is tightly
coupled to the files themselves stored as byte streams in NTFS. It does
look like in the future they intend to more completely store things in
WinFS. WinFS provides a very cool set of hooks for performing actions in
response to changes in the store. WinFS uses this to provide indexing
services, but users can also define their own actions (e.g. you could
say, “whenever an e-mail from George is created, copy it into my “to
burn” directory”).

Storage Talk

June 29, 2004

Unfortunately my ankle was fractured pretty badly and it was important I have
surgery on Wednesday. Unfortunately this precluded my flying to Norway for GUADEC
on Saturday. I actually proposed that I fly to Norway on Saturday to my orthopaedic surgeon.
He gave me a look that was darker than oil at midnight, and went back to what he was doing
without saying anything. Some people tell me I should have interpreted this as a “sounds ok”.
However, he later said some things about our goal being to “reduce the chance of having
arthritis in the ankle for the rest of your life”. That scared me into behaving.

There’s a more formalish storage paper for the occasion here. But honestly, I think the speaking notes are
more informative for getting at the soul of the material. In my experience that’s often true
of talks vs accompanying papers. So I’m including my speaking notes here. I blame oxycontin for any incoherent bits. They’re a little random but I hope you press through
because some of the good stuff is near the middle/end ;-). Maybe I’ll do sketches on whiteboards for all the places I was going to do live sketches and take pictures, but for now the notes are all booooring woooords. Unfortunately in many cases the sketches are the meat of the thing, but I think you can get some idea what I’m talking about from the text. I’ve fleshed it out past the notes in some places where it was totally incomprehensible:

  • Storage is designed to support a
    more general user experience than just “find files more easily”.
    Storage isn’t a silver bullet, but it can serve as a toolkit for
    making new user experiences easier to extend across the desktop. In
    the process it helps dissolve the application/desktop boundary a
    little.

The Experience

  1. Intro:
    Related to many existing systems

    1. Wiki –
      anybody can edit or work with information. Information is not super
      formal to start with, but can become “formalized”. Unlike wiki,
      allow for rich in place editing and better tie in to the OS for
      noticing changes and tracking “change threads” (which are
      themselves communication often).

    2. Whiteboard –
      support quick informal live collaborations. Don’t force things into
      a particular “format” or medium but allow people to mix it up.
      Share a space with lots of presence information, etc. Also envision
      this working when people are in the same place.

    3. Groupware –
      handle objects that people need to deal with to get their job done.
      People, teams, projects, tasks, deadlines. These are more central
      to knowledge workers than even documents. Like groupware, track
      threads of communication, but don’t tie people down to text
      messages. Let them respond with people, projects, tasks, etc.
      Rather than “posting to lists” you just append items to a topic
      in the (or a) central store.

    4. Bugzilla –
      tasks, and schedules, process, status, owner, etc. Track more
      interesting metadata in a way that people can shape to their
      organization.

  2. Build “objects people care
    about”

    1. This is more about what gets
      built on top of Storage, but it’s a major part of the overall
      experience. The file manager (atop the filesystem) is about
      managing formal documents and folders to group documents in large
      concrete chunks. The <some name here> (atop storage) should
      focus on objects that fill people’s daily lives.

    2. People, Projects, Teams, Tasks,
      Messages, Topics, Discussions, Managers, Proposals, etc, etc, etc
      (and yes, Documents too) are objects people care about. Many others
      that are specific to particular industries and job roles. Some of
      these objects currently live in specialized applications like
      evolution, and most of these will still be handled primarily
      through a specialized interface. <sketch the two specialized
      interfaces>.

    3. It’s usually a good idea to have
      specialized tools for targeting specific use cases.

    4. OTOH, although we work on text
      documents mostly in the office suite, we still expose common
      operations to the base OS (the filemanager mostly in this case).
      How can we extend the set of useful things that can be done with
      information across the information boundary? In a less generic
      sense, can we build support for the objects people deal with on
      a day to day basis more deeply into the OS? It doesn’t have to
      be done by a universal component system, but base libraries like
      storage can make it easier to support the important “one off”
      optimizations in the base OS (such as for projects).

  3. Support informal work

    1. Most office applications are
      focused on producing deliverables: formal documents. But
      deliverables are the exception. Most knowledge workers spend most
      of their time processing, sharing, and extending information not
      producing deliverables. We want to build interfaces that allow for
      some degree of information soup. <sketch the process flow for
      organizing SubsByTheInch2005>

    2. Informal work can eventually turn
      into formal deliverables. Make this process as convenient as
      possible.

  4. Information is information, don’t
    force large chunks

    1. We currently have odd
      granularities of information. “Files” in the case of “formal
      documents” (but since we don’t have informal constructs, many
      things are pushed into this).

  5. Access items within large bodies
    of information

    1. The storage “research-y”
      solution to this is object reference using human language phrases

    2. This aspect of storage still
      interests me, and has been where most of the work has gone until
      now…. but it is more researchy because it is prone to
      being technically infeasible (jury is still out ;-). As such, other
      parts of storage are not predicated on it.

  6. Provide the components for
    collaboration

    1. If storage is the physics, social
      interaction is the chemistry. Storage needs to provide some very
      basic structures that will give rise (when people, environments,
      tasks, etc. are thrown into the mix) to social interactions. Rather
      than trying to control things rigidly, as traditional computer
      environments have done, we allow social behaviors to regulate
      things more (as things work normally outside computer world).

    2. Presence information is the
      substrate for coordinating social interactions. Who is where and
      doing what is the most relevant context for social interactions.

    3. Access by multiple
      threads/computers/people. Rather than “versioning” documents
      and the associated problems (e.g. merging is a nearly insoluble
      UI problem) we allow “live” (or at least effectively live) access
      to documents.

    4. Fine granularity. If we have
      access from multiple places, the temptation is to use locking of
      “documents”. Even inside formal documents, however, this will
      greatly limit collaborative ability. If we have rich fine grained
      presence information, combined with very fine grained data access,
      we can provide the ability to socially manage interactions rather
      than requiring “forced” lockouts.

  7. Track information flow

    1. E-mail showed the importance of
      threads of communication between people. An e-mail
      thread morphs into a task (like a bug), which morphs into a few
      more tasks (which might have discussions associated with them),
      which turns into a full fledged project with an associated team,
      which eventually produces a policy document. All this stays tied
      together. <show interface idea>

A Brief History (aka excuse):

  • Storage was initially
    implemented as project Gargamel by a team of Stanford CS (and one
    EE, and yours truly) students as a senior project. Brian Quistorf,
    James Farwell, Khalil Bey, Josh Radel. It gets to a nice demo-able
    point before they graduate.

  • It gets even more
    finished as I work on it after graduation while not looking for
    work. Web page is written, screenshots made, etc.

  • I foolishly decide to
    rewrite the NL parser (and lose the old CVS history when importing
    to cvs.gnome.org). I get sidetracked writing the NL parser.

  • Slashdot etc hit.
    Lots of developer interest, but I’m snowed for other reasons and
    don’t successfully get development moving with other people. Plus I
    still have to finish the NL rewrite before things will function
    again.

  • The summer is
    completely crazy, and I stop working on Storage for 8 months.

  • Today: NL rewrite is
    now done. It’s a much stronger foundation, but the semantic grammar
    is still small. However, even with the small grammar it can do very
    sophisticated (correct) interpretations of phrases like “songs
    that aren’t by ‘John Lennon’ but have the word ‘love’ in them”.
    This would be very difficult to parse with a traditional “naive”
    scavenging search interpretation. Marco is also contributing to
    Storage, as well as some other Epiphany dudes. Things are starting
    to pick up, and I’m determined to not kill storage by bottlenecking
    again. I’m looking for a “project manager”.

What’s there today:

Non-NL

  • storage-store
    – manages the postgresql server, handles notification

  • libstorage
    – GObject interface to store items

  • libstorage-translators
    – serializes / deserializes data streams from / to storage items

  • GnomeVFS module
    – automatically invokes translators on
    read/write into the store allowing existing GNOME apps to use the
    store like a normal filesystem

NL

  • PET
    – parses sentences into Head-Phrase Structure Grammar (HPSG)
    trees, by Dr. Ullrich Callmeier.

  • libmrs
    – interface to the Minimal Recursion ‘Semantics’ information in
    the HPSG tree

  • libmrs-converters
    – translates MRS into a more meaningful XML statement using a
    client chosen semantic grammar

  • libstorage-nl
    – translates using storage-specific semantic grammar into the
    intermediate XML form, and then to an SQL query

What’s in the near future:

  • Currently libstorage,
    the VFS module, and some translators directly access the postgresql
    server. This is undesirable: it means permissions on a shared store
    would have to be enforced using a collection of SQL views, it means
    locking becomes very tricky, and it means that libstorage and other
    things link directly against postgresql libraries (though this could
    be addressed by gnome-db).

  • Support for NL
    searches in select non-English languages (probably Spanish first,
    but perhaps Japanese). Storage is built on a “language neutral
    framework”, but grammar engineering is a very
    difficult task. Some of the availability of NL searches will depend
    on what the linguistics community produces and distributes freely.

  • A nifty collaborative application to provide a test bed for the
    collaboration/locking framework. <sketch collaborative
    whiteboard/wiki design> (also shows informal work) Ideas? 😉

<demo NL search interface>

<show NL slides and explain basic NL
process>

Adventures in Gimpin’

June 20, 2004

A sliver more than two weeks ago I “sprained” my left ankle playing barefoot soccer. The ankle felt like it was getting better, but as swelling receded my foot felt awful. I could step on it, but the first few steps hurt like crazy. Other steps hurt but I didn’t have to brace myself for them. I finally caved and decided to see a doctor.

Unfortunately, as predicted, this turned out to be a very frustrating affair. I know some people like lots of choice in Doctor, etc. But I sort of like the Kaiser-Permanente (HMO? in CA) model where they have their own big buildings with everything in it. You show up, and they’ll figure out what to do with you. Anyway, I called the Blue Cross Blue Shield of North Carolina (*sigh*, since RH is based in Raleigh) advice nurse line twice. They were both confident I should go to an urgent care facility, and were basically unwilling to believe there are no urgent care facilities here.

There are tons in Connecticut, there are tons in Rhode Island, there are tons in North Carolina, and there are tons in California. There are almost no urgent care facilities in New Hampshire or Massachusetts. Some puritans probably made a law against them a few hundred years ago. Or maybe it’s because human life is oh so critically important that even non-emergencies should go to the emergency room “just in case”. Or maybe it’s because the NE sucks in general. I dunno.

I finally decided to just drive to Rhode Island, because I really hate the thought of going to an emergency room for a non-emergency. Despite having to clutch with my hurt ankle/foot (which fortunately you don’t have to do much on an interstate… just stay in 5th), it was actually a very positive drive. I was feeling pretty blue, and driving into the sunset in the outdoors is really nice. So I drive across Massachusetts and show up at this small town urgent care center. They X-Ray my foot, and my ankle. Nothing seems wrong, which surprises them given how my foot looks. Anyway, they X-Rayed my leg and it turns out my fibula (the small bone of the lower leg) is pretty badly fractured. So they splinted the area, and told me to go to an orthopaedic surgeon.

Lovely. Very strange that the pain was in my foot. I’m still a little paranoid that there’s an occult fracture of the fifth metatarsal causing the foot pain. So the good side to all this is they gave me the X-Rays to take to the orthopaedic surgeon. I’ve been studying them and reading medical research papers from medline about what I see. I’m finding this very interesting. Ankle fractures (and sprains) turn out to be extremely varied. Looking at the damage from lots of different angles has also made it possible to reconstruct in more detail how I must have fallen.

Anyway, I don’t have a scanner but I (very appropriately) gimped an online X-Ray of a healthy ankle to be a fairly good replica of mine. I cheated a little because I made it look like my posterior projection of my left foot. The online image is, I believe, a front projection of a right foot. From other projections it looks like this may be a spiral fracture, but from this projection it looks mostly like an oblique fracture. I’m not really sure either way, they apparently often look very similar from non-axial projections. I also labelled some stuff to give bearings.

The yellow areas are the (from left to right) lateral and medial malleolus. That’s the bony bump on the left and right of your ankle. The pink area is the tibiofibular syndesmosis, which connects the fibula (smaller bone) and tibia (the larger weight bearing bone) together. Sprains are often a result of stretching this. Anyway, because the fracture is proximal to the tibiofibular syndesmosis, this is probably a supination with external rotation (Weber B). That means the injury probably occurred with the weight leaned on the outside edge of the foot, and then the foot was rotated. It is possible that it’s pronation with external rotation (a form of Weber C).

So the bad news is that most Weber C injuries require open reduction (reduction is placing the bones so they align for healing). That would mean cutting my poor ankle open, and possibly even using syndesmotic screws that would have to be removed some weeks later :-/ The other problem with open reduction, besides the fact I’d need surgery, is that studies of outcomes suggest that open reduction results in a far slower recovery and goes awry far more often. With any luck it’s a Weber B.

What I Did Today

June 10, 2004

It’s 2:30 pm, I’ve been awake for a little over an hour, and this is turning out to be a very miserable day. Actually, take that back, it’s an aggressively bad day. The context for this is that I managed to sprain my ankle pretty badly over the weekend. So I figured there are two common sets of “bad things” in daily life: things that are annoying, and things that hurt. Sprained ankles have a way of taking the set of things that are annoying and making them also lie in the set of things that hurt. Trying to fall asleep is one of those things. I’ve been having trouble sleeping because, despite the ankle not hurting much during the day anymore, it always manages to throb at night. So I wake up to get a drink, hobble over to the sink, and then lie awake for another couple hours.

So to start my day, last night I forgot to reset my alarm clock which had been wiped by a power outage. Anyone who knows me knows the results of this: rather than waking up at 10am, I woke up at 1:15pm (and I’m lucky it wasn’t 4pm) and missed an important meeting. I feel totally shitty about this. So I stormed off to the shower (well, limped aggressively), and turned it on w/o thinking. It was freezing cold because I forgot to warm it a little first. In my panic, I put more weight than I should have on my hurt ankle, and fell. My head just missed the faucet, but I did manage to hit my head into the wall and felt dazed for a minute or two.

After a hasty shower (which I hate and makes my eyes feel sleepy the rest of the day, but I hold out hope that I won’t entirely miss all the meeting) I go to get tylenol and a drink of water from the kitchen. In the process, I knock over the knife block and it falls to the floor. One of the knives hits handle first with the weight of the block behind it and the blade bends. Fortunately one of the cheaper crappy knives, but I still have enough scotsman in me to be very grumpy about this.

And to add insult to injury, I get stuck behind a dump truck going 20 mph for 2/3 of my commute. Normally I’m very intentional about not getting worked up about this sort of thing, because it doesn’t really matter, but I get really annoyed. This only makes things worse. Of course, it’s also one of those ratty diesel things, so I’m stuck between sucking down fumes or roasting in the car with the windows shut and only internal ventilation on (no A/C).

So it’s now 2:41pm. I’ll probably be at work until midnight, and head straight to bed. That gives the day 9 more hours to take me down.

Apparently, according to an article in the Economist, cicadas have prime-numbered life cycles of 17 or 13 years. Simplistically, when the number of prey increases, after some time lag the number of predators increases, driving the number of prey down, resulting in equilibrium. Call this a smoothly varying population. Prey that have a cycle where they spike every n years rather than smoothly varying have a leg up on a smoothly varying predator. When cicadas bloom there is a number of predators appropriate to no cicadas. They reproduce before predator numbers rise, and then disappear. Effectively, non-smoothly varying populations can take advantage of the time lag before predator numbers rise to match prey.

This results in selection pressure for predators that have the same length of cycle as the prey. While same length is perfect, a predator cycle that is a factor of the prey length will also work. E.g. if prey has a cycle of 6 years, a predator with a cycle of 3 years can still arise in numbers to consume the prey. So the problem, from the predator phenotype’s non-existent perspective, is to guess (through random mutations, etc) the cycle length that overlaps most frequently with the prey. Factors of the prey’s cycle length will, of course, overlap more frequently. The best length is the largest factor of the prey’s cycle length, namely the prey’s cycle length itself. (As an aside, interesting abstract algebra connections with cyclic groups, etc.)

From the prey phenotype’s non-existent perspective: It now becomes an information hiding game. Given a cyclic group of order t (constant time between cycles), how do we minimize the overlap with groups of all other orders? The answer is to choose a large t with the fewest factors, i.e. to choose a long cycle that is also a prime number. Cicadas’ long prime cycles are a very rudimentary form of encryption to keep random mutations in predators from “guessing” a compatible cycle length. Cool!
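
To make the factor argument concrete, here’s a quick Python sketch. It’s a toy model under assumptions of my own, not anything from the Economist article: predator and prey both peak in year 0, the prey emerges every t years, and a predator with cycle p peaks every p years. The function names are made up for illustration.

```python
from fractions import Fraction
from math import gcd


def matching_predator_cycles(prey_cycle):
    """Shorter predator cycles (2 .. prey_cycle - 1) that peak on EVERY prey emergence.

    In this toy model that happens exactly when the predator cycle
    divides the prey cycle.
    """
    return [p for p in range(2, prey_cycle) if prey_cycle % p == 0]


def overlap_fraction(prey_cycle, predator_cycle):
    """Fraction of prey emergences that land on a predator peak.

    The two cycles line up every lcm(prey, predator) years, so one prey
    emergence in every predator_cycle / gcd(prey, predator) is met.
    """
    g = gcd(prey_cycle, predator_cycle)
    return Fraction(g, predator_cycle)


# Compare composite and prime cycle lengths side by side.
for t in range(12, 19):
    print(t, matching_predator_cycles(t))
```

Running it for cycle lengths 12 through 18, the composite lengths each pick up several shorter predator cycles that hit every emergence, while 13 and 17 come up empty. A predator with a 3 year cycle meets every emergence of a 6 year prey, but only one in three emergences of a 17 year prey.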

Now of course, using a non-constant function for time_between_cycles(cycle_number) would work even better. And in some sense cicadas have that too by having two different cycle lengths. According to the Economist article, populations have even been observed shifting from a 17 year to a 13 year cycle in response to selection pressures caused by a fungus that developed a 17 year cycle.

Argument In Brief

  1. Microsoft’s C#/CLI licensing people, at high levels, are aware of us.
  2. Microsoft can choose to do damaging things in the current C#/CLI licensing ambiguity.
  3. Microsoft considers the free software / Linux community to be a major competitive threat
  4. Microsoft does not “compete” gently
  5. (1) + (2) + (3) + (4) = ?

The word pile amassed below defends points (1) and, in particular, (2). I take points (3) and (4) as given. I leave point (5) an exercise for the reader. 😉

Stupid Disclaimer

Since I’m not a lawyer, I don’t know if these disclaimers are important. But given the nature of the topic, I’ll play it safe and write one. I’m not a lawyer, and this ain’t legal advice, it’s just a dump of my current thinking on an issue. It does not represent my employer’s opinion. It may represent my cat’s opinion, but only on the second Tuesday of summer months.

Restatement of the Issue

Miguel has repeatedly stated that the patents necessary to implement the standards ECMA-334 (C#) and ECMA-335 (CLI) are available from Microsoft “RAND + Royalty Free”. This seems like an effective open patent grant and encouraged me initially that we could do Mono. I really like Mono. It’s terrific technically, and I’d love to be able to use it. But two problems have come up on further consideration over the past couple of months:

  1. I’ve not seen an official statement by Microsoft that will let me trust the royalty free assertion. I think we are remiss if we do not assume Microsoft is looking for ways to, quite frankly, screw us. So unless there is a statement from Microsoft that they will have to stick to in a court, I feel (at the very least) uncomfortable.
  2. “RAND + royalty free” can still seriously screw Free Software. I think this is more important than the first point. Even with RAND + royalty free you still have to execute a license agreement with Microsoft, and license agreements can stipulate things that are RAND from a corporation’s perspective but still screw over Free Software. Also, there is evidence that key Microsoft people are already aware of (or planned?) incompatibilities between the licensing scheme for C#/CLI and, at least, the GPL. The eye of Sauron is upon us. RAND + royalty free is very different from a patent grant.

In short, we are in an adversarial situation. Microsoft does not want us to succeed. Thus we cannot trust Microsoft, even if we’d like to, and must consider Mono based upon the question “What is the worst thing MS can reasonably do?”. We can only trust Mono if we are convinced Microsoft doesn’t have weasel room. The current situation appears, to me, to have lots of weasel room. The technical merits of Mono are basically irrelevant if it’s a trojan horse in the long term.

The Horror Story

So here’s the obligatory horror story based upon what I see as our current course. Actually, I don’t think this is taken to extremes at all. The GNOME actions look to me like the path we are currently on, and the Microsoft actions are not out of character, and look legally tenable based on what I know today. Microsoft can choose to not exercise these actions, but they will have the possibility (and will be more likely to the more successful the Linux desktop is).

  • Act 1 – Novell hackers continue to push Mono. Novell hackers code most new independent programs/functionality in Mono and gradually start writing extensions to software like Evolution in Mono. Evolution’s core continues to remain Mono free, but if you want features X, Y, and Z you have to use Mono. A few GNOME hackers write apps in Mono, some as toys, and perhaps a couple more serious. Red Hat hackers complain. Some try to weakly push Java and some stick with working in C & Python. Sun makes noise, and does their own thing, starts some wacky projects, tries to push Java with OpenOffice.org, and is generally ineffectual.
  • Act 2 – As the number of Mono-only features grows, Red Hat’s unwillingness to ship Mono begins affecting sales. Novell holds a competitive advantage (self-inflicted by Red Hat) because Red Hat-written features can be shipped by SuSE, but Novell written features require Mono. A couple years down the line, Red Hat caves and begins shipping Mono. Evolution or some other major GNOME application begins to convert their core to Mono. Maybe a couple do. GNOME starts to move toward Mono.

So far, no real problems. We’ve got a better technical infrastructure, and new features are developed more quickly. There are some road bumps and schedule slippage as major GNOME apps (or core) begin to convert pieces more aggressively to Mono. There might be a loss of focus on user features for a while, such as happened with GNOME 2, but it won’t be terribly bad, and the gains will be substantial.

  • Act 3 – It’s been 4 years. Desktop Linux has made a large impact on the market, and Microsoft is even more determined. Large pieces of GNOME are written in Mono, and other parts of the Linux stack are considering it. Some may already be using it. Microsoft starts gently nudging companies, reminding them that they are required to license the C#/CLI patents. Novell already has a license so it can distribute Mono, and Red Hat is in the process of finishing the agreement with MS. (As an aside, notice that this doesn’t totally screw over corporations, so beware treating their willingness or unwillingness to use Mono as a useful indicator of whether Mono is safe).
  • Act 4 – Eventually Microsoft starts dropping barbs, saying things in the press, etc reminding people that to distribute C#/CLI implementations you need a license from Microsoft. It slowly works up to the point that they’ve made it very clear that individual contributors not working for their corporation etc all need to execute license agreements with Microsoft. In the best case, these can be done by individuals, in the worst case, RAND excludes license agreements that are “too small”. In either case, people have to work with Microsoft to get a license (who stalls and takes a long time) and agree to terms that include restrictions on sub-licensing. Microsoft uses other license features to exert leverage in irritating ways. In the worst case (and this is unlikely for MS PR reasons) Microsoft actually drops the royalty free bit.

At the end of the day, big chunks of GNOME are based upon technology that is substantially encumbered. Microsoft has used the tactic of allowing technically illegal behavior and only later coming down to exert control / extract money in the past. For example, from an article by Dave Malcolm, “Chris Williams, Microsoft’s director of product development, explained his attitude to software piracy in the Far East: ‘We’re just flooding the market with copies… The goal is… that when people actually end up having to buy software, they [will] already know our software and it’s the one they will have to buy when the laws get passed. We’re basically getting market share. As soon as we start to get a return on that investment, it will be humongous’.”

From a paranoid conspiracy theory perspective, the current ambiguity affords Microsoft the most future possibility. I don’t consider it ridiculous that it could even be an intentional trojan horse (of course, it’s dangerous whether intentional or not). If they came out and declared C#/CLI unencumbered in a satisfactory way, we would adopt Mono and life would be good. If they came out and gave the license terms which were in fact damaging, we would not adopt Mono and life would be OK. By providing just enough hooks to make those of us who really like the technology ignore the danger, but without providing details or statements that stand up in court, we buy into Mono without Microsoft having to give up a useful hold over us.

Can We Trust it will Always be Royalty Free?

The number of online MS-affiliated official-seeming sources (that I can find) suggesting that Microsoft will offer necessary patent licenses under royalty free terms is very small. I do not doubt that Microsoft will offer royalty free licenses at the present date. However, I can’t find strong evidence that would legally lock them in (serving as promissory estoppel, or something like that) to providing the patent licenses royalty free in the future.

The Mono FAQ links to a posting to the “dotnet-sscli” mailing list from Jim Miller, one of the CLR architects. The relevant sentence (a fifth of the message!) is:

“But Microsoft (and our co-sponsors, Intel and Hewlett-Packard) went further and have agreed that our patents essential to implementing C# and CLI will be available on a “royalty-free and otherwise RAND” basis for this purpose.”

The message contains almost no detail. Further, and perhaps more importantly, it doesn’t seem to involve Dr. Miller representing Microsoft in an official capacity. In fact, in the first sentence he says “…I’d like to explain why I’ve never felt [emphasis mine] the two are in conflict.” To me the whole message is premised as being a personal opinion. It is not Microsoft’s official promise to provide “royalty-free and otherwise RAND”. This could probably serve as a bit of evidence that Microsoft was presenting its licensing terms as royalty free. It’s not the smoking gun, and probably won’t serve as promissory estoppel. In short, this doesn’t seem like the sort of evidence we should trust to protect us as a Free Software project.

Miguel wrote in a message to desktop-devel, “But lets not waste our time on this discussion on the mailing list, for legal matters, you should get legal counsel. Have your lawyers engage Microsoft on this topic, that is the only way of getting a solid answer.”

It’s not enough that Red Hat or Novell’s lawyers can call Microsoft and be told “we will license this to you royalty free”. As a Free Software project, I want legal weight with public accountability to hold Microsoft to royalty free, not “call Microsoft and they’ll tell you”. That means some sort of public (web, preferably) legally binding page that says: we will offer the technology to anyone on these terms. Given the need to still execute a license, knowing it’s royalty free isn’t enough. I think we need a public statement as to the terms of the license itself.

The other source of a semi-official Microsoft statement about being royalty free, that I’ve found, is a ZDNet article from 2002 by David Berlind. He talks to Michele Herman, “Microsoft’s director of intellectual property”. She states that Microsoft “…will be offering a conventional non-royalty non-fee RAND license”. This is a pretty good source, and if it was on an official Microsoft PR website, I would agree Microsoft is probably locked in to royalty free. I’m a little more dubious about it coming from a magazine article, esp. given how historically off the mark I’ve found magazine articles to be about technology. Nonetheless, I do find it more reassuring than Dr. Miller’s message. I don’t know if this would hold up in a court, but it’s a lot closer.

Now the message from Dr. Miller does suggest that Microsoft (and Intel and Hewlett Packard) have made an official agreement somewhere to provide the patents “royalty-free and otherwise RAND” (perhaps on some ECMA form somewhere???). I would absolutely love to see that in some sort of official form! I’ve looked, but it doesn’t seem to be available online. If somebody has a solid source (online or otherwise), on a Microsoft web page, some sort of ECMA statement or form or assurance, or in a direct verifiable form from a Microsoft spokesperson (not processed through some reporter), please email me.

RAND + Royalty Free Isn’t Enough

ECMA pretty clearly requires companies to agree to license patents under RAND (Reasonable & Non-Discriminatory) terms. If you don’t, you get booted out of ECMA. ECMA explicitly does not define what is and what is not RAND license terms. If I were a corporation, I would still feel pretty safe here, because I believe there is sufficient legal precedent defining what is reasonable and what is non-discriminatory (in the context of dealing with licensing to other companies). I am convinced Microsoft is going to provide licenses to C#/CLI patents under terms that corporations will find perfectly acceptable. So unlike the “royalty free” part where things look murky from what I know now, it looks like Microsoft is locked into “RAND”.

Unfortunately, what is reasonable and non-discriminatory toward a corporation may not prove particularly reasonable for, and may discriminate against, Free Software. To my knowledge, there’s no precedent suggesting that you have to accommodate Free Software to avoid being discriminatory. RAND, as I understand it, is clearly premised for a corporate context.

So let’s assume that Microsoft was locked into “royalty free”, and will provide “RAND + Royalty Free” license terms on C#/CLI to anyone, anywhere, for all time. What can MS do to Free Software now? RAND + Royalty Free still means you have to get a license. Licenses can stipulate things. And that’s where the problems come in. Big problems. Compared to this, the question of Microsoft’s commitment to “royalty free” seems like small fry.

Having to get a license at all is a major burden for hobbyists, individual contributors, and even small companies. At the very least, it’s irritating. At worst, if acquiring a license takes a bunch of paperwork and two months, it will very effectively deter (legal) individual contribution.

Interestingly, in the aforementioned ZDNet article, Herman (MS director of IP) provides a number of quotes about what the royalty free + RAND license might stipulate in the case of C#/CLI:

  • “Reciprocity”: Herman says this means, “This is where I say I will license a royalty-free license to my essential patents, and in return I expect you to license your essential patents to me on an royalty-free basis.” Now GNOME doesn’t have any patents to license back to MS, but what this does suggest to me is that getting this license entails a real legal agreement where parties negotiate back and forth. It’s not some shrinkwrap “sign on the dotted line” deal where MS is prepared to rubber stamp anyone who wants a license. You still gotta play with Microsoft, even if RAND means they have to play nice with other companies.
  • “Defensive suspension”: They have the right to revoke the license if you sue them. Pretty standard, though still something I’d rather not have to agree to.
  • “Field of use limitation”: you only get the patent license for implementing the standard. Not a huge irritation practically, though it is GPL incompatible.
  • “Sub-licensing prohibition”: you can’t transfer the license, or use somebody else’s license. This is the major practical problem. The sub-licensing prohibition raises all sorts of issues as to who has to get a license when mixed with free software. Do you only need a license for distribution (ala GPL)? Or do you need a license for each user (ala MP3)? If Microsoft wanted to really screw Free Software, could they require “per user licensing” (albeit royalty free)? In either case, each person, org, and company that wants to redistribute the software will probably have to license directly from MS (and given the indications from reciprocity, this probably isn’t a totally trivial process).

Herman (remember, MS director of IP) explicitly says, “the field of use (…) and the prohibition on sub-licensing are inconsistent with the requirements of Sec. 7 of the GPL. Sec. 7 of the GPL says that if you do not have the rights to distribute the code as required under the GPL then you do not have the right to distribute at all. The GPL says you must have the rights to sublicense and to freely modify outside the field of use limitation.” The GPL incompatibility presumably isn’t a big problem for Mono since (I think) it’s under an X style license, and GPL’d apps can still run atop it. However, this underscores that Microsoft knows full well that their particular terms have interactions with free software. Given the potential for sub-licensing to wreak havoc (as outlined above), I’m very worried that we, the free software community, are not flying “under the radar”.

Until I read that quote, I thought there was basically no chance this was an intentional trojan horse, that this was all just dangerous possibility. I’m not so sure anymore.

In conclusion, I refer you back to my opening argument.

Ambiguous Detriment

April 8, 2004

So I got a steering wheel and pedals that are compatible with Linux for $25. No force feedback, but makes TORCS a lot more fun.