Archive for June, 2004

Storage Talk

Tuesday, June 29th, 2004

Unfortunately my ankle was fractured pretty badly and it was important that I have
surgery on Wednesday. This precluded my flying to Norway for GUADEC
on Saturday. I actually proposed to my orthopaedic surgeon that I fly to Norway on Saturday anyway.
He gave me a look that was darker than oil at midnight, and went back to what he was doing
without saying anything. Some people tell me I should have interpreted this as a “sounds ok”.
However, he later said some things about our goal being to “reduce the chance of having
arthritis in the ankle for the rest of your life”. That scared me into behaving.

There’s a more formalish storage paper for the occasion here. But honestly, I think the speaking notes are
more informative for getting at the soul of the material. In my experience that’s often true
of talks vs. accompanying papers. So I’m including my speaking notes here. I blame OxyContin for any incoherent bits. They’re a little random but I hope you press through,
because some of the good stuff is near the middle/end ;-) . Maybe I’ll do sketches on whiteboards for all the places I was going to do live sketches, and take pictures of them, but for now the notes are all booooring woooords. Unfortunately in many cases the sketches are the meat of the thing, but I think you can get some idea of what I’m talking about from the text. I’ve fleshed out the notes in some places where they were totally incomprehensible:

  • Storage is designed to support a
    more general user experience than just “find files more easily”.
    Storage isn’t a silver bullet, but it can serve as a toolkit for
    making new user experiences easier to extend across the desktop. In
    the process it helps dissolve the application/desktop boundary a
    little.

The Experience

  1. Intro:
    Related to many existing systems

    1. Wiki –
      anybody can edit or work with information. Information is not super
      formal to start with, but can become “formalized”. Unlike wiki,
      allow for rich in place editing and better tie in to the OS for
      noticing changes and tracking “change threads” (which are
      themselves communication often).

    2. Whiteboard –
      support quick informal live collaborations. Don’t force things into
      a particular “format” or medium but allow people to mix it up.
      Share a space with lots of presence information, etc. Also envision
      this working when people are in the same place.

    3. Groupware –
      handle objects that people need to deal with to get their job done.
      People, teams, projects, tasks, deadlines. These are more central
      to knowledge workers than even documents. Like groupware, track
      threads of communication, but don’t tie people down to text
      messages. Let them respond with people, projects, tasks, etc.
      Rather than “posting to lists” you just append items to a topic
      in the (or a) central store.

    4. Bugzilla –
      tasks, and schedules, process, status, owner, etc. Track more
      interesting metadata in a way that people can shape to their
      organization.

  2. Build “objects people care
    about”

    1. This is more about what gets
      built on top of Storage, but it’s a major part of the overall
      experience. The file manager (atop the filesystem) is about
      managing formal documents and folders to group documents in large
      concrete chunks. The <some name here> (atop storage) should
      focus on objects that fill people’s daily lives.

    2. People, Projects, Teams, Tasks,
      Messages, Topics, Discussions, Managers, Proposals, etc, etc, etc
      (and yes, Documents too) are objects people care about. Many others
      that are specific to particular industries and job roles. Some of
      these objects currently live in specialized applications like
      evolution, and most of these will still be handled primarily
      through a specialized interface. <sketch the two specialized
      interfaces>.

    3. It’s usually a good idea to have
      specialized tools for targeting specific use cases.

    4. OTOH, although we work on text
      documents mostly in the office suite, we still expose common
      operations to the base OS (the filemanager mostly in this case).
      How can we extend the set of useful things that can be done with
      information across the information boundary? In a less generic
      sense, can we build support for the objects people deal with on
      a day to day basis more deeply into the OS? It doesn’t have to
      be done by a universal component system, but base libraries like
      storage can make it easier to support the important “one off”
      optimizations in the base OS (such as for projects).

  3. Support informal work

    1. Most office applications are
      focused on producing deliverables: formal documents. But
      deliverables are the exception. Most knowledge workers spend most
      of their time processing, sharing, and extending information not
      producing deliverables. We want to build interfaces that allow for
      some degree of information soup. <sketch the process flow for
      organizing SubsByTheInch2005>

    2. Informal work can eventually turn
      into formal deliverables. Make this process as convenient as
      possible.

  4. Information is information, don’t
    force large chunks

    1. We currently have odd
      granularities of information. “Files” in the case of “formal
      documents” (but since we don’t have informal constructs, many
      things are pushed into this).

  5. Access items within large bodies
    of information

    1. The storage “research-y”
      solution to this is object reference using human language phrases

    2. This aspect of storage still
      interests me, and has been where most of the work has gone until
      now… but it is more research-y because it may turn out to be
      technically infeasible (jury is still out ;-) ). As such, other
      parts of storage are not predicated on it.

  6. Provide the components for
    collaboration

    1. If storage is the physics, social
      interaction is the chemistry. Storage needs to provide some very
      basic structures that will give rise to social interactions (when
      people, environments, tasks, etc. are thrown into the mix). Rather
      than trying to control things rigidly, as traditional computer
      environments have done, we allow social behaviors to regulate
      things more (as things work normally outside computer world).

    2. Presence information is the
      substrate for coordinating social interactions. Who is where and
      doing what is the most relevant context for social interactions.

    3. Access by multiple
      threads/computers/people. Rather than “versioning” documents
      and the associated problems (e.g. merging is a nearly insoluble
      UI problem) we allow “live” (or at least effectively live) access
      to documents.

    4. Fine granularity. If we have
      access from multiple places, the temptation is to use locking of
      “documents”. Even inside formal documents, however, this will
      greatly limit collaborative ability. If we have rich fine grained
      presence information, combined with very fine grained data access,
      we can provide the ability to socially manage interactions rather
      than requiring “forced” lockouts.

  7. Track information flow

    1. E-mail showed the importance of
      threads of communication between people. An e-mail
      thread morphs into a task (like a bug), which morphs into a few
      more tasks (which might have discussions associated with them),
      which turns into a full fledged project with an associated team,
      which eventually produces a policy document. All this stays tied
      together. <show interface idea>

A Brief History (aka excuse):

  • Storage was initially
    implemented as project Gargamel by a team of Stanford CS (and one
    EE, and yours truly) students as a senior project. Brian Quistorf,
    James Farwell, Khalil Bey, Josh Radel. It gets to a nice demo-able
    point before they graduate.

  • It gets even more
    finished as I work on it after graduation while not looking for
    work. Web page is written, screenshots made, etc.

  • I foolishly decide to
    rewrite the NL parser (and lose the old CVS history when importing
    to cvs.gnome.org). I get sidetracked writing the NL parser.

  • Slashdot etc hit.
    Lots of developer interest, but I’m snowed for other reasons and
    don’t successfully get development moving with other people. Plus I
    still have to finish the NL rewrite before things will function
    again.

  • The summer is
    completely crazy, and I stop working on Storage for 8 months.

  • Today: NL rewrite is
    now done. It’s a much stronger foundation, but the semantic grammar
    is still small. However, even with the small grammar it can do very
    sophisticated (correct) interpretations of phrases like “songs
    that aren’t by ‘John Lennon’ but have the word ‘love’ in them”.
    This would be very difficult to parse with a traditional “naive”
    scavenging search interpretation. Marco is also contributing to
    Storage, as well as some other Epiphany dudes. Things are starting
    to pick up, and I’m determined to not kill storage by bottlenecking
    again. I’m looking for a “project manager”.

What’s there today:

Non-NL

  • storage-store
    – manages the postgresql server, handles notification

  • libstorage
    – GObject interface to store items

  • libstorage-translators
    – serializes / deserializes data streams from / to storage items

  • GnomeVFS module
    – automatically invokes translators on read/write into the store,
    allowing existing GNOME apps to use the store like a normal
    filesystem

NL

  • PET
    – parses sentences into Head-driven Phrase Structure Grammar (HPSG)
    trees, by Dr. Ulrich Callmeier.

  • libmrs
    – interface to the Minimal Recursion Semantics (MRS) information in
    the HPSG tree

  • libmrs-converters
    – translates MRS into a more meaningful XML statement using a
    client chosen semantic grammar

  • libstorage-nl
    – translates using storage-specific semantic grammar into the
    intermediate XML form, and then to an SQL query
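
To give a feel for the last step in that chain, here is a minimal sketch of compiling a structured query into SQL. Everything in it is an assumption made for illustration: the nested-tuple form stands in for the intermediate XML, the `songs` table and its `artist` and `title` columns are hypothetical, "in them" is read as "in the title", the sample rows are made up, and SQLite is swapped in for PostgreSQL so the example runs on its own. It is not Storage's actual format or schema.

```python
import sqlite3

# Hypothetical structured form of the phrase:
#   songs that aren't by 'John Lennon' but have the word 'love' in them
QUERY = ("and",
         ("not", ("eq", "artist", "John Lennon")),
         ("contains", "title", "love"))

# Only these (hypothetical) columns may appear in a generated clause.
ALLOWED_COLUMNS = {"artist", "title"}


def to_sql(node):
    """Compile a nested-tuple filter into (where_clause, parameters)."""
    op = node[0]
    if op == "and":
        parts = [to_sql(child) for child in node[1:]]
        clause = " AND ".join(f"({c})" for c, _ in parts)
        params = [p for _, ps in parts for p in ps]
        return clause, params
    if op == "not":
        clause, params = to_sql(node[1])
        return f"NOT ({clause})", params
    if op in ("eq", "contains"):
        _, column, value = node
        if column not in ALLOWED_COLUMNS:
            raise ValueError(f"unknown column: {column}")
        if op == "eq":
            return f"{column} = ?", [value]
        return f"{column} LIKE ?", [f"%{value}%"]
    raise ValueError(f"unknown operator: {op}")


def run_demo():
    """Run the compiled query against a few made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE songs (artist TEXT, title TEXT)")
    conn.executemany(
        "INSERT INTO songs VALUES (?, ?)",
        [("John Lennon", "Love"),
         ("The Beatles", "All You Need Is Love"),
         ("The Beatles", "Yesterday"),
         ("Queen", "Somebody to Love")],
    )
    where, params = to_sql(QUERY)
    sql = f"SELECT artist, title FROM songs WHERE {where} ORDER BY title"
    return conn.execute(sql, params).fetchall()


if __name__ == "__main__":
    print(to_sql(QUERY))
    print(run_demo())
```

The point of the structured form is that the negation survives: a keyword-scavenging search would see "John Lennon" and "love" as two terms to match, while the compiled clause excludes one and requires the other.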

What’s in the near future:

  • Currently libstorage,
    the VFS module, and some translators directly access the postgresql
    server. This is undesirable: it means permissions on a shared store
    would have to be enforced using a collection of SQL views, it means
    locking becomes very tricky, and it means that libstorage and other
    things link directly against postgresql libraries (though this could
    be addressed by gnome-db).

  • Support for NL
    searches in select non-English languages (probably Spanish first,
    but perhaps Japanese). Storage is built on a “language neutral
    framework”, but grammar engineering is a very
    difficult task. Some of the availability of NL searches will depend
    on what the linguistics community produces and distributes freely.

  • A
    nifty collaborative application to provide a test bed for the
    collaboration/locking framework. <sketch collaborative
    whiteboard/wiki design> (also shows informal work) Ideas? ;-)

<demo NL search interface>

<show NL slides and explain basic NL
process>

Adventures in Gimpin’

Sunday, June 20th, 2004

A sliver more than two weeks ago I “sprained” my left ankle playing barefoot soccer. The ankle felt like it was getting better, but as swelling receded my foot felt awful. I could step on it, but the first few steps hurt like crazy. Other steps hurt but I didn’t have to brace myself for them. I finally caved and decided to see a doctor.

Unfortunately, as predicted, this turned out to be a very frustrating affair. I know some people like lots of choice in doctors, etc. But I sort of like the Kaiser Permanente (HMO? in CA) model where they have their own big buildings with everything in them. You show up, and they’ll figure out what to do with you. Anyway, I called the Blue Cross Blue Shield of North Carolina (*sigh*, since RH is based in Raleigh) advice nurse line twice. Both nurses were confident I should go to an urgent care facility, and were basically unwilling to believe there are no urgent care facilities here.

There are tons in Connecticut, there are tons in Rhode Island, there are tons in North Carolina, and there are tons in California. There are almost no urgent care facilities in New Hampshire or Massachusetts. Some puritans probably made a law against them a few hundred years ago. Or maybe it’s because human life is oh so critically important that even non-emergencies should go to the emergency room “just in case”. Or maybe it’s because the NE sucks in general. I dunno.

I finally decided to just drive to Rhode Island, because I really hate the thought of going to an emergency room for a non-emergency. Despite having to clutch with my hurt ankle/foot (which fortunately you don’t have to do much on an interstate… just stay in 5th), it was actually a very positive drive. I was feeling pretty blue, and driving into the sunset in the outdoors is really nice. So I drive across Massachusetts and show up at this small town urgent care center. They X-Ray my foot, and my ankle. Nothing seems wrong, which surprises them given how my foot looks. Anyway, they X-Rayed my leg and it turns out my fibula (the small bone of the lower leg) is pretty badly fractured. So they splinted the area, and told me to go to an orthopaedic surgeon.

Lovely. Very strange that the pain was in my foot. I’m still a little paranoid that there’s an occult fracture of the fifth metatarsal causing the foot pain. So the good side to all this is they gave me the X-Rays to take to the orthopaedic surgeon. I’ve been studying them and reading medical research papers from medline about what I see. I’m finding this very interesting. Ankle fractures (and sprains) turn out to be extremely varied. Looking at the damage from lots of different angles has also made it possible to reconstruct in more detail how I must have fallen.

Anyway, I don’t have a scanner but I (very appropriately) gimped an online X-Ray of a healthy ankle to be a fairly good replica of mine. I cheated a little because I made it look like my posterior projection of my left foot. The online image is, I believe, a front projection of a right foot. From other projections it looks like this may be a spiral fracture, but from this projection it looks mostly like an oblique fracture. I’m not really sure either way, they apparently often look very similar from non-axial projections. I also labelled some stuff to give bearings.

The yellow areas are (from left to right) the lateral and medial malleolus. Those are the bony bumps on the left and right of your ankle. The pink area is the tibiofibular syndesmosis, which connects the fibula (smaller bone) and tibia (the larger weight bearing bone) together. Sprains are often a result of stretching this. Anyway, because the fracture is proximal to the tibiofibular syndesmosis, this is probably a supination with external rotation (Weber B). That means the injury probably occurred with the weight leaned on the outside edge of the foot, and then the foot was rotated. It is possible that it’s pronation with external rotation (a form of Weber C).

So the bad news is that most Weber C injuries require open reduction (reduction is placing the bones so they align for healing). That would mean cutting my poor ankle open, and possibly even using syndesmotic screws that would have to be removed some weeks later :-/ The other problem with open reduction, besides the fact I’d need surgery, is that studies of outcomes suggest that open reduction results in a far slower recovery and goes awry far more often. With any luck it’s a Weber B.

What I Did Today

Thursday, June 10th, 2004

It’s 2:30 pm, I’ve been awake for a little over an hour, and this is turning out to be a very miserable day. Actually, take that back, it’s an aggressively bad day. The context for this is that I managed to sprain my ankle pretty badly over the weekend. So I figured there are two common sets of “bad things” in daily life: things that are annoying, and things that hurt. Sprained ankles have a way of taking the set of things that are annoying and making them also lie in the set of things that hurt. Trying to fall asleep is one of those things. I’ve been having trouble sleeping because, despite the ankle not hurting much during the day anymore, it always manages to throb at night. So I wake up to get a drink, hobble over to the sink, and then lie awake for another couple hours.

So to start my day, last night I forgot to reset my alarm clock which had been wiped by a power outage. Anyone who knows me knows the results of this: rather than waking up at 10am, I woke up at 1:15pm (and I’m lucky it wasn’t 4pm) and missed an important meeting. I feel totally shitty about this. So I stormed off to the shower (well, limped aggressively), and turned it on w/o thinking. It was freezing cold because I forgot to warm it a little first. In my panic, I put more weight than I should have on my hurt ankle, and fell. My head just missed the faucet, but I did manage to hit my head into the wall and felt dazed for a minute or two.

After a hasty shower (which I hate and makes my eyes feel sleepy the rest of the day, but I hold out hope that I won’t entirely miss all the meeting) I go to get Tylenol and a drink of water from the kitchen. In the process, I knock over the knife block and it falls to the floor. One of the knives hits handle first with the weight of the block behind it and the blade bends. Fortunately it’s one of the cheaper crappy knives, but I still have enough Scotsman in me to be very grumpy about this.

And to add insult to injury, I get stuck behind a dump truck going 20 mph for 2/3 of my commute. Normally I’m very intentional about not getting worked up about this sort of thing, because it doesn’t really matter, but I get really annoyed. This only makes things worse. Of course, it’s also one of those ratty diesel things, so I’m stuck between sucking down fumes or roasting in the car with the windows shut and only internal ventilation on (no A/C).

So it’s now 2:41pm. I’ll probably be at work until midnight, and head straight to bed. That gives the day 9 more hours to take me down.

Cicada Cycles and Encryption

Tuesday, June 1st, 2004

Apparently, according to an article in the Economist, cicadas have prime-numbered life cycles of 17 or 13 years. Simplistically, when the number of prey increases, after some time lag the number of predators increases, driving the number of prey down, resulting in equilibrium. Call this a smoothly varying population. Prey whose population spikes every n years rather than varying smoothly have a leg up on a smoothly varying predator. When cicadas bloom there is a number of predators appropriate to no cicadas. They reproduce before predator numbers rise, and then disappear. Effectively, non-smoothly varying populations can avail of the time lag before predator numbers rise to match prey.

This results in selection pressure for predators that have the same length of cycle as the prey. While same length is perfect, a predator cycle that is a factor of the prey length will also work. E.g. if prey has a cycle of 6 years, a predator with a cycle of 3 years can still arise in numbers to consume the prey. So the problem, from the predator phenotype’s non-existent perspective, is to guess (through random mutations, etc.) the cycle length that overlaps most frequently with the prey. Factors of the prey’s cycle length will, of course, overlap more frequently. The best length is the largest factor of the prey’s cycle length, namely the prey’s cycle length itself. (As an aside, interesting abstract algebra connections with cyclic groups, etc.)

From the prey phenotype’s non-existent perspective: It now becomes an information hiding game. Given a cyclic group of order t (constant time between cycles), how do we minimize the overlap with groups of all other orders? The answer is to choose a large t with the fewest factors, i.e. to choose a long cycle that is also a prime number. Cicadas’ long prime cycles are a very rudimentary form of encryption to keep random mutations in predators from “guessing” a compatible cycle length. Cool!
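
The factor argument is easy to check numerically. The sketch below is just my reading of it, under simplifying assumptions: prey and predator both peak in year 0, cycles are whole numbers of years, and a predator with a 1-year cycle is treated as the "smoothly varying" case and left out. The function names are made up for this example.

```python
from math import lcm  # available in Python 3.9+


def compatible_predator_cycles(prey_cycle: int) -> list[int]:
    """Shorter predator cycles (2 years and up) that divide the prey cycle.

    A predator with one of these cycle lengths peaks at every single
    prey emergence, assuming both start in sync.
    """
    return [p for p in range(2, prey_cycle) if prey_cycle % p == 0]


def years_between_meetings(prey_cycle: int, predator_cycle: int) -> int:
    """Years between peaks that coincide, assuming both peak in year 0."""
    return lcm(prey_cycle, predator_cycle)


if __name__ == "__main__":
    # Compare candidate prey cycle lengths around the cicada's 13 and 17.
    for t in range(12, 19):
        print(t, compatible_predator_cycles(t))
    # A 4-year predator against 16-year and 17-year prey.
    print(years_between_meetings(16, 4), years_between_meetings(17, 4))
```

Between 12 and 18 years, only 13 and 17 leave the predator with no compatible shorter cycle. A 4-year predator catches every emergence of a 16-year prey, but lines up with a 17-year prey only once every 68 years.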

Now of course, using a non-constant function for time_between_cycles(cycle_number) would work even better. And in some sense cicadas have that too by having two different cycle lengths. According to the Economist article, populations have even been observed shifting from a 17-year to a 13-year cycle in response to selection pressures caused by a fungus that developed a 17-year cycle.