Storage Talk

June 29, 2004

Unfortunately my ankle was fractured pretty badly and it was important that I have
surgery on Wednesday. That precluded my flying to Norway for GUADEC
on Saturday. I actually proposed to my orthopaedic surgeon that I fly to Norway on Saturday.
He gave me a look that was darker than oil at midnight, and went back to what he was doing
without saying anything. Some people tell me I should have interpreted this as a “sounds ok”.
However, he later said some things about our goal being to “reduce the chance of having
arthritis in the ankle for the rest of your life”. That scared me into behaving.

There’s a more formalish storage paper for the occasion here. But honestly, I think the speaking notes are
more informative for getting at the soul of the material. In my experience that’s often true
of talks vs accompanying papers. So I’m including my speaking notes here. I blame oxycontin for any incoherent bits. They’re a little random but I hope you press through
because some of the good stuff is near the middle/end ;-). Maybe I’ll do sketches on whiteboards for all the places I was going to do live sketches and take pictures, but for now the notes are all booooring woooords. Unfortunately in many cases the sketches are the meat of the thing, but I think you can get some idea what I’m talking about from the text. I’ve fleshed it out past the notes in some places where it was totally incomprehensible:

Storage is designed to support a
more general user experience than just “find files more easily”.
Storage isn’t a silver bullet, but it can serve as a toolkit for
making new user experiences easier to extend across the desktop. In
the process it helps dissolve the application/desktop boundary a
little.

The Experience

Intro:
Related to many existing systems
1. Wiki –
  anybody can edit or work with information. Information is not super
  formal to start with, but can become “formalized”. Unlike wiki,
  allow for rich in place editing and better tie in to the OS for
  noticing changes and tracking “change threads” (which are
  themselves communication often).
2. Whiteboard –
  support quick informal live collaborations. Don’t force things into
  a particular “format” or medium but allow people to mix it up.
  Share a space with lots of presence information, etc. Also envision
  this working when people are in the same place.
3. Groupware –
  handle objects that people need to deal with to get their job done.
  People, teams, projects, tasks, deadlines. These are more central
  to knowledge workers than even documents. Like groupware, track
  threads of communication, but don’t tie people down to text
  messages. Let them respond with people, projects, tasks, etc.
  Rather than “posting to lists” you just append items to a topic
  in the (or a) central store.
4. Bugzilla –
  tasks, and schedules, process, status, owner, etc. Track more
  interesting metadata in a way that people can shape to their
  organization.
Build “objects people care about”
1. This is more about what gets
  built on top of Storage, but it’s a major part of the overall
  experience. The file manager (atop the filesystem) is about
  managing formal documents and folders to group documents in large
  concrete chunks. The <some name here> (atop storage) should
  focus on objects that fill people’s daily lives.
2. People, Projects, Teams, Tasks,
  Messages, Topics, Discussions, Managers, Proposals, etc, etc, etc
  (and yes, Documents too) are objects people care about. Many others
  that are specific to particular industries and job roles. Some of
  these objects currently live in specialized applications like
  evolution, and most of these will still be handled primarily
  through a specialized interface. <sketch the two specialized
  interfaces>.
3. It’s usually a good idea to have
  specialized tools for targeting specific use cases.
4. OTOH, although we work on text
  documents mostly in the office suite, we still expose common
  operations to the base OS (the filemanager mostly in this case).
  How can we extend the set of useful things that can be done with
  information across the information boundary? In a less generic
  sense, can we build support for the objects people deal with on
  a day to day basis more deeply into the OS. It doesn’t have to
  be done by a universal component system, but base libraries like
  storage can make it easier to support the important “one off”
  optimizations in the base OS (such as for projects).
Support informal work
1. Most office applications are
  focused on producing deliverables: formal documents. But
  deliverables are the exception. Most knowledge workers spend most
  of their time processing, sharing, and extending information not
  producing deliverables. We want to build interfaces that allow for
  some degree of information soup. <sketch the process flow for
  organizing SubsByTheInch2005>
2. Informal work can eventually turn
  into formal deliverables. Make this process as convenient as
  possible.
Information is information, don’t force large chunks
1. We currently have odd
  granularities of information. “Files” in the case of “formal
  documents” (but since we don’t have informal constructs, many
  things are pushed into this).
Access items within large bodies of information
1. The storage “research-y”
  solution to this is object reference using human language phrases
2. This aspect of storage still
  interests me, and has been where most of the work has gone until
  now, but it is more researchy because it is prone to
  being technically infeasible (jury is still out ;-). As such, other
  parts of storage are not predicated on it.
Provide the components for collaboration
1. If storage is the physics, social
  interaction is the chemistry. Storage needs to provide some very
  basic structures that will give rise (when people, environments,
  tasks, etc. are thrown into the mix) to social interactions. Rather
  than trying to control things rigidly, as traditional computer
  environments have done, we allow social behaviors to regulate
  things more (as things work normally outside computer world).
2. Presence information is the
  substrate for coordinating social interactions. Who is where and
  doing what is the most relevant context for social interactions.
3. Access by multiple
  threads/computers/people. Rather than “versioning” documents
  and the associated problems (e.g. merging is a nearly insoluble UI
  problem) we allow “live” (or at least effectively live) access
  to documents.
4. Fine granularity. If we have
  access from multiple places, the temptation is to use locking of
  “documents”. Even inside formal documents, however, this will
  greatly limit collaborative ability. If we have rich fine grained
  presence information, combined with very fine grained data access,
  we can provide the ability to socially manage interactions rather
  than requiring “forced” lockouts.
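
To make the "presence instead of locks" idea concrete, here is a minimal sketch in Python. It is not Storage code: the `SharedDocument` class, its method names, and the choice of a paragraph as the unit of granularity are all assumptions made up for illustration. The point it shows is that an edit is never refused; the editor is simply told who else is present at the same spot, and the people involved sort it out socially.

```python
from collections import defaultdict


class SharedDocument:
    """Hypothetical document with per-paragraph presence and no locks."""

    def __init__(self, paragraphs):
        # Fine-grained data: each paragraph is addressable on its own.
        self.paragraphs = dict(enumerate(paragraphs))
        # Fine-grained presence: paragraph id -> set of people working there.
        self.presence = defaultdict(set)

    def leave(self, person):
        """Remove a person from wherever they were last seen."""
        for people in self.presence.values():
            people.discard(person)

    def focus(self, person, pid):
        """Record that `person` is now at paragraph `pid`.

        Returns the other people already there, so a UI could show them.
        """
        self.leave(person)
        self.presence[pid].add(person)
        return sorted(self.presence[pid] - {person})

    def edit(self, person, pid, text):
        """Apply an edit unconditionally and report who else is present."""
        others = self.focus(person, pid)
        self.paragraphs[pid] = text
        return others
```

In use, `doc.edit("bob", 0, "new intro")` goes through even while another person has paragraph 0 focused; the return value names that person, which is the information a lock would have hidden behind a refusal.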
Track information flow
1. E-mail showed the importance of
  threads of communication between people. An e-mail
  thread morphs into a task (like a bug), which morphs into a few
  more tasks (which might have discussions associated with them),
  which turns into a full fledged project with an associated team,
  which eventually produces a policy document. All this stays tied
  together. <show interface idea>

A Brief History (aka excuse):

Storage was initially
implemented as project Gargamel by a team of Stanford CS (and one
EE, and yours truly) students as a senior project. Brian Quistorf,
James Farwell, Khalil Bey, Josh Radel. It gets to a nice demo-able
point before they graduate.
It gets even more
finished as I work on it after graduation while not looking for
work. Web page is written, screenshots made, etc.
I foolishly decide to
rewrite the NL parser (and lose the old CVS history when importing
to cvs.gnome.org). I get sidetracked writing the NL parser.
Slashdot etc hit.
Lots of developer interest, but I’m snowed for other reasons and
don’t successfully get development moving with other people. Plus I
still have to finish the NL rewrite before things will function
again.
The summer is
completely crazy, and I stop working on Storage for 8 months.
Today: NL rewrite is
now done. It’s a much stronger foundation, but the semantic grammar
is still small. However, even with the small grammar it can do very
sophisticated (correct) interpretations of phrases like “songs
that aren’t by ‘John Lennon’ but have the word ‘love’ in them”.
This would be very difficult to parse with a traditional “naive”
scavenging search interpretation. Marco is also contributing to
Storage, as well as some other Epiphany dudes. Things are starting
to pick up, and I’m determined to not kill storage by bottlenecking
again. I’m looking for a “project manager”.

What’s there today:

Non-NL

storage-store – manages the postgresql server, handles notification
libstorage – GObject interface to store items
libstorage-translators – serializes / deserializes data streams from / to storage items
GnomeVFS module – automatically invokes translators on
read/write into the store allowing existing GNOME apps to use the
store like a normal filesystem
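
A rough sketch of the translator idea, in Python rather than the C/GObject the real libraries use. Everything named here is hypothetical: the `Translator` pair, the toy "headers, blank line, body" note format, the `text/x-note` type, and the `Store` class are stand-ins chosen for illustration, not the libstorage-translators API. It only shows the shape of the arrangement: a write into the store turns a byte stream into a structured item, and a read turns the item back into a stream, so an application that only knows about files never notices.

```python
from typing import Callable, Dict, NamedTuple


class Translator(NamedTuple):
    """Hypothetical translator: a stream parser and its inverse."""
    deserialize: Callable[[bytes], dict]
    serialize: Callable[[dict], bytes]


def note_to_item(data: bytes) -> dict:
    """Parse 'Key: value' header lines, a blank line, then a body."""
    head, _, body = data.decode("utf-8").partition("\n\n")
    item = {"body": body}
    for line in head.splitlines():
        key, _, value = line.partition(":")
        item[key.strip().lower()] = value.strip()
    return item


def item_to_note(item: dict) -> bytes:
    """Write the item back out in the same toy format."""
    headers = "".join(
        f"{key.capitalize()}: {value}\n"
        for key, value in sorted(item.items())
        if key != "body"
    )
    return (headers + "\n" + item.get("body", "")).encode("utf-8")


# Registry keyed by a made-up MIME type.
TRANSLATORS: Dict[str, Translator] = {
    "text/x-note": Translator(note_to_item, item_to_note),
}


class Store:
    """Stand-in for the store: keeps structured items, not byte streams."""

    def __init__(self):
        self.items: Dict[str, dict] = {}

    def vfs_write(self, path: str, mime: str, data: bytes) -> None:
        # What the VFS layer would do on write: stream -> item.
        self.items[path] = TRANSLATORS[mime].deserialize(data)

    def vfs_read(self, path: str, mime: str) -> bytes:
        # What the VFS layer would do on read: item -> stream.
        return TRANSLATORS[mime].serialize(self.items[path])
```

The design choice worth noticing is that the store never holds the original bytes. Once the stream is an item, its fields (author, title, and so on) are available to anything else that talks to the store.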
NL

PET – parses sentences into Head-Phrase Structure Grammar (HPSG)
trees, by Dr. Ullrich Callmeier.
libmrs – interface to the Minimal Recursion Semantics information in
the HPSG tree
libmrs-converters – translates MRS into a more meaningful XML statement using a
client chosen semantic grammar
libstorage-nl – translates using storage-specific semantic grammar into the
intermediate XML form, and then to an SQL query
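
To give a feel for the last step of that pipeline, here is a small Python sketch of turning an already-interpreted query into SQL. It skips the hard part entirely (PET, MRS, and the semantic grammar) and starts from a hand-built intermediate form. The `Constraint` and `Query` classes, the `items` table and its columns, and the use of SQLite in place of PostgreSQL are all assumptions for the sake of a runnable example; the real intermediate form is XML and the real schema is different. The query is the “songs that aren’t by ‘John Lennon’ but have the word ‘love’ in them” phrase from earlier.

```python
import sqlite3
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Constraint:
    """One condition from the (hypothetical) intermediate form."""
    kind: str          # "artist" or "title_word"
    value: str
    negated: bool = False


@dataclass
class Query:
    item_type: str
    constraints: List[Constraint] = field(default_factory=list)


def to_sql(query: Query) -> Tuple[str, list]:
    """Build a parameterized SELECT from the intermediate form."""
    clauses, params = ["type = ?"], [query.item_type]
    for c in query.constraints:
        if c.kind == "artist":
            cond = "artist = ?"
            params.append(c.value)
        elif c.kind == "title_word":
            # Substring match: a simplification of "has the word".
            cond = "title LIKE ?"
            params.append(f"%{c.value}%")
        else:
            raise ValueError(f"unknown constraint kind: {c.kind}")
        clauses.append(f"NOT ({cond})" if c.negated else cond)
    sql = "SELECT title FROM items WHERE " + " AND ".join(clauses)
    return sql + " ORDER BY title", params


def demo() -> List[str]:
    """Run the example phrase against a tiny in-memory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (type TEXT, artist TEXT, title TEXT)")
    conn.executemany(
        "INSERT INTO items VALUES (?, ?, ?)",
        [
            ("song", "John Lennon", "Love"),
            ("song", "John Lennon", "Imagine"),
            ("song", "The Beatles", "All You Need Is Love"),
            ("song", "Queen", "Bohemian Rhapsody"),
            ("document", "Queen", "Love letters"),
        ],
    )
    query = Query("song", [
        Constraint("artist", "John Lennon", negated=True),
        Constraint("title_word", "love"),
    ])
    sql, params = to_sql(query)
    return [row[0] for row in conn.execute(sql, params)]
```

The negation is the interesting bit: it attaches to the artist condition only, which is exactly what a search that just scavenges keywords out of the phrase would get wrong.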

What’s in the near future:

Currently libstorage,
the VFS module, and some translators directly access the postgresql
server. This is undesirable: it means permissions on a shared store
would have to be enforced using a collection of SQL views, it means
locking becomes very tricky, and it means that libstorage and other
things link directly against postgresql libraries (though this could
be addressed by gnome-db).
Support for NL
searches in select non-English languages (probably Spanish first,
but perhaps Japanese). Storage is built on a “language neutral
framework”, but grammar engineering is a very
difficult task. Some of the availability of NL searches will depend
on what the linguistics community produces and distributes freely.
A nifty collaborative application to provide a test bed for the
collaboration/locking framework. <sketch collaborative
whiteboard/wiki design> (also shows informal work) Ideas? 😉

Posted by seth

Seth Nickell
