Storage Talk
June 29, 2004
Unfortunately my ankle was fractured pretty badly and it was important I have
surgery on Wednesday. Unfortunately this precluded my flying to Norway for GUADEC
on Saturday. I actually proposed to my orthopaedic surgeon that I fly to Norway on Saturday.
He gave me a look that was darker than oil at midnight, and went back to what he was doing
without saying anything. Some people tell me I should have interpreted this as a “sounds ok”.
However, he later said some things about our goal being to “reduce the chance of having
arthritis in the ankle for the rest of your life”. That scared me into behaving.
There’s a more formalish storage paper for the occasion here. But honestly, I think the speaking notes are
more informative for getting at the soul of the material. In my experience that’s often true
of talks vs accompanying papers. So I’m including my speaking notes here. I blame oxycontin for any incoherent bits. They’re a little random but I hope you press through
because some of the good stuff is near the middle/end ;-). Maybe I’ll do sketches on whiteboards for all the places I was going to do live sketches and take pictures, but for now the notes are all booooring woooords. Unfortunately in many cases the sketches are the meat of the thing, but I think you can get some idea what I’m talking about from the text. I’ve fleshed it out past the notes in some places where it was totally incomprehensible:
Storage is designed to support a more general user experience than just “find files more easily”. Storage isn’t a silver bullet, but it can serve as a toolkit for making new user experiences easier to extend across the desktop. In the process it helps dissolve the application/desktop boundary a little.
The Experience
Intro:
Related to many existing systems:
- Wiki – anybody can edit or work with information. Information is not super formal to start with, but can become “formalized”. Unlike wiki, allow for rich in-place editing and better tie-in to the OS for noticing changes and tracking “change threads” (which are themselves often communication).
- Whiteboard – support quick informal live collaborations. Don’t force things into a particular “format” or medium but allow people to mix it up. Share a space with lots of presence information, etc. Also envision this working when people are in the same place.
- Groupware – handle objects that people need to deal with to get their job done: people, teams, projects, tasks, deadlines. These are more central to knowledge workers than even documents. Like groupware, track threads of communication, but don’t tie people down to text messages. Let them respond with people, projects, tasks, etc. Rather than “posting to lists” you just append items to a topic in the (or a) central store.
- Bugzilla – tasks, and schedules, process, status, owner, etc. Track more interesting metadata in a way that people can shape to their organization.
Build “objects people care about”
- This is more about what gets built on top of Storage, but it’s a major part of the overall experience. The file manager (atop the filesystem) is about managing formal documents, and folders that group documents in large concrete chunks. The <some name here> (atop storage) should focus on objects that fill people’s daily lives.
- People, Projects, Teams, Tasks, Messages, Topics, Discussions, Managers, Proposals, etc, etc, etc (and yes, Documents too) are objects people care about. Many others are specific to particular industries and job roles. Some of these objects currently live in specialized applications like Evolution, and most of these will still be handled primarily through a specialized interface. <sketch the two specialized interfaces>
- It’s usually a good idea to have specialized tools for targeting specific use cases.
- OTOH, although we work on text documents mostly in the office suite, we still expose common operations to the base OS (the file manager mostly in this case). How can we extend the set of useful things that can be done with information across the information boundary? In a less generic sense, can we build support for the objects people deal with on a day to day basis more deeply into the OS? It doesn’t have to be done by a universal component system, but base libraries like storage can make it easier to support the important “one off” optimizations in the base OS (such as for projects).
Support informal work
- Most office applications are focused on producing deliverables: formal documents. But deliverables are the exception. Most knowledge workers spend most of their time processing, sharing, and extending information, not producing deliverables. We want to build interfaces that allow for some degree of information soup. <sketch the process flow for organizing SubsByTheInch2005>
- Informal work can eventually turn into formal deliverables. Make this process as convenient as possible.
Information is information, don’t force large chunks
- We currently have odd granularities of information. “Files” in the case of “formal documents” (but since we don’t have informal constructs, many things are pushed into this).
Access items within large bodies of information
- The storage “research-y” solution to this is object reference using human language phrases.
- This aspect of storage still interests me, and has been where most of the work has gone until now… but it is more researchy because it is prone to being technically infeasible (jury is still out ;-). As such, other parts of storage are not predicated on it.
Provide the components for collaboration
- If storage is the physics, social interaction is the chemistry. Storage needs to provide some very basic structures that will give rise to social interactions (when people, environments, tasks, etc. are thrown into the mix). Rather than trying to control things rigidly, as traditional computer environments have done, we allow social behaviors to regulate things more (as things normally work outside the computer world).
- Presence information is the substrate for coordinating social interactions. Who is where and doing what is the most relevant context for social interactions.
- Access by multiple threads/computers/people. Rather than “versioning” documents, with the associated problems (e.g. merging is a nearly insoluble UI problem), we allow “live” (or at least effectively live) access to documents.
- Fine granularity. If we have access from multiple places, the temptation is to use locking of “documents”. Even inside formal documents, however, this will greatly limit collaborative ability. If we have rich fine grained presence information, combined with very fine grained data access, we can provide the ability to socially manage interactions rather than requiring “forced” lockouts.
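To make the “presence instead of locking” idea a bit more concrete, here is a minimal sketch in Python. Everything in it is made up for illustration (the `PresenceBoard` class, its method names, the `proposal/para-3` style item ids); it is not Storage’s API, just one way the idea could look. Presence is tracked per small item (say, a paragraph) rather than per document, and nobody is ever refused access: entering an item simply tells you who else is already there.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class PresenceBoard:
    """Hypothetical sketch: tracks who is working on which fine-grained item."""

    # item id (e.g. one paragraph of a document) -> set of user names
    editors: dict = field(default_factory=lambda: defaultdict(set))

    def enter(self, user, item_id):
        """Record that `user` started on `item_id`; return who else is there.

        Nobody is locked out. The caller just learns about the overlap
        and can sort it out socially.
        """
        others = sorted(self.editors[item_id] - {user})
        self.editors[item_id].add(user)
        return others

    def leave(self, user, item_id):
        """Record that `user` stopped working on `item_id`."""
        self.editors[item_id].discard(user)

    def who(self, item_id):
        """List everyone currently present on `item_id`."""
        return sorted(self.editors.get(item_id, ()))


board = PresenceBoard()
board.enter("alice", "proposal/para-3")
board.enter("bob", "proposal/para-7")  # same document, different paragraph
overlap = board.enter("carol", "proposal/para-3")
```

With document-level locking, bob and carol would both have been shut out by alice. Here bob never notices anyone, and carol is told alice is in the same paragraph and can decide what to do about it.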
Track information flow
- E-mail showed the importance of threads of communication between people. An e-mail thread morphs into a task (like a bug), which morphs into a few more tasks (which might have discussions associated with them), which turn into a full-fledged project with an associated team, which eventually produces a policy document. All this stays tied together. <show interface idea>
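The “all this stays tied together” part can be sketched as items that remember what they grew out of. This is only an illustration under assumptions: the `Item` class, the `derived_from` link and the example titles are invented here, and a single back-link is a simplification (a thread that fans out into several tasks would need more than one link per item).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Item:
    kind: str    # "email thread", "task", "project", ...
    title: str
    derived_from: Optional["Item"] = None  # hypothetical back-link


def trail(item):
    """Walk the back-links from an item to where it started, oldest first."""
    chain = []
    while item is not None:
        chain.append(item)
        item = item.derived_from
    return list(reversed(chain))


thread = Item("email thread", "Lunch order is always late")
task = Item("task", "Find a new caterer", derived_from=thread)
project = Item("project", "Catering overhaul", derived_from=task)
policy = Item("policy document", "Catering policy", derived_from=project)

kinds = [i.kind for i in trail(policy)]
```

Starting from the policy document, `trail` recovers the whole history back to the original e-mail thread, which is the kind of thing an interface could then show.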
A Brief History (aka excuse):
- Storage was initially implemented as project Gargamel, a senior project by a team of Stanford CS (and one EE, and yours truly) students: Brian Quistorf, James Farwell, Khalil Bey, Josh Radel. It gets to a nice demo-able point before they graduate.
- It gets even more finished as I work on it after graduation while not looking for work. Web page is written, screenshots made, etc.
- I foolishly decide to rewrite the NL parser (and lose the old CVS history when importing to cvs.gnome.org). I get sidetracked writing the NL parser.
- Slashdot etc hit. Lots of developer interest, but I’m snowed for other reasons and don’t successfully get development moving with other people. Plus I still have to finish the NL rewrite before things will function again.
- The summer is completely crazy, and I stop working on Storage for 8 months.
- Today: NL rewrite is now done. It’s a much stronger foundation, but the semantic grammar is still small. However, even with the small grammar it can do very sophisticated (correct) interpretations of phrases like “songs that aren’t by ‘John Lennon’ but have the word ‘love’ in them”. This would be very difficult to parse with a traditional “naive” scavenging search interpretation. Marco is also contributing to Storage, as well as some other Epiphany dudes. Things are starting to pick up, and I’m determined to not kill storage by bottlenecking again. I’m looking for a “project manager”.
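To see why the John Lennon phrase trips up a naive search, here is a small illustrative comparison in Python. The song list, both function names and the choice to look for “love” in the title are assumptions made for this sketch; neither function is Storage code. The naive version just scavenges for the interesting words, while the structured version is a hand-written stand-in for what a correct interpretation of the phrase has to do.

```python
songs = [
    {"title": "Love", "artist": "John Lennon"},
    {"title": "All You Need Is Love", "artist": "The Beatles"},
    {"title": "Imagine", "artist": "John Lennon"},
    {"title": "Love Me Tender", "artist": "Elvis Presley"},
    {"title": "Yesterday", "artist": "The Beatles"},
]


def keyword_search(items, words):
    """Naive approach: rank items by how many query words they mention."""
    def score(item):
        text = (item["title"] + " " + item["artist"]).lower()
        return sum(1 for w in words if w.lower() in text)

    ranked = sorted(items, key=score, reverse=True)
    return [i for i in ranked if score(i) > 0]


def structured_search(items):
    """Stand-in for the interpreted query:
    artist is NOT John Lennon AND the title contains 'love'."""
    return [
        i for i in items
        if i["artist"] != "John Lennon" and "love" in i["title"].lower()
    ]


naive = keyword_search(songs, ["John Lennon", "love"])
correct = structured_search(songs)
```

The naive search puts “Love” by John Lennon at the top, which is exactly the song the phrase rules out, because it has no notion of “aren’t”. The structured version returns only the two non-Lennon songs with “love” in the title.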
What’s there today:
Non-NL
- storage-store – manages the PostgreSQL server, handles notification
- libstorage – GObject interface to store items
- libstorage-translators – serializes / deserializes data streams from / to storage items
- GnomeVFS module – automatically invokes translators on read/write into the store, allowing existing GNOME apps to use the store like a normal filesystem
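As a rough picture of what a translator does, here is a toy one in Python for a made-up “Key: value” text format. The function names, the dict used as an item and the format itself are all assumptions for illustration; the real translators are part of libstorage-translators and are not shown here. The point is only the round trip: a byte or text stream comes in and becomes an item with attributes, and an item can be written back out as a stream, which is what lets a VFS layer present store items as ordinary files.

```python
import io


def stream_to_item(stream):
    """Deserialize: read 'Key: value' lines from a file-like stream into an item."""
    item = {}
    for line in stream:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        key, _, value = line.partition(":")
        item[key.strip().lower()] = value.strip()
    return item


def item_to_stream(item):
    """Serialize: write the item's attributes back out as 'Key: value' lines."""
    text = "".join(f"{key.capitalize()}: {value}\n" for key, value in item.items())
    return io.StringIO(text)


incoming = io.StringIO("Name: Ada Lovelace\nEmail: ada@example.org\n")
item = stream_to_item(incoming)
round_trip = item_to_stream(item).read()
```

An application that only knows how to read and write files sees the stream on both ends, while the store holds the structured item in between.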
NL
- PET – parses sentences into Head-driven Phrase Structure Grammar (HPSG) trees, by Dr. Ulrich Callmeier
- libmrs – interface to the Minimal Recursion ‘Semantics’ information in the HPSG tree
- libmrs-converters – translates MRS into a more meaningful XML statement using a client-chosen semantic grammar
- libstorage-nl – translates using a storage-specific semantic grammar into the intermediate XML form, and then to an SQL query
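The last step of that pipeline (intermediate form to SQL query) can be sketched in a few lines of Python. Big caveats: the dict below is a hypothetical stand-in for the intermediate XML, the `items` table and its columns are invented, `to_sql` is not a libstorage-nl function, and SQLite is used only so the example runs anywhere (Storage itself sits on PostgreSQL). The parsing stages (PET, libmrs, libmrs-converters) are skipped entirely; the interpretation of “songs that aren’t by ‘John Lennon’ but have the word ‘love’ in them” is written by hand.

```python
import sqlite3

# Hypothetical intermediate form, standing in for the XML statement.
interpretation = {
    "kind": "song",
    "constraints": [
        {"attribute": "artist", "op": "equals", "value": "John Lennon", "negated": True},
        {"attribute": "title", "op": "contains", "value": "love", "negated": False},
    ],
}

# Only known column names may be interpolated into the query text.
ALLOWED_ATTRIBUTES = {"artist", "title"}


def to_sql(interp):
    """Turn the intermediate form into a parameterized SQL query."""
    clauses = ["kind = ?"]
    params = [interp["kind"]]
    for c in interp["constraints"]:
        attr = c["attribute"]
        if attr not in ALLOWED_ATTRIBUTES:
            raise ValueError(f"unknown attribute: {attr}")
        if c["op"] == "equals":
            clause = f"{attr} = ?"
            params.append(c["value"])
        elif c["op"] == "contains":
            clause = f"{attr} LIKE ?"
            params.append(f"%{c['value']}%")
        else:
            raise ValueError(f"unknown operator: {c['op']}")
        if c["negated"]:
            clause = f"NOT ({clause})"
        clauses.append(clause)
    sql = "SELECT title FROM items WHERE " + " AND ".join(clauses) + " ORDER BY title"
    return sql, params


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE items (kind TEXT, title TEXT, artist TEXT)")
db.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [
        ("song", "Love", "John Lennon"),
        ("song", "All You Need Is Love", "The Beatles"),
        ("song", "Imagine", "John Lennon"),
        ("song", "Love Me Tender", "Elvis Presley"),
        ("document", "Love letter template", ""),
    ],
)

sql, params = to_sql(interpretation)
titles = [row[0] for row in db.execute(sql, params)]
```

Keeping the negation as a flag on each constraint is what lets “aren’t by” survive all the way into the `NOT (...)` clause instead of being dropped as a stop word.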
What’s in the near future:
- Currently libstorage, the VFS module, and some translators directly access the PostgreSQL server. This is undesirable: it means permissions on a shared store would have to be enforced using a collection of SQL views, it means locking becomes very tricky, and it means that libstorage and other things link directly against PostgreSQL libraries (though this could be addressed by gnome-db).
- Support for NL searches in select non-English languages (probably Spanish first, but perhaps Japanese). Storage is built on a “language neutral framework”, but grammar engineering is a very difficult task. Some of the availability of NL searches will depend on what the linguistics community produces and distributes freely.
- A nifty collaborative application to provide a test bed for the collaboration/locking framework. <sketch collaborative whiteboard/wiki design> (also shows informal work) Ideas? 😉
<demo NL search interface>
<show NL slides and explain basic NL
process>