Announcing Composite Widget Templates

Hello fellow hackers, today we bring you a new feature which I believe can very much improve the GTK+/GNOME developer story.

This is a feature I’ve been planning for a long time (I originally blogged about it 3 years ago) so I’m very excited about it having finally landed in GTK+, it’s my hope and ambition that this feature will help shape the future of user interface programming with GTK+.

Before I continue, I have to thank Juan Pablo Ugarte for keeping the dream alive and talking about this at the last GUADEC. Also recognition must be given to Openismus GmbH for sponsoring my full time work on this for the last few weeks, the time to completely focus on this task would not have been afforded me without them.

Unfortunately this post will be a little terse, the one I had planned which is a bit more relaxed and has some comic relief will not be ready on time. So, at the risk of being taken seriously, let’s continue with a brief overview of the APIs introduced to GTK+ and the actions taken.

What are Composite Widget Templates ?

Composite Widget Templates are an association of GtkWidget class data with GtkBuilder xml, which is to say that the xml which defines a composite widget is now a part of the definition of a widget class or type.

This feature automates the creation of composite widgets without the need for directly accessing the GtkBuilder APIs and comes with a few features that help to bind a GtkWidget with it’s GtkBuilder xml.

As of yesterday, 23 composite widget classes in GTK+, from simple classes such as GtkFontButton or GtkVolumeButton to more complex widget classes such as GtkFileChooserDefault and GtkPrintUnixDialog have all been ported to remove all manual user interface creation code, in favour of GtkBuilder xml.

So, how can I use it ?

There are three or four new APIs added to GtkWidget which play on the class data, currently they will only be available in C, but if you have a little imagination, you can see how this can be very useful in higher level languages, by extending the syntax and adding some keywords (hint: I have vala in mind as a top candidate).

Before I go into the API details here, I would like to point out a complete working example which I created today while writing this post. To give it a try, you need of course GTK+ master from today (or yesterday). For those who are interested I suggest you download that small tarball, and build it with one simple ‘make’ command.

First, lets start with an example of how to bind your template to your widget class:

static void
my_widget_class_init (MyWidgetClass *klass)
{
  GtkWidgetClass *widget_class = GTK_WIDGET_CLASS (klass);

  /* Setup the template GtkBuilder xml for this class
   */
  gtk_widget_class_set_template_from_resource (widget_class, "/org/foo/my/mywidget.ui");
}

static void
my_widget_init (MyWidget *widget)
{
  /* Initialize the template for this instance */
  gtk_widget_init_template (GTK_WIDGET (widget));
}

So, to bind some GtkBuilder XML to a widget class, we need to call two functions:

  • gtk_widget_class_set_template_from_resource() binds some GtkBuilder XML to the class data
  • gtk_widget_init_template() initializes the template for a given instance, this is currently needed for the base C apis, but both could certainly be automated in a highlevel language.

Next, we have a function which creates an implicit relationship between some instance variables and some objects defined in the GtkBuilder XML:

struct _MyWidgetPrivate
{
  /* This is the entry defined in the GtkBuilder xml */
  GtkWidget *entry;
};

static void
my_widget_class_init (MyWidgetClass *klass)
{
  GtkWidgetClass *widget_class = GTK_WIDGET_CLASS (klass);
  GObjectClass *gobject_class = G_OBJECT_CLASS (klass);

  /* After having called gtk_widget_class_set_template_from_resource(), we can
   * define the relationship of the private entry and the entry defined in the xml.
   */
  gtk_widget_class_bind_child (widget_class, MyWidgetPrivate, entry);

  g_type_class_add_private (gobject_class, sizeof (MyWidgetPrivate));
}

In the above code, we’ve defined a relationship between the MyWidgetPrivate pointer named ‘entry’ and the object in the GtkBuilder XML of the same name ‘entry’. The entry will be available for access on the subclassed GtkWidget instance private data at any time after gtk_widget_init_template() was called, until the given widget is disposed (at which time the pointer will become NULL). GTK+ takes care of memory managing such automated pointers, so it is ensured to exist for the lifetime of your instances.

Again, with highlevel bindings in mind, this could be implemented as some syntactic sugar in the actual declaration of the instance variable.

Finally there is one more point of interest in the API which is Callbacks. Functions in your widget class code can be specified as Callbacks which serve as endpoints for any signal connections defined in the GtkBuilder XML.

/* A callback handling a "clicked" event from a button defined in the GtkBuilder XML */
static void
my_widget_button_clicked (MyWidget  *widget,
                          GtkButton *button)
{
  g_print ("The button was clicked with entry text: %s\n",
           gtk_entry_get_text (GTK_ENTRY (widget->priv->entry)));
}

static void
my_widget_class_init (MyWidgetClass *klass)
{
  GtkWidgetClass *widget_class = GTK_WIDGET_CLASS (klass);

  /* After having called gtk_widget_class_set_template_from_resource(), we can
   * declare callback ports that this widget class exposes, to bind with <signal>
   * connections defined in the GtkBuilder xml
   */
  gtk_widget_class_bind_callback (widget_class, my_widget_button_clicked);
}

Note that all signal connections defined in composite templates have the composite widget instance as user data by default.

In the above example code, my_widget_button_clicked() callback was declared with the assumption that the <signal> connection defined in the template was declared as ‘swapped’. Swapped signal connections are connections where the user data of the callback is returned first instead of the emitter. I think that this should be the default for composite widget callbacks as it blends in more naturally with normal class methods (where the class instance is always the first parameter).

This detail might not apply directly to higher level languages, which could achieve the above by adding some syntactic sugar in the declaration of a Callback method. Perhaps the instance will be implied as the ‘self’ variable.

I have also included some additional API to allow bindings to specify the GtkBuilderConnectFunc which should be used to make signal connections for a given widget class. I hope that bindings authors will contact me if they need any additional support in the GTK+ api to implement this.

Can I now use Glade to define my Composite Widget Templates ?

Of course silly ! That’s the whole point right ?

You’ll need Glade master from today as well, however I should be rolling a development snapshot with full support for this later this week as well.

All of GTK+’s composite widget classes have been recreated using Glade. Here is a screenshot of a GTK+ composite widget being edited in Glade:

Glade editing the GtkFileChooserDefault
Glade editing the GtkFileChooserDefault

Glade is still in it’s early stages supporting this, so there is hardly any features added here. I hope to work towards a brighter future where Glade can understand a multitude of composite widget templates as components of a single project, which will open the doors for some really nice and useful features.

Conclusion

All in all I have a lot to say about this work, but I’ll cut this blog post short for now, however I may be posting followups in the near future.

I’m very satisfied with this work, and I hope you will enjoy creating user interfaces as composite widget classes.

Where did my memory go ? – A detective story

Here, is yet another follow up post on EDS memory consumption. For the last few days I’ve been tracking where memory is spent in EDS and our benchmarking tools, and it was a very interesting experience.

And I’m not just saying that ! it was very trying and it’s still a bit of an unsolved mystery to me (so please feel free to step in with your theories on the unsolved parts !).

It all started when Michael asked me to explain the funny spikes in the memory usage graph presented in the previous post. The first thing I did was to produce a more “bumpy” graph by disabling the slice allocator, yielding what is in some ways a more accurate account of actual memory usage:

Memory usage measured for 12,800 contacts with G_SLICE=always-malloc
Memory usage measured for 12,800 contacts with G_SLICE=always-malloc

Interestingly, I say “in some ways” above because; one of the elements that we have to consider is memory fragmentation; memory management is generally more optimal and less fragmented when the slice allocator is active.

What we are looking at above is a left to right graph of overall memory usage; measured after each and every operation that we run on the addressbook. Each “dot” can be associated to one of the various latency tests that we run for each and every build of EDS (indicated in the legend).

First of all let’s demystify the “curious humps” which occur mostly to the “Custom Light” (light blue) benchmarks but are also noticeable in other benchmarks. These “humps” occur for four dots at a time, particularly when performing suffix searches on contact fields that are not stored in the summary SQLite tables for quick searches.

This phenomenon is partly attributable to the fact that all contacts in the addressbook need to be individually examined (and the vcards individually parsed) when the given contact field is not stored in the SQLite tables individually (or what we refer to in EDS terms as “the summary”). I’m not really very concerned by these “spikes”; obviously the memory is reclaimed later on, however it is curious that this happens specifically for suffix matching and not for prefix matching (presumably lot’s of extra string duplications and normalizations are needed for the case insensitive suffix matching routines).

Now that that’s out of the way, it leads us to some of the…

More interesting parts

I was at first not satisfied with only this explanation, sure, it kindof explains the “funny humps” in the benchmark progress but… by taking a closer look at what else is actually happening… I needed a better explanation.

The portions of the presented memory usage graphs that interest me more are the memory growth observable over the course of the first four dots, as well as the curious memory growth that also occurs at the very end of the benchmarks.

So what is happening in these stages ?

First of all, it’s positive news to know that the number of automatically generated vcards used for testing are already in memory before the benchmarks start at all, in the above graph that represents 12,800 vcards all in memory before the first benchmark is measured. And then…

  1. The addressbook is initialized and created, so at the point of measuring the very first dot, we have 12,800 vcards in memory and an initialized EBookClient on the client side and an addressbook counterpart (SQLite database created and active SQLite connection) in the server side memory
  2. Next, at the second dot we’ve created 12,800 EContact objects in memory… the 12,800 EContacts and 12,800 vcard strings remain in memory throughout the benchmark progress. This second dot is about 45MB higher on the scale than the first dot, so it’s pretty safe to say that 12,800 EContact objects cost roughly 45MB of resident memory which will not be reclaimed for the duration of the benchmark progress.
  3. The third dot is measured directly after adding all the contacts to the addressbook, here we start to see some divergence in memory usage; notice that this costs roughly 25MB extra for EBookClient based benchmarks, but only about 5MB for EBook based benchmarks. Being a bit naive, I overlooked this detail at the beginning of the investigation… one of the notable differences in the EBook apis is that it was lacking in batch commands. So the major difference here is that EBook tests add contacts one by one over D-Bus, while the EBookClient tests add contacts in batches of 3200 contacts at a time.
  4. This fourth dot, is after fetching all contacts at once from the addressbook. Here is where I became seriously alarmed. For normal clients, this shows an approximate 30MB growth in memory consumption. So where did my memory go ? A simple case of amnesia ?! Note though, that the Direct Read Access (red) benchmark hardly increases in memory for a fetch of 12,800 contacts, good show.

Naturally, feeling embarrassed about the consequences of the evil fourth dot… I frantically started my search for memory leaks… first I blamed the obscure nature of C++ code and it’s attempts to hide memory management behind smart pointers…I tried to pin it as a memory leak in the actual benchmarking code (after all, I did just lose 30MB of memory… it must have gone somewhere… right ?)… but after some tracing around, I found that those returned contacts, stored by smart pointers or not, were properly finalized and freed, leaving me with this uncomfortable mystery still on my hands.

While most of my memory leak hunt revolved around explaining the 30MB memory overhead incurred from dot 3 to dot 4, I should mention that the last memory jump was also suspicious. This last memory jump (which seems to vary between a 10MB to 25MB increase depending on the benchmark type) is incurred by deleting all contacts in the addressbook. So how about that ? I’ve just deleted all the contacts, and now I’m using MORE memory than before ?

The following day…

… I ran the benchmarks in loops, for some I’ll share below because this is how I eventually solved the mystery case, I also ran the benchmarks (server and client) under valgrind, ran some various test cases with the server and test cases running under valgrind. But the alleged memory leak was not to be tracked. Some testing of the benchmarks running in a loop seemed to indicate that there was some memory growth over time, not very much so, but enough to make me believe there must be some leak and be determined to find out.

Finally, today…

… I let my laptop chug along and loop the benchmarks (at least some of them) with a huge 12,800 contact count (that takes time), so let’s share those enlightening results here:

Memory usage while benchmarking 12,800 contacts in the first iteration
Memory usage benchmarking 12,800 contacts in the second iteration
Memory usage benchmarking 12,800 contacts in the second iteration
Memory usage benchmarking 12,800 contacts in the third iteration
Memory usage benchmarking 12,800 contacts in the third iteration

These results would be better viewed from left to right instead of one on top of the other, but you get the idea. Just consider that the last dot in the first chart happens directly before the first dot in the following chart, and so on.

So, after viewing this data… we can see that in the second and third graph, memory we presumed to be lost, is eventually returned to the system (in other words, it was indeed only a case of temporary amnesia, and not a more severe degrading case of alzheimer’s)… This is very reassuring, numerous runs with valgrind also show no real evidence of memory leakage, which is also reassuring evidence that our EDS is leak free.

But, that still doesn’t really explain…

Where is that memory actually going ?

At this point I can only give you my best guess, but all of the clues seem to point towards D-Bus traffic:

  • At the second “dot” where contacts are added to the addressbook, EBook APIs adding only a single contact at a time seems to cost much much less than using EBookClient apis and adding the contacts in batches of 3200 contacts at a time.
  • At the third “dot” where a brute “fetch all contacts” call is made to the addressbook, we can see a huge increase in memory consumption all except for when using Direct Read Access mode. So when fetching a list of 12,800 contacts not using D-Bus, we don’t suffer from memory loss.
  • In the last suspicious “dot”, where we delete all contacts from the addresssbook at once, all benchmark types seem to suffer significant memory loss. In this case the client is sending a list of 12,800 contact UIDs over D-Bus to the addressbook (in Direct Read Access as well, since deleting contacts is a write operation).

My best guess ? this is all due to zero-copy IPC transfers implemented by D-Bus.

In other words (if you’ve read up to this point you probably don’t need any explanation), instead of the sender writing chunks of data to a socket, and the receiver reading bytes from a socket; the sender is owning some shared memory which is accessed directly by the receiver.

This shared memory is probably managed by the D-Bus daemon itself, so it would make sense that the daemon not release the shared memory straight away but instead reserve some head room in the case that further transfers might reuse that memory.

So how come the fourth dot where a batch of 12,800 vcards are passed to the client, is not reused by the last dot where all contacts are deleted ? … Because, when contacts are fetched the shared memory owner would have to be the sender, which is the addressbook server. However when contacts are deleted, it is the EBookClient user process which sends a list of 12,800 UIDs, in this case the owner of the shared memory should be the other, client process.

I’ll probably need to pursue some extra verifications to be sure, but this best guess is very compelling to me at this time.

In conclusion, this was a really interesting exercise, which I don’t hope to repeat very often… but I did learn a few things and it did put some things into perspective. First and foremost; measuring memory usage, when compared to just tracking and plugging leaks, is quite another story… a lot more tricky and probably not an exact science.

If you’ve got this far, I hope you’ve enjoyed this detective story… I did enjoy it.

Amendments

It’s probably bad form but I’ll just add this here, my theory is obviously false. As I’ve been informed (already) that D-Bus does not implement any such zero-copy mechanisms with shared memory… so there is still a huge memory fluxuation, definitely related to D-Bus usage, which I can’t readily explain.

 

An exercise in memory consumption analysis

Hi again.

This is a follow up on my recent post on features and improvements to the Evolution Data Server that we’ve been working on at Openismus. Note that the previous post explains what we’ve done in greater detail, some of this post might not make sense without reading the aforementioned post.

As I was asked to write a more complete report on how each of our patch sets effect memory consumption in EDS, I went ahead and ran some further comparisons. As usual, Mathias’ benchmarks saved the day (while the original benchmark suite only generates memory consumption comparisons for a single run of contacts, I was easily able to produce charts for each individual run and compare them separately).

Actually I had postponed this post since I was hoping to update our final patch set for Direct Read Access apis before reporting my findings. It seems however that currently EDS master is in a period of transition and so I’ll postpone the new patch submissions until some temporary regressions in EDS are fixed (the code which does work with EDS master is however available on the branch).

Memory Usage Report

In order to get a grasp of the impacts on memory consumption that each patch set incurs, I’ve added two additional benchmarks to our normal set of benchmarks.

No BDB

This is a custom build of EDS gnome-3-6 branch with the removal of the BDB usage in the local file backend.

At this point there is no extra table in the SQLite to handle multi-valued vCard attributes, it’s simply a comparison of storing the vCard data in the BDB vs SQLite only.

Custom Light

This is a special run of our regular openismus-work branch, but with only the “Full Name” configured and indexed in the summary.

So this benchmark is a light-weight summary with considerably less columns (and one less table) used in the SQLite.

I ran this variation in the suspicion that SQLite might require significantly more memory with the additional multi-value table created to handle multi-valued attributes such as E_CONTACT_TEL.

Benchmark Results

Note that the RSS and VMS memory snapshots are taken by way of observing the /proc/$pid/status file for both the EDS server process and client benchmark process directly after stopping the clock for each benchmark in the suite. So a given value in the charts presented below is based on the “VmRSS” value of the server process added to the “VmRSS” value of the client process.

First, let’s show the results, or at least some of them, to put our deductions into context:

50 Contacts
50 Contacts

… Skipping a few results here in the interest of avoiding clutter … lets jump directly to 400 contacts …

memory-usage-rss-00400
400 Contacts
800 Contacts
800 Contacts
1600 Contacts
1600 Contacts
3200 Contacts
3200 Contacts
6400 Contacts
6400 Contacts
12800 Contacts
12800 Contacts

And now, some of the conclusions I came to while observing the results

BDB Removal

When compared to the unmodified EDS 3.6 branch, we can observe that the BDB removal reduces memory consumption for most reasonably sized address books. Up until we run the benchmark for 3,200 contacts, memory consumption is less without BDB… with 3,200 contacts and higher, memory consumption is increased by removing the BDB.

Without an in depth understanding of SQLite internals, I think we can deduce that the SQLite starts to require more memory to handle databases with >= 3,200 rows

Custom Light

This benchmark basically disproved my suspicion.

While using exactly the same code-base as the “EDS Custom” and “EDS Custom DRA” benchmarks; Using more indexes and tables in the SQLite does not seem to incur much of a difference in terms of memory consumption.

While the output is certainly different, as specially with large addressbooks, I don’t see much of a noticeable pattern here.

EDS Custom

This benchmark is basically the openismus-work branch with fully customized indexes for better performance in telephone number lookups.

When comparing this one to the unmodified EDS 3.6 benchmark, we can observe that memory consumption is slightly less using the custom EDS code than stock EDS 3.6.

When comparing this to the removal of BDB, we can notice that, as specially for small addressbooks, the base memory requirement of the EDS Custom is significantly higher than with only the BDB removal.

This second point is easily explainable, since removal of BDB alone reduces the overall memory footprint of EDS. The custom EDS benchmarks, without actually leveraging the Direct Read Access mode still links against the EDataBook library. Essentially this replaces the memory footprint overhead incurred by linking to BDB with a different overhead incurred by linking directly with EDataBook.

EDS Custom DRA

This benchmark is particularly interesting.

For smaller addressbooks the Direct Read Access mode indeed costs more resident memory than any other benchmark. This can be attributed at least partly to the penalty of loading an EDataBook into memory on the client side. Consequently, loading the EDataBook also loads the backend module in the client process, meaning we also have a running EBookBackendFile in the client as well as client side linkage and usage of the SQLite library.

However, once we approach addressbooks with 1600 contacts and more, the overall resident memory consumption starts to even out. Direct Read Access mode actually costs significantly less than any other benchmark for addressbooks as large as 6400 contacts and more.

These results are a bit harder to explain. My theory is that since the EDS server process essentially goes to sleep after adding the initial contacts. All queries thereafter require no interaction with the EDS server process.

Some things to consider here are:

  •  The cost in memory of constantly waking up the EDS process to handle a query
  • The cost of server side heap allocations used to deliver the results over D-Bus
  • The cost of client side heap allocations used to receive results over D-Bus

Overall Memory Consumption differences

In summary, we can conclude that after all measures taken to improve performance of contact fetches in EDS; the Direct Read Access mode is the single element which makes a tradeoff in terms of memory consumption versus speed.

Without the Direct Read Access patches, memory consumption as well as time to fetch contacts has seen a net improvement. With Direct Read Access enabled we see that for smaller address books an additional memory overhead is required, while with larger addressbooks (larger than 3,200 contacts); overall resident memory usage has seen a significant improvement as well.

 

Meritocracy

I’m posting this here, while I would have replied to Taryn Fox’s blog but couldn’t do it without subscribing to something….

(I’m throwing away all of the text I wrote yesterday and starting over, I’ll instead try to write something shorter).

First and foremost, please remember that GNOME projects are indeed mostly volunteer driven, except for a few projects in GNOME which may be dominated at times by developers all working at a given company (and in those corner cases, the meritocracy approach may not apply as strictly).

In most cases, the maintainer is the only one that actually cares about the given project enough to weather the storm. Example, if I had not been so determined to make something out of Glade for a number of years in my spare time… believe me that the project would have died, in the same way that if Juan Pablo did not take care of Glade these last couple years, nobody else would have taken charge for the long term. I know this because I see the flood of contributors who come and go, the ones who stay the course and show dedication are few and far between. It’s only fair that we afford a special level of trust to those who work hard and stay the course.

Yes there are things that can be improved, hopefully we can all take criticism and try not to hurt people’s feelings etc etc, but please consider the cruel alternatives to meritocracy.

The alternative to meritocracy as I see it are those “Pay to get in Boys Clubs”, what I mean by “Boys Club” is you know… those people who’s daddy was rich or knew the right people, and so were able to go to the most reputable universities and have all the opportunities that others did not. Now let me stress that not all members of these clubs have an arrogant sense of self entitlement, however sadly some of them do in my experience, also most corporate human resource departments are unconditionally biased to hire only people who hold some kind of university degree (or even, those who hold a degree from a first world country).

Meritocracy helps us to level the playing field, it gives a chance to those of us who grew up in a cardboard box or in a third world country, to prove that they can indeed make just as worthy contributions as those of us who attended one of these rich kid clubs/universities and also get the same recognition, provided they at least did their homework (whether the walls of that home were made of brick, wood, or only cardboard).

This is something worth fighting for, worth protecting.

Getting your contacts… Right Now!

Hi all, hope you’ve spent a pleasant holiday season.

As promised, here is another post describing what new tricks we’ve been teaching EDS (Evolution Data Server) this year at Openismus.

Before I go through all the details, a little context is in order. Last year Mathias created a nifty benchmark tool for EDS allowing us to track performance improvements and regressions of the Evolution Data Server across releases and branches. Mathias, with his prior experience and knowledge of EDS was able to make some educated guesses on where we could save some milliseconds, all in the interest of providing an EDS that is stable/reliable in terms of performance and also useful in a variety of platforms and scenarios (not only as the backend of the Evolution Mail client on Desktops).

We’ve come a long way on this, so first let me describe the major changes that we’ve made and then I’ll move on to show you the results.

Removing Berkeley DB

Historically, Evolution’s local addressbook used Berkeley DB to store VCards for all contacts. Over time some optimizations were made, originally an in-memory “summary” was maintained holding some of the contact data in order to speed up queries to the addressbook. This was eventually replaced with an SQLite implementation of the quick search “summary” data. In the end that left us with a two step query for fetching contacts from the addressbook; an initial query to the SQLite to find any UID which match the query terms and then another query to the BDB fetching the actual VCard data for that contact.

Removing the BDB implementation and storing all contact data in the SQLite instead naturally makes queries faster, not to mention there is considerably less flash wear as we only have one DB persisting contact data now instead of two. Additionally we’ve also observed that the old BDB code fails (crashes, even) with an out of memory condition in some cases such as deleting more than 6400 contacts at once, this is all handled much cleaner using SQLite exclusively.

This has landed in EDS’s git master a couple of months ago and should be available in the next release.

Configurable Summary Fields

Summary Fields in EDS refer to the VCard fields for a given addressbook which should be elected for fast results. They are stored separately in an SQLite table so that contacts can be queried without parsing the complete contact VCard data for every contact.

This list has always been hard coded and tailored to the needs of the Evolution Mail client (the list basically consisted of the contact name fields plus a hand full of email fields which Evolution is accustomed to using). This would of course be appropriate for an email client but falls short for applications that have different needs such as hand phones, which require extremely fast results for queries by phone number.

So we’ve now introduced a set of APIs which allow configuration of the summary fields of a given addressbook at addressbook creation time. This allows us to choose which fields are stored in the summary and which of those fields should be indexed for extra fast retrieval.

As a side effect of this, we now also support multi valued VCard attributes to be stored in the summary (i.e. list of emails or list of addresses).

This has also landed in EDS master some time ago and will be available in the next EDS stable release.

Direct Read Access

One of the more intrusive changes is the Direct Read Access mode. Mathias foresaw that we would gain significant performance simply by delivering query results directly to the client instead of squishing them into the socket and pushing them arduously through the D-Bus byte by byte (or probably 8192 bytes at a time…). I have to admit that I was a little sceptical about this change but after benchmarking the direct read access approach I was able to notice a serious performance gain.

Our fastest queries using the previously described configurable summary fields return in roughly 4-7 milliseconds.

The same queries in Direct Read Access mode are quite consistently 0.2 or 0.3 milliseconds.

In other words; for the simplest queries where the EDS server can fetch the results very fast, we waste the grand majority of our time serializing/deserializing VCard data and tinkering on the D-Bus socket.

This has not yet landed in EDS master, so I’ll keep you posted 😉

How fast can I get my contacts ?

In conclusion, let’s go over the results of our benchmarks and compare.

First a few details regarding the results we’re looking at:

  • EDS 3.6 – This is stable EDS 3.6.2 without any of the above modifications, it’s important to note that this version still uses Berkeley DB as well as SQLite to store contacts. Furthermore, the stock 3.6.2 does not take advantage of SQLite indexes.
  • Custom – Built from our EDS 3.6 based work branch (called ‘openismus-work’) this build has Berkeley DB removed and configures the summary with a custom summary configuration.
  • Custom DRA – Built from our EDS 3.6 based work branch; this build additionally enables Direct Read Access, using the same configuration for summary fields as the ‘Custom’ build uses.

For both ‘Custom’ and ‘Custom DRA’, the customized summary is configured as follows:

  • Full Name: In the summary and indexed for prefix searches
  • Given Name: In the summary and indexed for prefix searches
  • Family Name: In the summary and indexed for prefix searches
  • Telephone Number: In the summary and indexed for prefix & suffix searches
Query for exact match of the full name attribute

In EDS 3.6 stable, the full name attribute is indeed stored in the SQLite summary. Notice that for small addressbooks (less than 200 contacts) the results are similar to EDS Custom. However with the customized summary fields we’ve also ensured that the SQLite indexes are getting used properly; this is what ensures the performance doesn’t degrade too much with larger addressbooks.

The red (DRA) line is EDS doing the same thing but avoiding the arduous tinkering with D-Bus messaging.

Phone Numbers

Query for prefix match of phone number

Since we’ve configured EDS to optimize the phone number field of a given contact for prefix & suffix searches, we can now use phone number queries at reliable speed. In other words you can again use EDS on your hand phone to implement your contacts database and kittens wont be sacrificed.

The reason why this takes an extremely long time with EDS 3.6 is that the contacts have no quick search information ready at hand, this means we must iterate through all of the contacts in the Berkeley DB and parse the VCards for each, extract all of the phone numbers and compare one by one.

Since we optimized for suffix searches, we get similar results for a suffix search:

Query for suffix match of phone number

Memory Usage

We also have in place some monitoring of memory usage:

Virtual Memory Usage

 

Resident Memory Usage

These are basically just memory usage snapshots taken over the course that the benchmarks run.

The “Custom” setup takes slightly more memory to run than EDS 3.6, this is presumably because we maintain more indexes with the SQLite version. Unsurprisingly, the Direct Read Access mode takes significantly more memory; this is because the direct read access mode uses two SQLite connections (one for the server and one for the client to make direct read calls).

This concludes this Tuesday’s episode of “Getting your contacts Right Now!”

Stay Tuned 😉

Isolated unit testing of D-Bus services

Hi all, long time no blog…

At Openismus these past months we’ve been making a series of improvements for the Evolution Data Server, again.

As it goes with patch reviews, better to take on one at a time, so this blog post will focus on isolated unit testing of your D-Bus services.

If you work on the implementation of D-Bus services you’ll be happy to know about the new GTestDBus object introduced in GIO since 2.34. Thanks to this nifty new object we can now perform unit tests on D-Bus services in a completely isolated fashion.

Using GTestDBus to implement a new test fixture for Evolution Data Server (EDS) now makes it possible to run ‘make check’ for EDS without running the actual EDS servers manually, and without even installing the EDS software into the target prefix before hand. With a little more work, we can even get regular distchecks passing for EDS again.

Here’s how:

  • First of all, you need to have an in-tree directory holding your D-Bus .service description files. Typically you already have the .service.in files stored somewhere in tree to be installed somewhere into the install prefix, but you need a separate one (I think it’s good practice to add a subdirectory to your ‘tests/’ directory, so it becomes ‘$(top_builddir)/tests/services’)
  • The contents of your separate .service.in files for testing should point to the in-tree location of your D-Bus service, typically it would look like this:
    [D-BUS Service]
    Name=org.gnome.MyProject.MyServiceName
    Exec=@abs_top_builddir@/src/my-server-name

    This will define a service which explicitly activates a server which you’ve built in your source tree.

  • Then, you just need to provide that in-tree service directory to g_test_dbus_add_service_dir(), typically you would put the following into your tests/Makefile.am:
    -DTEST_SERVICE_DIRECTORY=\""$(abs_top_builddir)/tests/services"\"
  • Finally you just need a basic test fixture, with this test fixture you can ensure that your temporary D-Bus session along with your test service is recreated for every test in the suite:
    typedef struct {
      GTestDBus *dbus;
      MyProxyType *proxy;
    } TestFixture;
    
    static void
    fixture_setup (TestFixture *fixture, gconstpointer unused)
    {
      /* Create a private dbus-daemon for this test */
      fixture->dbus = g_test_dbus_new (G_TEST_DBUS_NONE);
    
      /* Add the private directory with our in-tree service files */
      g_test_dbus_add_service_dir (fixture->dbus, TEST_SERVICE_DIRECTORY);
    
      /* Start the private D-Bus daemon */
      g_test_dbus_up (fixture->dbus);
    
      /* Get a proxy which we will be using in test cases, typically
       * this is some API generated by gdbus-codegen
       */
      fixture->proxy = my_generated_code_new_for_bus_sync (...);
    }
    
    static void
    fixture_teardown (TestFixture *fixture, gconstpointer unused)
    {
      /* Destroy the proxy */
      g_object_unref (fixture->proxy);
    
      /* Stop the private D-Bus daemon */
      g_test_dbus_down (fixture->dbus);
      g_object_unref (fixture->dbus);
    }

With this basic type of fixture, you can produce a series of tests using g_test_add() in the normal way, all testing various behaviours of fixture->proxy.

Of course, perfect isolation of your service’s test cases may need more efforts than just the above, for instance in the EDS test cases we create and delete a data directory between each test and setup the XDG_CONFIG_HOME, XDG_DATA_HOME and XDG_CACHE_HOME enviornment variables to point to that directory (avoiding any interaction with the user’s addressbook & calendar). We also compile a gsettings schema in the local data directory and setup GSETTINGS_SCHEMA_DIR to point to the in-tree directory (avoiding any need to have the EDS settings schemas already installed).

This morning before writing this post I also provided a simple patch to GIO adding an example of this testing paradigm.

 

Glade @ GUADEC

Hi everyone,

Long time no blog. I’ve been meaning to blog and build hype around this but as I’ve been busy with so many things it just hasnt come out.

Well the first thing so say is, please be interested to click this link and visit Juan Pablo’s blog. He is speaking first thing on the first day of GUADEC on the topic of embedding GtkBuilder script natively into GtkContainer derived widgets. Some may remember some of my ancient blog posts on the same topic, I never found time to complete the patches in the composite-containers branch but Juan Pablo has picked up the work with a fury and is going to explain in more details in his talk.

In a last minute decision, as the dates are right, I also decided to drop in too. With all the work Juan has already done, a little consensus and participation hopefully we can finally pull off this great feature.

See you there 😉

 

A long overdue blog

I haven’t had the time to blog about the things I wanted to this summer, unfortunately I’m a couple days behind in the project I’m working on now so I’ll have to try to make this brief.

First ever GNOME summit

This took place in Montreal several weeks ago, it’s definitely a late blog post for this but I really wanted express my gratitude.

I did not take any pictures, however I did force some time into my schedule to push out a release of Glade (you could easily say that the latest stable releases of Glade were brought to you in a large part by the GNOME summit).

All in all I just wanted to voice my appreciation for getting the opportunity to shake hands with some of the people I’ve admired over the years, after arguing countless topics with many of the same people on  gtk-devel-list and desktop-devel-list over the last decade, it’s really amazing to get to meet some of these people in real life.

I do wish I had blogged this earlier, and I sincerely apologize for not having been a better host (as Montreal is my home town), it was hard enough to push the summit (and Glade release) into my schedule at all (was more of a great pleasant surprise that the summit actually came to me).

A summer of evolution-data-server

This summer at Openismus we’ve been making some enhancements to the Evolution Data Server as a part of Intel’s effort on the Meego platform.

I haven’t been blogging about this work, generally because I did not feel like there was something to “show off” about, we haven’t invented anything new, however  since yesterday we’ve landed the final patch so I’ll just give a rundown of which patches I was tasked to work on.

Bug 652178: Store PHOTO data as plain files

This is probably the most complex of the patch set, Evolution Data Server’s addressbook api allows storing of image data either as binary encoded blobs or simply as a URI. This patch basically enforces a policy where the local backend of EDS converts any incoming binary data into URIs on disk managed by the addressbook backend.

Bug 652175 and Bug 652177:

These patches add a backend property to the calendar and addressbook, the value of the property is guaranteed to remain the same so long as no data has changed for that backend, whenever data changes in the backend then the overall revision is bumped (this allows tools like SyncEvolution to abort when no data has changed without iterating over the whole database).

Bug 652171 and Bug 652180:

These patches implement an api which already existed but had remained unimplemented. The api allows one to filter the reported results of a calendar or addressbook view to only report some of the desired information (this way if you only want contact names and UIDs for instance, you dont have to transfer full vCards from the EDS just to get them).

All of these patches have landed in Evolution Data Server and should be available in the next (3.4) release.

Here and Now

Only a few days ago I landed back in Seoul, South Korea where I expect to be spending the greater part of the coming year, right now I’m in a guest house and hacking in the basement, it’s a nice quaint little atmosphere that makes you feel like you’re doing some kind of crazy science experiments in grandma’s basement again.

For the next few months I’ll be devoting much of my time to a fun (commercial software) project which is to write the new up and coming Karaoke Application for TouchTunes.

While I can’t directly devote any of the company time to GNOME, I can always find a good excuse to enhance the code at the correct level in the stack instead of working around the problem in an application. It’s always good to prove that it pays off to give back to the community which provides your platform libraries.

 

Well, it’s been great and I hope there are not too many typos … now back to not having enough time to do all the things I must 😉

Glade 3.10 and 3.8 out the door

Glade 3.10 and 3.8 actually happened.

Thanks to:

  • Juan Pablo the Magnificent our hero who is responsible for
    the sexy new workspace look and feel (and also there to help
    me fix and wrap up the builds in the last minute).
  • Marco Diego Aurélio Mesquita for his work on the new preview feature
  • Florent Thévenet for creating icons for every widget class
  • Openismus GmbH for sponsoring my work on Glade.
  • Dozens of contributors who helped polish Glade, reported and fixed bugs.

There has been so much to do in these last moments wrapping up Glade that I have forgotten to include translator credits… so after this blog post I will follow up on my release mails with that.

I’m very proud that we actually came this far, while I think we’re moments late for the GNOME 3.0 release and also a few freeze breaks seem to have occurred, Juan Pablo and I stayed up very late last night making sure we had something worth while so… there better not be any bugs (i.e. you better love it !)  😉

Enjoy !

 

Animated Drag’n’Drop thoughts

So I’ve been tasked to try an animated approach at Drag’n’Drop to rearrange child widgets inside a specific container widget in libegg (EggSpreadTable), and I’m throwing this out in public just in case someone might have some better ideas than I do currently.

The EggSpreadTable is something like a GtkTable or GtkGrid that takes a list of widgets and displays them in a fixed number of columns, the spread table “spreads” out widgets evenly across the columns in order to consume the least height as possible (important thing to keep in mind is that widgets can shift across columns to make better use of space).

The desired effect is the one produced when running libegg/toolbareditor/test-toolbar-editor… which uses gtk_toolbar_set_drop_highlight_item(), an obscure and highly complex piece of code which is responsible for highlighting potential drop targets in the toolbar by animating other items out of the way.

My plan so far is a little abstract but I’m going to have to write code now and see if it works:

  • When a drag’n’drop (DnD) operation is in effect, and “drag-motion” is being emitted, I think I’ll have to “lock” the spread table so that widgets don’t jump from column to column, this will avoid hovering a drag source at the end of one column and having the animation show the widget at the beginning of the following column, which I think would be unintuitive.
  • When performing DnD animations, not only will we have to animate widgets up and down inside the potential “drop column”, but we’ll have to call gtk_widget_queue_resize() repeatedly whilst the spread table will have to grow in order to fit the new drag source into the column somewhere.
  • To look more intuitive, when a drag begins the drag source widget will have to “slide out” of it’s parent spread table, which will require the same animation technique whenever a drag begins. Likewise, when a drag fails the widget should be animated back into place.
  • Once a widget is successfully dropped in a said spread table (widgets can be dragged and dropped from one spread table to another), then the arrangement of widgets will still be invalid… the target spread table will have to unlock… at this point it’s also probably best to animate the rearranging of widgets (otherwise you get a nice animated DnD experience and when you drop you get this “snap” where widgets are rearranged in the target container).
  • Another complication is that EggSpreadTable is a height-for-width container, the size and space that the target placeholder will use will depend on requesting the size of the the actual widget being dragged (it’s even possible the spread table will have to grow in width & height to fit the animation).
  • And last but not least, of course it has to use GTK+’s horrid Drag’n’Drop apis.

The business logic will surely be horrid in all of this, many states will have to be managed and juggled while performing these various animations… not to mention the whole operation will require resizing of the spread tables in play (causing the whole interface to shift around, kind of like having a GtkInfoBar in play).

The toolbar code is evidently much more obvious, it only lines up a list of fixed size widgets in a row from left to right, when the widgets (tool items) don’t fit the toolbar anymore, an arrow is displayed and the rest of the items are placed in a drop down menu.

So all of this to say… anyone have any better ideas ?