Back from Istanbul and a few updates

A pretty late blog post, but I have been really busy. I landed in Bangalore on July 14 at 8am to attend a 5-day outdoor training starting at 10am. Luckily the timezone difference isn’t very bad, so I managed to get through the training.

GUADEC:

  • Nice parties and thanks to all their sponsors and GNOME.
  • Had nice discussions/talks with Michael, jpr, hpj, Miguel, Aaron, Scott and lots of other hackers
  • Announced the Evolution licensing change
  • Had a couple of talks at GUADEC
  • Nice chats with Muelli, jrb, thomas, philip, jorgen
  • Attended lots of talks

Post GUADEC:

  • Stuck in a project management training till Friday.
  • Merged the on-disk summary into trunk and, guess what, trunk is probably broken for a few/most cases. We will get it into shape. Blame me/sankar if you see lots of warnings on compilation: we explicitly added ‘#warnings’ for lots of things to make sure we fix them instead of growing FIXMEs in the code. We ported most of the providers. We didn’t do much for the imap4 provider, but will do that as well post 2.23.5, which is on Monday. (Sorry fejj, too busy, so we delayed it.) Any other camel provider, like evolution-brutus, needs to be ported. I had a live demo of this at GUADEC and I have some graphs which I collected during the demo. I will attach them at the end of this blog.

I had a fixed set of data for Evolution: a couple of accounts and 2 vfolders which pull from all of their folders, 200,000 mails in total.

Evolution 2.23.4 (before the merge)

[Graph: before moving to disk summary]

Evolution 2.23.5 (after the merge)

[Graph: after the merge]

Sankar already did a post on how it is done. I got these graphs from massif. I know they aren’t exact, but the comparison does show the difference/improvement. Evolution now releases unused memory over a period of time when you move away from folders, and vfolders query the DB and load only counts. So nothing is loaded except what you see. But there are some issues lying around with freeing the sqlite memory cache, because of which it at times goes to 100% CPU allocating memory during an sqlite commit. So I’m disabling the memory drop thread for 2.23.5 and will get it into shape during 2.23.6 or so. We have added lots of debug output and g_assert(0) at a few places to capture important traces/issues, so bear with us if 2.23.5 crashes more. File all your bugs in bugzilla and add them to the tracker bug. All will be fixed, and Evo 2.24 will be slim and rock solid.

Camel DB (Disk) Summary – Evolution memory improvements/thoughts

Finally I got my chance to use some of my ITO time (just 3 days this time). I decided to spend it on the memory issues of Evolution Mail. The folder summary (message list) is one of the biggest reasons for Evolution’s high memory usage. The folder summary holds message infos, and every message info is nothing but headers like from/to/cc/sent and received dates etc. Notzed (one of the ex-Evolution/Camel hackers) wrote a design and code addressing the core issues, but unfortunately the work wasn’t developed further after he left Novell. It used libdb and had lots of new ways to access mail data. I was thinking along the same lines, so I took some concepts from there and wrote up a new design. I spent my 3 days and nights improving it, and I have a fully working prototype. I used sqlite as the database for the Camel summary, with the message UID as the primary key. All over the summary code, I used SQL queries like “select * from Inbox where uid='342'” to access the data.
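To make the idea concrete, here is a minimal sketch of a per-folder summary table keyed by UID. It is written in Python with the stdlib sqlite3 module only so that it runs anywhere; the real code lives in Camel (C). The column names and the fetch_message_info helper are made up for illustration and are not Camel's actual schema or API.

```python
import sqlite3

# Hypothetical schema: one table per folder, message UID as primary key.
# The columns are illustrative only, not Camel's real layout.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Inbox ("
    " uid TEXT PRIMARY KEY,"
    " sender TEXT, recipient TEXT, subject TEXT, date_sent INTEGER)"
)
conn.executemany(
    "INSERT INTO Inbox VALUES (?, ?, ?, ?, ?)",
    [
        ("341", "a@example.org", "me@example.org", "Hello", 1216000000),
        ("342", "b@example.org", "me@example.org", "Re: summary", 1216000100),
    ],
)


def fetch_message_info(conn, folder, uid):
    """Load a single message info row by UID, or None if it is absent."""
    # Table names cannot be bound as parameters, so quote the identifier;
    # the UID itself is passed as a bound parameter.
    quoted = '"' + folder.replace('"', '""') + '"'
    return conn.execute(
        f"SELECT * FROM {quoted} WHERE uid = ?", (uid,)
    ).fetchone()


info = fetch_message_info(conn, "Inbox", "342")
missing = fetch_message_info(conn, "Inbox", "999")
```

Since uid is the primary key, the lookup goes through an index instead of scanning the folder, so a message info can be loaded on demand and dropped again.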

I modified the entire design of Evolution/Camel to store and use only UIDs wherever required. Whenever a message info is required, it queries the DB, gets the data and frees it when it is no longer needed. It could mean that we don’t need to keep the visited folders in memory for the sake of trash/junk, and we don’t need to keep a vfolder’s subfolders in memory. It could mean that just the viewed folder (why even this?) stays in memory, and the rest stays in the database to be queried as and when required. I made a prototype with this in mind and I was able to achieve what I had hoped for.

Asking when I’m gonna commit this? Hmm, I have made a prototype. Folder summary listing works, Junk/Trash works, search works, vfolders work. But there are a lot of things that I can and need to optimize, since I have the flexibility of a DB. Since this will break ABI, add more APIs and deprecate a few, I need to design the APIs with lots of things in mind (remote view, mails being part of EDS, etc). I was discussing this with Fejj (another Mail/Camel hacker) on Friday and he gave nice inputs on my design, like having an LRU implementation to decide which message infos to keep in memory and which not (he gave me the LRU code from GMime).

These optimizations would save a huge amount of memory for users having lots of vfolders and folders. Unfortunately they may not have any effect on users living in just one folder (just Inbox) with a huge number of mails in it. And after all this memory optimization, there isn’t any performance drop; in fact, it is a bit faster now with indexed tables. VFolders were a bit slow, but with a persistent summary for vfolders they are going to be faster than the current implementation (I haven’t prototyped this though). Achieving all this in a cleaner way will be my first target milestone and might take close to a month (if I’m allowed to go full stretch on this for a month).
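The LRU idea Fejj suggested can be sketched roughly like this. This is not the GMime code, just a small Python illustration of the technique: keep the most recently used message infos in memory and go back to the DB for everything else. The MessageInfoCache class, its loader callback and the capacity value are all hypothetical.

```python
from collections import OrderedDict


class MessageInfoCache:
    """Keep at most `capacity` message infos, evicting the least recently used."""

    def __init__(self, loader, capacity):
        self._loader = loader          # callable: uid -> message info (e.g. a DB query)
        self._capacity = capacity
        self._entries = OrderedDict()  # oldest entry first, newest last

    def get(self, uid):
        if uid in self._entries:
            # Cache hit: mark this UID as most recently used.
            self._entries.move_to_end(uid)
            return self._entries[uid]
        # Cache miss: load from the backing store, then evict if over capacity.
        info = self._loader(uid)
        self._entries[uid] = info
        if len(self._entries) > self._capacity:
            self._entries.popitem(last=False)
        return info

    def cached_uids(self):
        return list(self._entries)


# Stand-in loader that records which UIDs had to be fetched.
loads = []


def fake_loader(uid):
    loads.append(uid)
    return {"uid": uid}


cache = MessageInfoCache(fake_loader, capacity=2)
for uid in ["1", "2", "1", "3"]:
    cache.get(uid)
```

In this run, "2" is the least recently used entry when "3" arrives, so it is the one that gets dropped, while "1" was loaded only once despite being requested twice.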

[Graph: in current Evolution, you won’t see that memory drop]
Note: The drop in memory is something you won’t see in the current Evolution. Junk/Trash keeps the last visited folders in memory.

Of course there are next levels to this: Remote View and Search-in-disk.

Remote View: After my first target, only the viewed folder’s message list is going to be in memory. Next we can have a custom model store that maps just the viewed message infos, plus maybe a buffer of 50/100 above and below the message list’s view. We can have a cursor in the DB that moves along and maps the view plus head/tail to memory. It means that when you have a folder of 100,000 mails, the message list shows 50 mails and the head/tail buffer is 50 each, you would have just 150 mails’ message infos in memory and nothing else. It may be a bit slow if you page down/up quickly or scroll using the mouse, but it can make Evolution run on any machine with low memory or on mobile devices (Nokia 800/810 etc). Of course you can do an optimized design to overcome the performance issue with huge scrolls. (Sqlite is pretty fast, and it is possible that you won’t notice it most of the time.) It also requires that lots of things like sorting/threading be built inside the tables, and the cursor needs to be mapped to the message list/etree. There is no prototype/data for this, but it is possible for sure after my first milestone. This will be the second milestone.
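A rough sketch of the windowing arithmetic from the Remote View paragraph, again in Python with sqlite3 for convenience. The table layout and the load_window helper are hypothetical, and LIMIT/OFFSET is used only because it is the shortest way to show the idea; a real design would more likely keep a cursor or seek by sort key, since a large OFFSET still has to step over the skipped rows.

```python
import sqlite3

# Hypothetical folder table with 100,000 mails, sorted by date in the view.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Inbox (uid INTEGER PRIMARY KEY, subject TEXT, date_sent INTEGER)"
)
conn.execute("CREATE INDEX inbox_date ON Inbox (date_sent)")
conn.executemany(
    "INSERT INTO Inbox VALUES (?, ?, ?)",
    ((i, f"mail {i}", i) for i in range(100000)),
)


def load_window(conn, first_visible, visible=50, buffer=50):
    """Return the visible rows plus up to `buffer` rows above and below them."""
    start = max(0, first_visible - buffer)
    # Rows above the view (may be fewer near the top) + view + rows below.
    count = (first_visible - start) + visible + buffer
    return conn.execute(
        "SELECT uid, subject FROM Inbox ORDER BY date_sent, uid LIMIT ? OFFSET ?",
        (count, start),
    ).fetchall()


window = load_window(conn, first_visible=5000)
top_window = load_window(conn, first_visible=0)
```

With the view starting at row 5000, the window covers rows 4950 to 5099, which is the 150 message infos from the example above. At the very top of the folder there is nothing above the view, so only 100 rows are loaded.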

Search-in-disk: Currently for search, the entire folder summary is brought into memory (anyway, after my first milestone only the viewed folder is in memory). But if we implement a remote view, it may not be so efficient to do it this way. We can extend the search to run inside the database and just retrieve the UIDs, or use the cursor to map the contents to the message list. Effect: it should be super fast and again low on memory consumption.
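As an illustration of Search-in-disk, the sketch below lets sqlite do the matching and hands back only the UIDs, so no message info needs to be loaded for the search itself. The table, the sample rows and the search_uids helper are hypothetical; a real implementation would have to translate Evolution's search expressions into SQL, which this does not attempt.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Inbox (uid TEXT PRIMARY KEY, sender TEXT, subject TEXT)")
conn.executemany(
    "INSERT INTO Inbox VALUES (?, ?, ?)",
    [
        ("1", "a@example.org", "Summary merge"),
        ("2", "b@example.org", "Lunch?"),
        ("3", "c@example.org", "Re: summary merge"),
        ("4", "d@example.org", "100% cpu"),
    ],
)


def search_uids(conn, needle):
    """Return the UIDs whose subject contains `needle`, matched inside sqlite."""
    # Escape LIKE wildcards so the needle is treated as literal text.
    escaped = (
        needle.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    )
    rows = conn.execute(
        "SELECT uid FROM Inbox WHERE subject LIKE ? ESCAPE '\\' ORDER BY uid",
        ("%" + escaped + "%",),
    )
    return [uid for (uid,) in rows]


summary_hits = search_uids(conn, "summary")
percent_hits = search_uids(conn, "100%")
```

The message list can then fetch the matching message infos by UID only for the rows it actually displays.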

I’m not doing much for the second or the third milestone right now. But I want to work on the first milestone for Evolution 2.23.1/2. (Sorry, not for GNOME 2.22; it is too late to bring in such a huge design change.) The second and the third milestones might not bring many ABI/API changes if the first milestone is designed well, and they can be taken up at any point without much disturbance to stability, IMO. I wish I w(c)ould work on all this.