Well it wasn’t that wet, but it was close enough and it was cold and dreary. I spent an awful lot of the weekend playing with an idea I had last week. Another piece of software to run a blog/wiki/etc – i.e. a ‘cms’. I don’t really need one, and the world doesn’t need another one, I just want to update some of my skills and play with some web stuff again (as much as I vowed never to do any web coding again after a php experience, wpf is worse and has driven me to these depths) and exercise a few other ideas I’ve had lately.
And I want to do it with raw c and no frameworks, just to get back to basics again for a while. I’m only using libdb, and gperf for a static hash table. It is also a cgi programme so I can (hopefully) run it on my isp’s web site (that is another reason I can’t use frameworks). Being a fork/exec cgi means I can be lazy with resources too – implicit garbage collection by means of calling exit. At least for small strings, and that simplifies the code a bit more. I’m also avoiding xml. I don’t really like xml, and I don’t think it aids productivity in most cases – to me it is a technology that lets proprietary products ‘open up’ a little controlled access into their internal logic (thinking xslt mainly here). But it is not a natural way to code for most developers and is extremely resource intensive (at least the c implementations are – java/.net is so resource intensive anyway it’s hard to tell). Anyway – I don’t need it, so I won’t use it (unless I start looking at dynamic asynchronous javascript down the track).
So after doing the basic stuff of a simple cgi argument parser, and refreshing some ‘form’ knowledge, I started looking into presentation, and programme flow. I have not looked at css before – so there’s plenty to look at there, and it looks like it’ll be flexible enough to not need anything like xslt for presentation.
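For the curious, a cgi argument parser really can be this small. What follows is only a sketch of the general technique, not the actual code: the names `hex_val`, `url_decode` and `cgi_arg` are made up for illustration, and it assumes the arguments arrive as a query string of the usual `name=value&name=value` form. Note that it allocates and never frees, leaning on the exit-as-garbage-collector approach.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: value of one hex digit, or -1 if it isn't one. */
static int hex_val(int c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    c = tolower(c);
    if (c >= 'a' && c <= 'f')
        return c - 'a' + 10;
    return -1;
}

/* Decode '+' and %XX escapes in place.  Returns s. */
char *url_decode(char *s)
{
    char *in = s, *out = s;

    assert(s != NULL);
    while (*in) {
        int hi = -1, lo = -1;

        if (*in == '%') {
            hi = hex_val((unsigned char)in[1]);
            if (hi >= 0)
                lo = hex_val((unsigned char)in[2]);
        }
        if (*in == '+') {
            *out++ = ' ';
            in++;
        } else if (hi >= 0 && lo >= 0) {
            *out++ = (char)(hi * 16 + lo);
            in += 3;
        } else {
            *out++ = *in++;     /* plain byte, or a malformed escape */
        }
    }
    *out = 0;
    return s;
}

/* Look up one argument in a query string such as "id=42&title=x".
 * Returns a decoded, malloc'd copy, or NULL if it isn't present.
 * The copy is never freed here: the process exits soon enough. */
char *cgi_arg(const char *query, const char *name)
{
    size_t nlen = strlen(name);
    const char *p = query;

    while (p && *p) {
        const char *end = strchr(p, '&');
        size_t len = end ? (size_t)(end - p) : strlen(p);

        if (len > nlen && strncmp(p, name, nlen) == 0 && p[nlen] == '=') {
            size_t vlen = len - nlen - 1;
            char *v = malloc(vlen + 1);

            if (v == NULL)
                return NULL;
            memcpy(v, p + nlen + 1, vlen);
            v[vlen] = 0;
            return url_decode(v);
        }
        p = end ? end + 1 : NULL;
    }
    return NULL;
}
```

In a real cgi the query string would come from `getenv("QUERY_STRING")` for a GET request; POSTed forms need the body read from stdin instead, which this sketch leaves out.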
But how to add all the basic support HTML around the content? First I thought of templates parsed at run-time. Something like server side includes but with a little more flexibility. But I want a minimum of processing at run-time, and then there’s all the hassle of template-to-function linkage, showing pages recursively and all that guff. I thought I’d just pre-compile the templates, a bit like the way xaml works, but without all the crap. Compile into tables which are executed at run time? No hang on, why not just into a function-per-template and a string of function calls? Hmm, that suddenly made the problem tiny. A very simple ‘compiler’ that converts the templates into a sequence of function calls – output string, call function, output string, etc. Let the compiler handle the linkage and storage issues, and I just have to handle a little run-time context. And the added advantage is that run-time is very fast and the memory use is minimal – and shareable – since the page is read directly in its executable form. Problem solvered. I didn’t really want to go the php route and have embedded logic which can end up becoming a huge mess, so you can only call functions and each page outputs in its entirety and the control logic is in the c source. After all I’m not writing a framework here, just a solution to a specific problem, so there’s no need to add more smarts than is required. I wrote a little makefile magic to compile these templates, and some bourne-shell to generate the ‘/* AUTOMATICALLY GENERATED */’ header files, and now I have a nice ‘void page_front_page(void)’ I can call from any other page or c code to output whatever is in ‘front-page.html’.
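To make the ‘output string, call function, output string’ idea concrete, here is a rough sketch of such a template compiler. It is an illustration only, not the real thing: the `{{name}}` call marker is an invented syntax, `compile_template` and `emit_text` are hypothetical names, and the string escaping covers just quotes, backslashes and newlines.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Emit text[0..len) as a C string literal handed to fputs(). */
static void emit_text(FILE *out, const char *text, size_t len)
{
    size_t i;

    if (len == 0)
        return;
    fputs("\tfputs(\"", out);
    for (i = 0; i < len; i++) {
        switch (text[i]) {
        case '"':  fputs("\\\"", out); break;
        case '\\': fputs("\\\\", out); break;
        case '\n': fputs("\\n", out);  break;
        default:   fputc(text[i], out);
        }
    }
    fputs("\", stdout);\n", out);
}

/* Translate one template into the C source of `void page_<name>(void)`.
 * Assumed (hypothetical) syntax: {{func}} becomes a call to func(),
 * everything else is copied to the output verbatim. */
int compile_template(FILE *out, const char *name, const char *tpl)
{
    const char *p = tpl;

    assert(out != NULL && name != NULL && tpl != NULL);
    fprintf(out, "void page_%s(void)\n{\n", name);
    for (;;) {
        const char *open = strstr(p, "{{");
        const char *close = open ? strstr(open + 2, "}}") : NULL;

        if (close == NULL)
            break;              /* no more complete markers */
        emit_text(out, p, (size_t)(open - p));
        fprintf(out, "\t%.*s();\n", (int)(close - open - 2), open + 2);
        p = close + 2;
    }
    emit_text(out, p, strlen(p));   /* trailing text, if any */
    fputs("}\n", out);
    return 0;
}
```

Fed a template named `front_page`, this writes out a function `page_front_page` whose body is a run of `fputs()` calls with one plain function call per marker. The C compiler and linker then resolve those calls like any others, which is the whole point: no template-to-function lookup happens at run-time.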
I had to get up to speed on libdb again – I looked at it for evolution just before I left, and I wanted everything to be transaction protected, use sequences, multiple secondary indices, and possibly queues in the future. Didn’t take long – libdb has excellent documentation. One issue is how to store the data since libdb only gives you a key-value pair, you need to pack the rows yourself. I decided for now on a simple tagged format – extremely simple (and fast of course) to parse on loading and no need to have it human readable. And being tagged means it can change in the future without breaking everything (that was a mistake I made with camel – trying to compress the data too much, the order and number of items was important). Initially I was going to store the records as files and just store the meta-data in the table, but I decided to put everything in the database, at least for the moment. Conceptually, each record is a little like an ldap record, a field followed by a value with the capability of having multi-valued fields. And everything is binary-safe.
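As an illustration of what a simple tagged format can look like, here is a minimal sketch. The layout is assumed, not the actual one: each field is one tag byte, a two-byte big-endian length, then the raw value bytes. `rec_append` and `rec_find` are hypothetical names. Repeating a tag gives a multi-valued field, and since lengths are explicit the values are binary-safe.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed field layout: [tag:1][length:2, big-endian][value:length] */

/* Append one field to buf, which holds `used` bytes out of `cap`.
 * Returns the new number of bytes used, or 0 if it doesn't fit. */
size_t rec_append(unsigned char *buf, size_t used, size_t cap,
                  unsigned char tag, const void *val, size_t len)
{
    assert(buf != NULL && (val != NULL || len == 0));
    if (used > cap || len > 0xffff || cap - used < len + 3)
        return 0;
    buf[used] = tag;
    buf[used + 1] = (unsigned char)(len >> 8);
    buf[used + 2] = (unsigned char)(len & 0xff);
    if (len > 0)
        memcpy(buf + used + 3, val, len);
    return used + 3 + len;
}

/* Find the next field with the given tag at or after offset *pos.
 * Returns a pointer to its value and stores the length in *len, leaving
 * *pos just past that field so repeated calls walk a multi-valued field.
 * Fields with other tags are skipped, so old readers survive new tags.
 * Returns NULL at the end of the record or on a truncated field. */
const unsigned char *rec_find(const unsigned char *buf, size_t used,
                              size_t *pos, unsigned char tag, size_t *len)
{
    while (*pos <= used && used - *pos >= 3) {
        const unsigned char *f = buf + *pos;
        size_t flen = ((size_t)f[1] << 8) | f[2];

        if (used - *pos - 3 < flen)
            break;              /* length runs past the record */
        *pos += 3 + flen;
        if (f[0] == tag) {
            *len = flen;
            return f + 3;
        }
    }
    return NULL;
}
```

The packed buffer is what would go into the data half of the libdb key-value pair. Because a reader skips tags it doesn’t know, fields can be added later without rewriting existing records.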
That let me add a few more pieces – now I could throw together a front page – 2 small template files, the main page which calls the function to output the content, and one to display the summary of the post. An editor. A bit of messy logic in the c file to track the lifetime of the editing session, plus one editor page template to communicate with the user. Simple basic stuff – and that’s half of a blog done. Oh I threw together an rss feed with a little bit more work as well.
And then I had enough time to look at the user interface to the content – the ‘wiki’ language. Rather than re-inventing another weird wiki language, I thought I’d try using texinfo (and only texinfo – no html/special stuff). This has been an idea swirling around in my head for a while now. What if you could edit a complete texinfo document, where each @node is a separate page, and the node links (next/prev/up) are used to automagically bind all works into a coherent whole. One problem I find with wikis is that the nodes mostly sit by themselves, and the linking process is ad-hoc (usually it is not done since it isn’t completely necessary either – pages stand alone as articles) and it is difficult if not impossible to convert part or all of a wiki into a printed or all-in-one-page document. Add in the cross referencing and indexing features in texinfo and you could really have a powerful system. Well that’s the idea anyway. For now I’ve got only a small part of texinfo implemented and all it really does is formatting and external links. I’m still not sure how to handle node/navigation – the wiki/texinfo way is to use a node name – but if you change that I want it to go and update any links too – well I have a transactional database with multiple indices, so it should be doable. And I probably need some sort of namespace mechanism to manage multiple separate documents. I can’t run any texinfo tools because they might not be available and it’s too much of a security issue anyway, so I need to do the processing internally (I need to do that anyway for any automatic manipulation).
But I dunno – the texinfo thing is a big task, so I’ll see how it goes – for now it provides a reasonable markup syntax for many types of technical documents. There are plenty of other issues like user management and concurrent access too, should I bother looking into those. Hmm, versioning – that might be something quite interesting to look into too – and something else I’ve been thinking about recently.
I also ordered a new thinkpad. There’s nothing particularly wrong with my old one – T40 – 5 years old and still in good shape (motherboard/keyboard replaced after 3 years on warranty, and I dropped it a week later, but just a tiny bit of the case broke off). I just felt like spending money and I thought I’d try an X series this time since they were on special last week, and came with a dock/dvd burner thrown in at a reasonable price. I’ll miss the touchpad and the keyboard light, but the x300 is too expensive.
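The node links themselves are easy to get at, since a texinfo `@node` line has the form `@node name, next, previous, up` with the last three optional. A sketch of pulling those four fields out follows. `struct node_line`, `NODE_NAME_MAX` and `parse_node_line` are hypothetical names, and the fixed-size buffers are a simplifying assumption rather than anything texinfo specifies.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

#define NODE_NAME_MAX 64    /* arbitrary limit for this sketch */

struct node_line {
    char name[NODE_NAME_MAX];
    char next[NODE_NAME_MAX];
    char prev[NODE_NAME_MAX];
    char up[NODE_NAME_MAX];
};

/* Parse "@node name, next, previous, up" into n.  Missing pointers are
 * left as empty strings.  Returns 0 on success, -1 if the line is not a
 * usable @node line or a name is too long. */
int parse_node_line(const char *line, struct node_line *n)
{
    char *fields[4];
    const char *p;
    int i;

    assert(line != NULL && n != NULL);
    memset(n, 0, sizeof *n);
    if (strncmp(line, "@node", 5) != 0 || !isspace((unsigned char)line[5]))
        return -1;

    fields[0] = n->name;
    fields[1] = n->next;
    fields[2] = n->prev;
    fields[3] = n->up;

    p = line + 5;
    for (i = 0; i < 4 && *p; i++) {
        const char *end;
        size_t len;

        while (isspace((unsigned char)*p))
            p++;
        end = strchr(p, ',');
        if (end == NULL)
            end = p + strlen(p);
        len = (size_t)(end - p);
        while (len > 0 && isspace((unsigned char)p[len - 1]))
            len--;              /* trim trailing blanks and newline */
        if (len >= NODE_NAME_MAX)
            return -1;
        memcpy(fields[i], p, len);  /* n was zeroed, so it stays terminated */
        p = *end ? end + 1 : end;
    }
    return n->name[0] ? 0 : -1;
}
```

With the names extracted, each of next/previous/up could serve as a key into a secondary index, which is one way a renamed node might be traced back to the pages that point at it.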
My Thinkpad X60s has a keyboard light, for what it is worth. I’d be surprised if they’ve removed it on the X61. I haven’t really missed the touchpad from my previous laptop though.
My x61s has a keyboard light.
You realise you’re absolutely insane to be doing this in C, right?
You’ll be glad to know that the X60 series also has the keyboard light.
My X60 has a keyboard light (unlike the X60t that just pretend to)
My X60 has a keyboard light (unlike the Z60t that just pretend to)
For a month now, we have resurrected the disk-summary idea and are working on a branch – http://svn.gnome.org/viewvc/evolution-data-server/branches/camel-db-summary/
You might want to give a try to sqlite instead of libdb. It supports storing multiple columns and hence you don’t need all that columns-to-a-combined-value overhead code.
fraggle: I’ll bite – what do you suggest?
sankar: But you still need sql to object marshalling/demarshalling, and you have to mix languages in the code, so what’s the difference? The code is still there, but as well as the db code which does it you have to do it again yourself. I think sql-language queries is a valid reason to use it rather than something like libdb, but they also have significant execution and memory space overhead. Anyway it’s an exercise in trying to do it without those crutches as much as anything – I find this sort of code more challenging and thus interesting, and since I’m doing it for fun, I want it to be fun. SQL is not fun.
For sqlite, tcl-tk bindings have a direct tuple-to-object conversion facility. Nothing of that sort is available for GObject and CamelObject yet. And yes, SQL access is one big advantage for sqlite.
Though I did not measure the time, I felt, the column unification and splitting with berkeley db will take a lot of time. And I cannot query for individual fields. I have to get everything. For instance, if an application say, OpenOffice wants to query a certain labelled mail’s subjects alone, it may not be possible.
But as you said, your work is for fun / challenge and interest . So, there is no question of a berkeley db vs sqlite :-)
oh god, tcl. that makes python look almost reasonable. The syntax is worse, and every script is tied tightly to the language version.
query by label? With any database you need to use an index. Berkeley DB is no different. It is much lower level but has the same facilities which are required to implement all of these features in any RDBMS. Including partial row retrieval and storage – when you know the size of the row columns (which you cannot (easily) if you have variable length columns).
Given that querying by some indexing function (e.g. query by label) was the whole point of the disksummary branch to start with, it is definitely possible although it isn’t always as easy to write.