So long and thanks for all the fish

Yes, the obligatory and rather daggy HHGTTG quote to finish off …

I think this shall be my last post on b.g.o. I had written more but changed my mind.

My other, rarely updated and stale ‘coder’s diary’ will hang around (as will this, for the time being). But I’ve also started another ‘blog’ on blogger – the rather delightfully named ‘A Hacker’s Craic’ – although I haven’t really thought of anything much to say on it yet.

Well good luck with GNOME.

‘Chrome’ fails to excite

Well, I tried Chrome. I guess they got the address entry about right.

Otherwise – nothing too special.

  • Awful IE7/Vista-style corrupted ‘menu-button-in-the-wrong-spot’ system. Why does everyone want to copy the worst aspects of MS’s ui ‘innovation’, like moving menus and buttons to unfamiliar places – for no particular reason, and usually to the detriment of usability? They’ve tried top-left, bottom-left, and now top-right. You all know what’s coming next …
  • Window decorations/borders which don’t honour my ‘theme’.
  • More IE7 annoyances – the way tabbing works, i.e. open next, not open at end.
  • No advert blocking. No flash blocking.
  • Redundant and annoying ‘animations’ like sliding tabs.
  • Poor font handling. Not that other browsers are great either, but to make all sites readable I tell firefox to disable all font overrides and set a minimum font size I can actually read. Not this:

    [Screenshots: Groklaw rendered in Chrome vs. Mozilla]

    [Screenshots: SMH rendered in Chrome vs. Mozilla]

    The Chrome ones were after I set the fonts to the same settings as Mozilla. On further probing, if I set the font to some huge value it kinda works – but then again, it kinda doesn’t. It becomes readable, but unpredictable, and I cannot get the typeface (font + size) I find most readable.

Moving the tabs above the address entry box is neither here nor there, although the overall minimalistic UI is a plus.

Well I guess some enterprising hacker with enough time can address most if not all of those issues (although I cannot see an advertising-revenue driven company putting advert blocking in the official version). However, I think someone needs to mention that right now it is just another webkit browser, light on features, and nothing particularly special. Well, apart from not having crashed so far – which most others I’ve tried do not seem to manage – although there isn’t a GNU version yet either, and those seem to be more prone to that type of behaviour.

Hmm, and here’s a nice headline that makes complaining about free browsers kinda pointless, although I’m sure it needs to be taken with a grain of salt.

how dumb is that

Hmm, Firefox is adding a pointless ‘preview’ thing when switching tabs, and changing the way tab switching works to be more like windoze (ordering), according to a preview on Ars Technica.

Umm, wtf for? The preview thing is ok if your system takes forever to bring up the other application or it is iconised, but it makes no sense when it would be easier just to show the whole page. So I guess switching tabs will get even slower in 3.1 … sigh. The ordering thing is just dumb. People like using tabs because they aren’t windows, so why make them act like them?

things

  • So google have re-invented ASN.1. Yay. Oh well, I’m not sure why they didn’t just write a nice ASN.1 compiler and release that. At least they got something right – XML does suck for many uses.
  • After chatting about that, a friend mentioned YAML. Hmm, YAML looks worse than XML. It purports to be human readable and writable, but it needs a more complex context-sensitive parser than XML. Avoid YAML.
  • That led me on a meandering wander through the inter-tubes that visited this totally awesome anti-XML rant along the way. Wow, I could only dream of being able to flame like that; I just don’t have the patience. He has plenty more to say too. His Perl and C++ observations are rather entertaining.
  • So I ended up reading a lot of lisp groups and rants. The lisp guys know how to put down just about everything. After a bit more reading, and because I have a negative feeling toward guile (I forget why now – something to do with footprint and threading, I think), I’ve decided to skip scheme, and I think I’ll look at Lisp instead. After a bit of searching (damn google making things hard to find) I started reading Practical Common Lisp, and it seems a good read so far. After 2 chapters I can already start to see why Lispers look on other languages with such disdain (although even the first bit of the scheme book I read gave me an inkling of that).
  • Chatting with a work-mate, Larrabee came up. It is all about a leap of faith apparently. Hmm, looks dumb to me. I just hope this doesn’t mean every other gpu vendor gets caught in the catch-up game when people start writing their shaders in x86 (which I’m sure is exactly what intel is hoping for).
  • I’m about to have a few weeks’ break! Although I should probably do stuff around the house, or do some exercise, I will probably just hack and read. And maybe some cooking, and watching the Tour de France in the short term.

things and stuff

Things:

  • Sigh, along with seemingly every other broadcaster in the world, the ABC has gone the Silverlight route. All so they can over-charge for rentals for crappy old tv shows with DRM attached ($3/week for 1 show?). It’s amazing what influence money can buy (it only seems to be called bribery in third world countries). The ABC is funded by the government.
  • Of course there is a complete microsoft dick-sucking love-fest going on in general in the aust govt and businesses, so it should come as no surprise really. And if it isn’t them, it’s Oracle. It’s amazing what a corporate box at the rugby can buy you.
  • Firefox 3 appears to be getting (even) slower – this is on windoze too. I wonder if it’s an installation-age related thing? There seem to be plenty of threads on the mozilla support forums about it, and more appearing recently, although they’re mostly ignored or censored.
  • The aweless bar isn’t getting any easier to use either. I’d rather it at least preferentially showed what I’ve actually typed in, and that it wasn’t so bulky (on my laptop I twiddled with the oh-so user-friendly ‘userChrome.js’ to at least stop it taking up the whole screen).
  • Kevin Rudd (Australia’s current Prime Minister) is getting all hot and bothered about naked children in art, and trying to impose his religious-right moralities on the nation. If he is so disgusted by childhood nudity I wonder how on earth he ever bathed his own children. It is disturbing that he (and most of the country, I might add) can’t tell the difference between art and porn and turn pictures which simply represent the innocence of youth into salacious filth. Not that this sort of offensive attitude was unexpected of him, and it is one reason I could never vote for him. Why not look at those poor-starving-african-sponsor-a-child adverts? There’s some good honest child exploitation if ever there was some.
  • Some vegetarian organisation is trying to ‘cash in’ on the ‘world food crisis’ and global warming, and has cynically started running an ad campaign at the moment saying we should all stop eating meat so we don’t kill the planet. Not going to work. If we all ‘went veg’ the planet would just be able to support more people (apparently), and so we would just take longer to reach the limit of sustainability. History (and nature) shows us that every resource-constrained population with no controlling mechanism (e.g. predation, disease) will grow to exceed its resources and must move on to greener pastures and/or collapse. It might take a while to do it for the entire planet, but it was only ever a matter of time once disease was conquered. It isn’t what we’re putting in our mouths, it is how many mouths there are in the first place. Watch what happens to The Philippines in the coming years, as their Catholic heritage comes to bear fruit (quite literally). Of course nature may beat us to it – Tuberculosis and other diseases are making a come-back, and the shifting climate will only make things worse.
  • I wonder when petrol will hit $2/litre? Might take some cars off the road and make cycling a bit safer. Certainly the way roads are being built around here will not.
  • I hate winter. I’m sick of the cold. You Canadians may scoff at 10-15 degrees being ‘cold’, but with low building standards, insufficient clothing, and inefficient heating systems you’d be surprised how unpleasant it is sitting around in that.

Stuff:

  • Our project is in temporary(?) limbo, so we’ve been mothballing the code and data. Utterly, utterly boring, but I should get to have some time off soon, which I seem to need more often these days.
  • I’ve been poking away at my CMS – slowly. I’m not sleeping very well, or I’m staying up late to watch The Tour, so my concentration isn’t fantastic. I figured out how to do indices – it took longer than it should have, but I think I have a simple and scalable solution worked out.
    1. Each time a node is updated I parse and extract index (and chapter structure) information.
    2. For each index entry:
       • Form a key of the form ‘type byte’[.‘namespace/’].‘index word’,‘target node index’.
       • Write the entry revision record as the ‘data’, sorted in reverse (i.e. newest first).

    So I can easily look up all records by type: “find key > ‘type byte’”, or all records in a given namespace: “find key > ‘type byte’.‘namespace/’”, etc. With the way Berkeley DB stores duplicate records I can then skip to each separate node easily as well (and hopefully quickly too) once I’ve found the latest/live revision, which will normally be the first one. (There’s a rough Berkeley DB sketch of this at the end of this list.) One thing I haven’t worked out yet is directing entries with a finer granularity than per node (page).

  • I’ve got some very preliminary code working to order nodes to generate a table of contents. I’m already writing the meta-data records recording structural elements that are required to do the next step efficiently. I’m trying to use the @node node,next,prev,up links to order nodes (the hard part), but of course the tricky bit is doing sane things when the details aren’t fully specified or contradict themselves. The @node thing seems a little clumsy too, and I really have to nail down the namespace semantics fully.

    This sort of processing is one area where C (and any related imperative language – Java, C-hash, etc) seems to fall down a bit: scanning, searching, and reordering lists and trees. It isn’t difficult to do, but it ends up a bit clumsy and messy.

  • I needed a hashtable for linking up the nodes, and the one I wrote earlier was on a machine in another room and I was too lazy to get it – so I wrote one from scratch. OK, so I need a better hash function and a re-hash loop (actually for this use I probably don’t), but hash tables are so awesomely simple I didn’t mind writing another one, and it only took a few minutes. I improved and simplified the node iterator anyway (iirc).

    I’m starting to settle on the API I most like for hash tables too (in C) – there’s a rough usage sketch at the end of this list:

      typedef void *key;                        /* a key is anything */

      /* The node you embed in your own structure. Note there is no ‘key’
         member – your own structure holds the key, perhaps implicitly.
         It is a chain link only; it needn’t be embedded as the first
         object either, and you can embed more for different tables. */
      struct ht_node { struct ht_node *next; };

      /* Iterator – set it to 0 on the first call. */
      struct ht_iterator { int index; struct ht_node *next; };

      struct ht_table;                          /* anonymous */

      table = ht_create(hash node func, hash key func, compare node and key func);
      ht_destroy(ht);                /* frees only the hash meta-data */
      node = ht_add(ht, key, node);  /* returns an existing node (and does nothing) if the key
                                        exists – no need for a separate lookup first */
      node = ht_lookup(ht, key);
      node = ht_remove(ht, key);     /* remove by key – remove by address would need to look up
                                        the key anyway, so why bother. Returns the node removed
                                        if it was there. */
      node = ht_iterate(ht, iterator struct);   /* lists all nodes in unknown order. The node
                                                   returned can be removed if you want. Returns
                                                   NULL and resets when done. */

      Why do I like this vs the 30-odd function monster in glib2, or even C-hash’s version?

      1. Memory management is simpler. The hash table only manages its base structure and the bucket array. You manage everything else. And you only manage one object, not two (key, data). No need for a special (and glib-unique?) ‘steal’ function either.
      2. Memory use is lower. No need for redundant ‘key’ OR ‘data’ pointers. Often the key is in the structure anyway, so why duplicate it. You usually only store data in one table, so why not make that table’s data structures part of yours. And if you really want a separate tuple-object you can do that too.
      3. Should have better locality of reference for certain data. But with modern CPUs it’s hard to tell what will happen.
      4. The insert-existing-key semantics are much simpler and clearer. With glib it isn’t obvious whether the key or the data (or both) are replaced (without reading the documentation), leading to the rather unique ‘replace’ function. And C-hash throws an exception, which is just a pita (it does it for lookups and removes too, which is just silly, so you need an additional, totally redundant ‘has key’ lookup before looking it up again).
      5. Iterator mutability rules are clear and simple. In C-hash you often have to jump through hoops to do simple things which should be simple (e.g. remove some matching data); glib adds extra redundant API.
      6. Can be ‘sub-classed’ and extended. e.g. the base object could implement a ‘string’ hash table, and you could just embed that and use a different create function.
      7. In this case less is just more.

      What is bad about it though?

      1. Need at least 3 callbacks – hash key, hash node, compare key and node (could have compare node and node if you wanted to add ht_remove_node(ht, node)). And the node callbacks will have to be unique for that data type (although you could always extend and share that way).
      2. The node callbacks might need to do funny address arithmetic if the ht_node isn’t the first element or there are more than one.
      3. You must explicitly include details of the table it belongs to. e.g. one ht_node struct per table you might belong to. Of course you could create a basic tuple node-key-data table and use that too.
  • Some of the code is getting quite messy, and other bits are growing quite large. Most of the complication is actually in the database layer, but most of that code isn’t too bad. The texinfo parser is also generally pretty clean. The glue is all over the place at the moment, but I’m still feeling my way in the dark there a bit.
  • I had a little look at scheme. It seems quite a large language (compared to C, the language is much bigger, even if the syntax is much smaller). I think I will learn more of scheme when I have time, although because I think in terms of assembly language it will probably be tough going. One thing that convinced me that it is worth looking into is the way you can implement objects. I’ve implemented objects a few times in C, and although there’s nothing special about either (they both work more or less the same way once they’re going), in scheme it’s all done in a couple of dozen lines rather than a couple of hundred (admittedly the c objects had more features). It seems like it is simple and elegant, even if I don’t fully understand it yet (and hopefully that isn’t why it seems simple and elegant).
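
A couple of rough sketches to go with the points above. First, the Berkeley DB side of the index table mentioned earlier – purely illustrative: the names are made up, error handling is omitted, and the key is just the ‘type byte’, optional ‘namespace/’, index word and target node index concatenated. DB_DUPSORT is what lets one key carry a set of entry revision records; getting them newest-first would need a custom duplicate comparison (set_dup_compare) or an inverted revision number.

  #include <stdio.h>
  #include <string.h>
  #include <db.h>

  /* Build an index key: 'type byte' [ 'namespace/' ] 'index word' ',' 'target node index'.
     The exact layout is whatever the CMS settles on – this only shows the idea. */
  static int build_index_key(char *buf, size_t size, char type,
                             const char *ns, const char *word, unsigned int node)
  {
      return snprintf(buf, size, "%c%s%s%s,%u",
                      type, ns ? ns : "", ns ? "/" : "", word, node);
  }

  /* Open the index table as a btree with sorted duplicates. */
  static DB *open_index(const char *file)
  {
      DB *db;

      if (db_create(&db, NULL, 0) != 0)
          return NULL;
      db->set_flags(db, DB_DUPSORT);
      if (db->open(db, NULL, file, NULL, DB_BTREE, DB_CREATE, 0644) != 0) {
          db->close(db, 0);
          return NULL;
      }
      return db;
  }

The “find key > prefix” style lookups are then just a cursor positioned with DB_SET_RANGE – the same trick as the namespace lookup further down the page.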
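
And the hash table usage sketch promised above. The ht_* prototypes here are only my guesses at how the API would flesh out, and doc_node is a made-up structure, but it shows the embedding idea: the chain link lives inside your own structure and the key is whatever your structure already holds.

  #include <stdlib.h>
  #include <string.h>

  struct ht_node { struct ht_node *next; };
  struct ht_table;

  /* Guessed concrete signatures for the sketch above. */
  struct ht_table *ht_create(unsigned int (*hash_node)(const struct ht_node *),
                             unsigned int (*hash_key)(const void *),
                             int (*cmp)(const struct ht_node *, const void *));
  struct ht_node *ht_add(struct ht_table *ht, const void *key, struct ht_node *node);
  struct ht_node *ht_lookup(struct ht_table *ht, const void *key);

  struct doc_node {
      struct ht_node ht;   /* chain link, embedded (first member, so the casts below are trivial) */
      char *name;          /* the key lives in the structure itself */
      /* ... @node next/prev/up links, content, and so on ... */
  };

  static unsigned int doc_hash_key(const void *key)
  {
      const unsigned char *s = key;
      unsigned int h = 0;

      while (*s)
          h = h * 31 + *s++;
      return h;
  }

  static unsigned int doc_hash_node(const struct ht_node *n)
  {
      return doc_hash_key(((const struct doc_node *)n)->name);
  }

  static int doc_cmp(const struct ht_node *n, const void *key)
  {
      return strcmp(((const struct doc_node *)n)->name, key);
  }

  static struct doc_node *add_node(struct ht_table *table, const char *name)
  {
      struct doc_node *node = calloc(1, sizeof *node);

      node->name = strdup(name);
      /* ht_add hands back an existing node (and does nothing) if the name is
         already there – a real version would free the new node in that case. */
      return (struct doc_node *)ht_add(table, node->name, &node->ht);
  }

Setting it up is then just ht_create(doc_hash_node, doc_hash_key, doc_cmp) and you’re away.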

linux+gnu wins again, or does it?

So a workmate just had some baby snaps on his Sony Cybershot he wanted to send to his relatives. But all Windows XP (fully patched) wanted was a disc with the drivers on it, even when it was switched to use USB mass storage.

No problem for Fedora Core 6 (with no updates I might add). Booted it up, logged in, plugged in. There it was, copied the right photos to a thumb drive and problem fixed. Again.

On a completely unrelated note, what happened to totem to have it asking me for euros to play a dvd? Hmm, what happened to the GNU in GNOME?

midori, gnash

Partially from a comment from Jeff and mostly because Firefox 3 is giving me the shits, I started looking around for another browser. I gave kamikaze (or something – can’t remember exactly) a go, but that crashed as soon as I typed in a URL. I also tried some pure Java browser, but that’s got a long way to go yet – its rendering is all over the place (although interestingly it starts up faster and uses less memory than firefox). Midori on the other hand looks kinda interesting.

I’m surprised how fast it starts up, but of course it still isn’t quite ready and has a few usability issues. It could get there one day, and maybe if the code-base isn’t such a giant as firefox’s it won’t be hard to customise. It won’t run gmail yet, and the network seems to lock up too often. Pages take a bit longer to load and render less incrementally. But if a few other issues were worked out I’m sure I could live with that. I didn’t try epiphany, but maybe once they’ve done a webkit release I’ll look into it. I think ditching the renderer abstraction and going with a single implementation is a very wise idea – more projects could probably learn from this.

Since I’m on a 64-bit box there is no flash player for it (I’m not going to run nspluginwrapper), so I tried gnash. Hmm, it kinda works. Actually it works well enough to play most advertising just fine (although not at great quality), and none of the video sites. Which is really the opposite of what anyone might want. If blacklisting were in the main menu and it went fully to sleep in hidden tabs – and video worked – it’d be a real winner.

On an unrelated note – if you’re ever trying to compile ffmpeg with ac3 support and it doesn’t seem to want to turn it on, use:

 ./configure --enable-decoder=ac3 --enable-gpl

And not:

 ./configure --enable-decoder=ac3 --enable-gpl --enable-liba52

Since the latter doesn’t work – although everything I saw said it should. In the end I just used mencoder (and that took a while to work out – although the answer ended up being in the manual). I was trying to convert a blender-generated mjpeg avi into a dvd-compatible video spliced with silence for a dvd menu, and ffmpeg insisted I needed ac3, even though the source had no sound.

This was all a pain to find out. Another frustrating google search experience. You’re either hitting multiple copies of the same mail archive in different ad-supported wrappers, or you get some comment along the lines of ‘install another package’. Nobody seems to know how to compile things from source any more.

Everyone just seems to want their code free as in beer, but none of that messy source please.

I think this is a fundamental problem with the Open Source marketing ‘movement.’ By ignoring and de-valuing the Free nature of Free Software and focusing purely on the mechanism of development they have devalued the ‘Source’ part as well.

And speaking of source … along the way I thought I needed mjpegtools and had to resort to source to get it. Oh, but it’s in C++. Great – every time I’ve tried to build anything C++ I’ve run into problems – and not just with gcc 3.4 on Fedora 9, I’ve had problems many times before too. Another application that needs patching just to compile because the compiler and the language moved on. I patched one of the tools and got it to build, but lost interest when I realised I didn’t need it after all.

I’m still plugging away at my CMS in between all these other side quests. I’ve integrated my flex based texinfo parser now, and cleaned up a few little things. Internally I’m using length+pointer based strings rather than 0-terminated ones – both so it is binary-safe, and so I can reference string fragments without having to copy them around. I got sick of forming output strings by outputting tiny fragments in the right sequence (exactly like StringBuilder.Append), and implemented some custom printf format handlers. That way I can pass my special string * directly to obstack_printf and it stores the result, and even does html entity conversion or url encoding to boot. Oh, and sick of having to call ‘init’ functions in main, I’m using static constructors too: __attribute__((constructor)) on a prototype, and it will get called automagically before main is entered. Hurrah for gcc extensions.
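
If you’re curious, the combination looks roughly like this – a minimal, self-contained sketch rather than the real CMS code. It registers a made-up %W conversion for a length+pointer string through glibc’s register_printf_function (newer glibc deprecates it in favour of register_printf_specifier), and uses a constructor so nothing needs calling from main. GCC’s -Wformat will naturally grumble about the unknown specifier.

  #include <printf.h>
  #include <stdio.h>

  /* A length + pointer string, roughly like the one described above. */
  struct dstr {
      size_t len;
      const char *data;
  };

  /* Output a struct dstr * passed for the (made-up) %W conversion. */
  static int dstr_output(FILE *stream, const struct printf_info *info,
                         const void *const *args)
  {
      const struct dstr *s = *(const struct dstr *const *)args[0];

      return (int)fwrite(s->data, 1, s->len, stream);
  }

  static int dstr_arginfo(const struct printf_info *info, size_t n, int *argtypes)
  {
      if (n > 0)
          argtypes[0] = PA_POINTER;
      return 1;
  }

  /* Static constructor: runs before main() is entered, no explicit init call. */
  static void dstr_init(void) __attribute__((constructor));
  static void dstr_init(void)
  {
      register_printf_function('W', dstr_output, dstr_arginfo);
  }

  int main(void)
  {
      struct dstr s = { 5, "hello world" };

      printf("fragment: %W\n", &s);   /* prints "fragment: hello" */
      return 0;
  }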

I was also wondering how to implement some sort of namespace mechanism. texinfo has the solution there – it doesn’t really have namespaces apart from the current manual. So the namespace is like a single-level directory, as when you have the nodes stored in separate files by (processed) node name. Since that is all I really need, that is what I will use, and not so oddly enough it fits well with the texinfo cross reference commands (they specify the manual(/‘namespace’) as a separate part). I store the whole namespace/node name as the key of each ‘node’, and it makes looking up all or specific names in a given namespace very easy. Berkeley DB has a mechanism where it can compress keys if they share a common prefix, so I’m using that, and it should make the overhead of storing the full path in each key low, although you won’t be able to change the namespace after it is created (at least, not cheaply). (At least that is my understanding – the documentation isn’t very clear on when it helps a data-set, and google was no help.)
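
Listing a namespace then comes down to a cursor range scan over the key prefix – something along these lines (illustrative names, no real error handling, and using the older c_get/c_close cursor method names):

  #include <stdio.h>
  #include <string.h>
  #include <db.h>

  /* Print every node whose key starts with "namespace/": position a cursor
     at the prefix (DB_SET_RANGE finds the smallest key >= it) and walk
     forward until the prefix stops matching. */
  static void list_namespace(DB *db, const char *ns)
  {
      DBC *cursor;
      DBT key, data;
      char prefix[256];
      int plen, ret;

      plen = snprintf(prefix, sizeof prefix, "%s/", ns);
      if (db->cursor(db, NULL, &cursor, 0) != 0)
          return;

      memset(&key, 0, sizeof key);
      memset(&data, 0, sizeof data);
      key.data = prefix;
      key.size = plen;

      ret = cursor->c_get(cursor, &key, &data, DB_SET_RANGE);
      while (ret == 0 &&
             key.size >= (u_int32_t)plen &&
             memcmp(key.data, prefix, plen) == 0) {
          printf("node: %.*s\n", (int)key.size, (char *)key.data);
          ret = cursor->c_get(cursor, &key, &data, DB_NEXT);
      }
      cursor->c_close(cursor);
  }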

Hmm, indices and structure next I guess. Enough of the parts are there, I need to start tying them together.

Fedora Install, Again

Ok, I’m getting plenty of practice here. My new laptop arrived today, and yes, despite the aust web page saying otherwise, it has a thinklight, yay! Don’t know what I’m going to do with the (hefty) monitor stand it came with – but it was part of the deal.

I downloaded and installed FC9 x86_64. I never started windows.

  • The installer graphics were a bit messy, but it was functional. (borders cut off/artefacts left behind)
  • I was looking some stuff up using a terminal, and left it there – then the machine shut down (finished installation?), but it stayed turned on and it didn’t reboot.
  • Upon a power cycle I ended up with a ‘No Operating System Found’ message. Hmm, not good. Apparently there’s some Lenovo stuff on the first partition – or there was. But I don’t think the installer finished, and grub wasn’t installed, or it was installed in the wrong place.
  • After mucking about a lot to try to recover it I gave up and ran a re-install. This time I installed no ‘install options’ (to make it a bit faster), but made a note to check the grub install – onto the MBR. I lost any windows/recovery crap, but I didn’t want it (and it was already lost).
  • Success!
  • No wireless. I installed the madwifi driver for the wireless card – but haven’t tested it so far. I hope it works – using my roamabout card on this would be more of an inconvenience than on my T40.
  • I got the finger print reader going for login – thought I may as well try it.
  • Played with desktop effects, yay.
  • Installed some packages. Hmm, the icons don’t show up in the menus, e.g. emacs.
  • IMO the default mouse acceleration is way too fast – maybe it’s ok if you have ADHD and are on speed, but I can’t keep up. Setting it in prefs kinda works (but not at the login window), but then the gspot is too slow. Oh well.

I played with Ada last night.

firefox 3 sluggish, fedora install notes

I actually wrote about 3 posts about my fedora installation issues, but I decided against boring everyone with the details (OK, I will bore a little). The last straw came on Friday night when I discovered an application I’d long used (though only occasionally) was ripped out of Ubuntu – I don’t really care if it is in Fedora, but it was the end of Ubuntu for me.

First – Firefox 3. I’m sure it isn’t just me, firefox 3 actually feels rather sluggish compared to 2 on my machine. Yes the RSS is lower, but switching tabs just has a soft laggy feel to it. I’m not sure if it is the videos I was looking at, but flash also seems terrible – it was always a bit of a cpu hog, but now the screen isn’t even close to keeping up. I can live without flash but I dunno, this isn’t what was advertised. I noticed fedora was using nspluginwrapper (not sure why, this is only a 32 bit box), and that was sucking some greenhouse gases, so I got rid of that, and it improved things ever so slightly, but it is still rather laggy. It could be X I guess – I will have to check that, but blender is very snappy (for this machine), and everything else runs fine. I’ve seen some forum threads where people suggest your machine is just too slow – umm, sure, maybe, but if firefox 3 was meant to be so much faster than firefox 2, how come it isn’t? And what have they done with the ‘invalid certificate’ window – made a right pig’s breakfast of it, that’s what. I’ll have to try an older firefox to see how it compares.

Other than that – the machine is faster overall, although that is mostly because I am now running much leaner. No session manager (never wanted/used it), no crapplets (I do miss some), no file manager (no loss at all for me), blackbox window manager (never tried it before – it works, although I patched it to re-order the window decorations not to mirror windows 95). I can actually start openoffice (not that I ever need it) and firefox and emacs and a few shells and it still isn’t swapping! And updatedb running in the background doesn’t bring the machine to its knees any more – actually, now I think about it, it was something I rarely noticed until I started using ubuntu. My usb drive even mounts – although it takes a long time and you get a lot of errors in syslog.

It literally feels like a different machine now. It even seems to boot faster.

I had a few install issues though. Setting up xdm was a bit of a hassle – the default configuration is broken: at login you get no access control, a clock, and a twm with no way to run anything. I had to add “DisplayManager.authDir: /var/xdm” to xdm-config to fix the access control issue. I couldn’t work out the selinux stuff (or is it pam?) to get it to run an xterm on startup from the default fallback Xsession (no permission to open a pty), or execute my custom .xsession (permission denied), so I just disabled it and things got going (I am not a sysadmin any more and don’t have the patience to read up on all this stuff). The thinkpad suspend buttons do nothing – but then I haven’t used them for so long I don’t know if they worked in ubuntu either. I finally found “acpitool -s” will suspend and added a menu to blackbox for that (and suspending is really fast and so far reliable). I’m using autofs to mount media automatically – file manager windows (or worse – e.g. a dvd player that cannot play dvds) popping up while I am typing are a thing of the past (at least on linux – windows still does that even though I told it not to – why can’t they just silently add an icon to your desktop?), although autofs isn’t quite what I want either; it will do for now and I may well just get used to it. Not running gnome-settings-daemon means all your gtk apps take on the crappy default theme (ugh, curved scrollbars?), so I rediscovered .gtkrc-2.0 and hardcoded something reasonable and some smaller/nicer fonts (industrial & vera sans 9). I tried a couple of window managers till I settled on blackbox. It manages windows and virtual desktops with configurable keys and that is all anyone needs; about all I didn’t like was the location of the close button, and I fixed that with a patch – Free Software rulez etc etc.
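
For anyone playing along at home, the ~/.gtkrc-2.0 settings involved are only a couple of lines – the theme and font names below are just examples matching the ones mentioned above:

  # ~/.gtkrc-2.0 – read by gtk2 apps when gnome-settings-daemon isn't running
  gtk-theme-name = "Industrial"
  gtk-font-name = "Bitstream Vera Sans 9"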

As to the install itself – I tried manually selecting packages this time. Big mistake. About 70% of the way through it asked for ‘Disk 1’. Hmm, but I only had the 1 DVD and wasn’t about to burn every CD as well. Oh well, reset and start again – only an hour or two of meticulous package selection and installation down the drain. So I restarted and took the defaults – only changing / not to use ext3. Yum isn’t bad on this box. It isn’t exactly snappy but it’s fast enough. So I just fiddled with lots of packages afterwards and did a full update (wow, a lot of updates for such a young release). And man pages and info files just work. About time. It didn’t detect my internal wireless (ipw2100 based) but that actually isn’t a bad thing since it never seems to work (I will probably try it though, you never know and all that – I’d prefer not to have a card sticking out the side). It isn’t starting eth0 (plug-in roamabout card) at boot up, even though I told it to during the install.

So it wasn’t terribly easy (I cut out a lot above too) but I seem to have what I’m after. Yes, I’ve ended up with a 10-year-old ‘linux desktop’ with a faster kernel and better emacs, but this machine is a tool that I use, not one which is using me. I’m sick of clicking away notifications and popups I didn’t ask for in the first place for things the computer should be deciding for itself.

And I think it’s just great you can still do this on a GNU system – install the latest and greatest, but without the unnecessary bells and whistles too.

flexibility

One more followup to the versioned data ideas. I thought of how I could version some other information like document structure and index entries and the like. It worked quite easily with the last design I came up with – I can just create another table for each type of data, and use the ‘entryrevid’ as the key. It means I have to write out a full set of associated data each time I write a version of the file, but the redundancy means I can look it up faster, so it is worth it, and it just simplifies everything.
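
In Berkeley DB terms that is nothing more exotic than one database per data type with the revision id as the key – roughly like this (names made up, error handling omitted):

  #include <string.h>
  #include <db.h>

  /* Write the full associated data (structure, index entries, ...) for one
     revision into that data type's table, keyed by the entry revision id. */
  static int put_revision_data(DB *db, u_int32_t entryrevid,
                               const void *buf, u_int32_t len)
  {
      DBT key, data;

      memset(&key, 0, sizeof key);
      memset(&data, 0, sizeof data);
      key.data = &entryrevid;
      key.size = sizeof entryrevid;
      data.data = (void *)buf;
      data.size = len;

      return db->put(db, NULL, &key, &data, 0);
  }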

So after that, I went back to have a look at the content markup. I felt relatively happy with my original idea of using texinfo markup, so I’ve stuck with that.

I rewrote my texinfo parser to use flex. I haven’t used flex much – and not for a long time – so it took me a while to work out how to do some of the more ‘advanced’ things while sticking with getting flex to do as much of the work as I could. Although I will need a dynamic symbol table if I am to implement macros, for now all symbols are hard-coded directly into the flex file, which means less code to write and more speed as an added bonus. I’m not sure I’ll bother with macros actually – I haven’t seen any documents which use them, and there are implementation difficulties in a sparsely stored document. The texinfo grammar isn’t really documented too well either, and it is a little inconsistent in places, so I’ve had to run test files through texinfo to find out what I should do. I had been using texi2pdf, but that is a lot less strict than makeinfo – and makeinfo’s strictness is handy, since it simplifies my work (I don’t have to worry about random whitespace and the like). Although I don’t really need to be completely compatible with makeinfo, I should probably be for what I implement.

I’m looking at the parser to do two things – convert the format for presentation (i.e. convert to css-friendly html), and extract structure, footnote, and index information. Both will be used for presentation in various parts, and the latter will be used for indexing and linking nodes. I have enough working to do both right now, but I might keep tweaking the code a bit more before starting to integrate it, while tweaking is still interesting anyway. I can’t use makeinfo because it isn’t available on the target system, and I don’t think it works with document fragments anyway, and besides, I need structure information and it is easier parsing texinfo than the html output.

I’m still not sure how I’m going to handle linking multiple nodes together into a sequence. Should the @node line be entered explicitly where it can be easily seen and edited, or should it be managed internally by the application, with the user interface providing ways of linking the pages up? Hmm, I guess the former should do for now.

It’s sort of interesting to see how I’ve been progressing while working on unfamiliar ground. I find the code goes through cycles of functionality and stability, but also of ‘coherence’ (code quality?). They are not distinct phases as such and there is always some overlap, but starting with an empty slate, the phases are something like:

  1. Discovery — You don’t know much or anything about the tool/idea you’re working on. You write tiny test files and cross reference with the manual or specification. You begin to grok the basic underlying theory of operation for the tool/idea.

  2. Expansion — You fill out the implementation, adding functionality very quickly. You rapidly create a tool which does 90% of the job for some fraction of the functionality. You don’t take too much care over the implementation details as you’re just getting something running. So the code gets a bit messy, but it isn’t too messy, and the messiness is isolated and controlled.

  3. Consolidation — You start to notice patterns in your code. Loops or functionality which is repeated more than once. You consolidate this while thinking about how the consolidation might help in the future, but not too much since you don’t know what you’re going to do yet. The code structure slides into a more coherent pattern, the code quality is being increased.

  4. Growth — Now you find additional features you were putting off/didn’t look into earlier can be implemented much more easily now you’ve consolidated the code. You grow the functionality in perhaps a significant step whilst keeping the code fairly clean.

  5. Breakage — You break a few things. But not too many, and they’re fairly easy to fix. Mostly just syntax errors. Sometimes you over-consolidate and break functionality and have to back out a little bit.

  6. Perfection — You clean up all the little things and get it all working nicely again. You start to notice shortcomings or bugs in the logic now the bugs in the code are cleaned up. ‘Perfection’ here is entirely relative, and it is as perfect as you care to make it at the time (as you lose interest, it is likely to wane in strength).

  7. Expansion … the cycle repeats again.

Actually what often happens is after the first 1 or 2 iterations, you can throw most of the code away and start again. You quickly discover your mistakes when things start to bog down too much or they get too messy. But since you have gained a deeper understanding about the problem you proceed much quicker next time and avoid making similar architectural mistakes along the way. The proverbial ‘throw it all away and rewrite’. But doing it at such an early stage the cost is minimal and it was a learning exercise/prototype anyway. Depending on the problem, you’re only talking about a few hours or a day or two of effort at this point. And in a larger problem there are probably smaller problems within it that this applies to.

I think plenty of ‘RAD’ projects only really get to stage 2, and if they last that long they jump to a rewrite stage, and repeat. These projects tend to include lots of functionality very quickly — and often it does the job — but somewhere down the line they’re either headed for death by spaghetti or a ‘total rewrite’. Of course ‘doing the job’ is definitely good enough in many cases, but it isn’t the same as having a quality engineered product, and comparing the ‘productivity’ of RAD projects to more traditionally engineered ones is meaningless.

Step 3 is really the key to keeping the code quality up. If you skip that step you will save some time in the short term, but will always pay for it later. If you do it often enough that at each stage the problem is contained, the cost is quite minimal anyway. It is particularly important with object oriented code (where it is called ‘refactoring’ apparently), as the gains and drains are higher from quality code re-use. If you just keep cutting and pasting objects, adding random methods and so on, you will end up with bloated and hard to maintain code very quickly. And it can be harder to distil common functionality into re-usable, shareable sets. With a procedural language you’re only considering it at the level of a single procedure; with an OO language you have a whole cohesive set of functionality to worry about.

What’s great about software being written in your spare time is you can ‘throw away and rewrite’ as much as you want. So long as there is some value in doing so, and that is quite subjective – so the value only has to have meaning to you. A list (in rough desirability order) might include:

  • Applying new knowledge – ‘I can do it better!’

    You’ve learnt enough from the previous attempt, and want to apply that knowledge to improve the functionality or quality of the code. Improving the functionality and code quality at the same time can be a very satisfying intellectual endeavour.

  • Bored, dead-ended, trying something else – ‘I’m bored with this.’

    Maybe you are going down a dead-end path. This is related to applying new knowledge, but instead of having learnt something, you find you’re not learning anything and perhaps approaching it from another angle will be more rewarding or interesting.

  • Intellectual curiosity – ‘Time for a wank!’

    For no particular reason you just feel like doing it in a completely different way. A different language, a different platform, a different meta-programming tool. It’s a rainy day, you’ve got nothing better to do, so you go off on a tangent and blat out some code. Maybe it becomes the new trunk, maybe it gets thrown away, maybe you learn something from it, or maybe you don’t. It might only be one tiny bit of functionality you’re investigating, or the whole architecture.

  • Improving quality to reduce future maintenance – ‘Time to clean it up.’

    If this is just for the sake of it, and not for any of the other reasons, it will probably be very low on your list. This is maintenance work, and although it may save you pain down the track the immediate payoff is low. Perhaps you are meticulously pedantic and get a kick out of the momentary perfection of the rewrite, or perhaps it’s just that time of the month, and you feel you should.

This is pretty well how I write all code anyway, although at work I’m not going to try different ways for the hell of it so often, and once you go into maintenance mode things are a bit different. Still, the same ideas work at various granularities throughout any code base at almost any time. Software development is generally an iterative process — if you already knew what to write before you started, you (or someone) would have had to go through the same process anyway, but without the support of a compiler and live testing. Very rarely does the final result match what you started with, and you have to keep an eye out for ways to improve it.