things and stuff

Things:

  • Sigh, along with seemingly every other broadcaster in the world, the ABC has gone the Silverlight route. All so they can over-charge for rentals for crappy old tv shows with DRM attached ($3/week for 1 show?). It’s amazing what influence money can buy (it only seems to be called bribery in third world countries). The ABC is funded by the government.
  • Of course there is a complete microsoft dick-sucking love-fest going on in general in the aust govt and businesses, so it should come as no surprise really. And if it isn’t them, it’s Oracle. It’s amazing what a corporate box at the rugby can buy you.
  • Firefox 3 appears to be getting (even) slower – this is on windoze too. I wonder if it’s an installation-age related thing? There seem to be plenty of threads on the mozilla support forums about it, and recently growing. Although they’re mostly ignored or censored.
  • The aweless bar isn’t getting any easier to use either. I’d rather it at least preferentially showed what I’ve actually typed in, and wasn’t so bulky (on my laptop I twiddled with the oh-so user-friendly ‘userChrome.js’ to at least stop it taking up the whole screen).
  • Kevin Rudd (Australia’s current Prime Minister) is getting all hot and bothered about naked children in art, and trying to impose his religious-right moralities on the nation. If he is so disgusted by childhood nudity I wonder how on earth he ever bathed his own children. It is disturbing that he (and most of the country I might add) can’t tell the difference between art and porn, and turns pictures which simply represent the innocence of youth into salacious filth. Not that this sort of offensive attitude was unexpected of him; it is one reason I could never vote for him. Why not look at those poor-starving-african-sponsor-a-child adverts? There’s some good honest child exploitation if ever there was some.
  • Some vegetarian organisation is trying to ‘cash in’ on the ‘world food crisis’ and global warming and has cynically started running an ad campaign at the moment saying we should all stop eating meat so we don’t kill the planet. Not going to work. If we all ‘went veg’ the planet would just be able to support more people (apparently), and so we would just take longer to reach the limit of sustainability. History (and nature) shows us that every resource-constrained population with no controlling mechanism (e.g. predation, disease) will grow to exceed its resources and must move on to greener pastures and/or collapse. It might take a while to do it for the entire planet but it was only ever a matter of time once disease was conquered. It isn’t what we’re putting in our mouths, it is how many mouths there are in the first place. Watch what happens to The Philippines in the coming years, as their Catholic heritage comes to bear fruit (quite literally). Of course nature may beat us to it – Tuberculosis and other diseases are making a come-back, and the shifting climate will only make things worse.
  • I wonder when petrol will make $2/litre? Might take some cars off the road and make cycling a bit safer. Certainly the way roads are being built around here will not.
  • I hate winter. I’m sick of the cold. You Canadians may scoff at 10-15 degrees being ‘cold’, but with low building standards, insufficient clothing, and inefficient heating systems you’d be surprised how unpleasant it is sitting around in that.

Stuff:

  • Our project is in temporary(?) limbo, so we’ve been mothballing the code and data. Utterly, utterly boring, but I should get to have some time off soon, which I seem to need more often these days.
  • I’ve been poking away at my CMS – slowly. Not sleeping very well, or staying up late to watch The Tour so my concentration isn’t fantastic. I figured out how to do indices – it took longer than it should have but I think I have a simple and scalable solution worked out.
    1. Each time a node is updated I parse and extract index (and chapter structure) information.
    2. For each index entry:
    3.   Form a key of the form ‘type byte’[.‘namespace/’].‘index word’,‘target node index’
    4.   Write the entry revision record as the ‘data’, sorted in reverse (i.e. newest first).

    So I can easily look up all records by type: “find key > ‘type byte’”, or all records in a given namespace: “find key > ‘type byte’.‘namespace/’”, etc. With the way Berkeley DB stores duplicate records I can then skip to each separate node easily as well (and hopefully quickly too) once I’ve found the latest/live revision, which will normally be the first one. One thing I haven’t worked out yet is directing entries with a finer granularity than per node (page).
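
    A rough sketch of the key scheme above, under some loud assumptions: the ‘.’ in the key description is read as plain concatenation, the comma is a literal separator, and the target node index is written as decimal text. The names index_key and key_has_prefix are made up for illustration, and nothing here touches Berkeley DB itself. There the “find key >” step would presumably be a cursor get with DB_SET_RANGE, and the prefix test below is what decides when the scan has left the range.

```c
#include <stdio.h>
#include <string.h>

/* Build "<type><namespace>/<word>,<node>" into buf.  The namespace part is
 * left out when ns is NULL.  Returns the key length, or -1 if it does not
 * fit.  The layout is an assumption, not the actual on-disk format. */
static int index_key(char *buf, size_t size, char type,
                     const char *ns, const char *word, unsigned long node)
{
    int n = ns ? snprintf(buf, size, "%c%s/%s,%lu", type, ns, word, node)
               : snprintf(buf, size, "%c%s,%lu", type, word, node);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}

/* True while a key still starts with the search prefix, i.e. while a
 * sorted range scan is still inside the requested type or namespace. */
static int key_has_prefix(const char *key, const char *prefix)
{
    return strncmp(key, prefix, strlen(prefix)) == 0;
}
```

    Because the type byte and namespace come first, every key of one type (or one namespace within a type) sorts together, which is what makes the range lookups cheap.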

  • I’ve got some very preliminary code working to order nodes to generate a table of contents. I’m already writing the meta-data records recording structural elements that are required to do the next step efficiently. I’m trying to use the @node node,next,prev,up links to order nodes (the hard part), but of course the tricky bit is doing sane things when the details aren’t fully specified or contradict themselves. The @node thing seems a little clumsy too, and I really have to nail down the namespace semantics fully.

    This sort of processing is one area where C (and any related imperative language – Java, C-hash, etc) seems to fall down a bit, scanning, searching, reordering lists and trees. It isn’t difficult to do, but it seems a bit clumsy and messy.
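
    To give a feel for the list and tree shuffling involved, here is one way the ordering could go. It is a sketch, not the code from the CMS: struct node, struct toc, toc_build and the fixed MAX_NODES limit are all hypothetical, and a linear search stands in for a hash table lookup. Roots are nodes whose ‘up’ doesn’t name a known node, children are walked along their next chains, and a ‘seen’ flag is the only defence against links that contradict themselves.

```c
#include <stddef.h>
#include <string.h>

#define MAX_NODES 64

/* Link fields hold node names, or NULL when unspecified. */
struct node { const char *name, *next, *prev, *up; };
struct toc_entry { const struct node *node; int depth; };
struct toc {
    const struct node *nodes;
    size_t count, used;
    struct toc_entry out[MAX_NODES];
    unsigned char seen[MAX_NODES];
};

static const struct node *find_node(const struct node *nodes, size_t count,
                                    const char *name)
{
    for (size_t i = 0; name && i < count; i++)
        if (strcmp(nodes[i].name, name) == 0)
            return &nodes[i];
    return NULL;
}

static int is_child(const struct node *n, const struct node *parent)
{
    return n->up && strcmp(n->up, parent->name) == 0;
}

static void toc_walk(struct toc *t, const struct node *n, int depth)
{
    if (t->seen[n - t->nodes])
        return;                     /* contradictory links: visit once only */
    t->seen[n - t->nodes] = 1;
    t->out[t->used].node = n;
    t->out[t->used++].depth = depth;

    /* Start at each child whose prev is not a sibling, then follow next. */
    for (size_t c = 0; c < t->count; c++) {
        const struct node *child = &t->nodes[c];
        const struct node *prev = find_node(t->nodes, t->count, child->prev);
        if (!is_child(child, n) || (prev && is_child(prev, n)))
            continue;
        while (child && is_child(child, n) && !t->seen[child - t->nodes]) {
            toc_walk(t, child, depth + 1);
            child = find_node(t->nodes, t->count, child->next);
        }
    }
    /* Children the chains never reached fall back to source order. */
    for (size_t c = 0; c < t->count; c++)
        if (is_child(&t->nodes[c], n))
            toc_walk(t, &t->nodes[c], depth + 1);
}

/* Fills t->out in table-of-contents order; returns the entry count or -1. */
static int toc_build(struct toc *t, const struct node *nodes, size_t count)
{
    if (count > MAX_NODES)
        return -1;
    memset(t, 0, sizeof(*t));
    t->nodes = nodes;
    t->count = count;
    for (size_t i = 0; i < count; i++)      /* roots: up names no known node */
        if (!find_node(nodes, count, nodes[i].up))
            toc_walk(t, &nodes[i], 0);
    for (size_t i = 0; i < count; i++)      /* leftovers, e.g. up-link cycles */
        toc_walk(t, &nodes[i], 0);
    return (int)t->used;
}
```

    Even this small version needs three passes over the node array per level, which is roughly the clumsiness being complained about.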

  • I needed a hashtable for linking up the nodes, and the one I wrote earlier was on a machine in another room and I was too lazy to get it – so I wrote one from scratch. OK, so I need a better hash function and have yet to add a re-hash loop (actually for this use I probably don’t), but hash tables are so awesomely simple I didn’t mind writing another one, and it only took a few minutes. I improved and simplified the node iterator anyway (iirc).

    I’m starting to settle on the api I most like for hash tables too (in c):

      typedef void *key;  /* key is anything */

      /* Node you embed in your own structure. Also note no ‘key’: your own
         structure holds the key, perhaps implicitly. Chain link only. It needn’t
         be embedded as the first object either, and you can embed more for
         different tables. */
      struct ht_node { struct ht_node *next; };

      struct ht_iterator { int index; struct ht_node *next; };  /* iterator, set to 0 on first call */
      struct ht_table;  /* anonymous */

      ht = ht_create(hash_node_func, hash_key_func, compare_node_and_key_func);
      ht_destroy(ht);  /* frees only hash meta-data */

      /* Returns the existing node (and does nothing) if the key exists, so you
         don’t have to perform a separate lookup first. */
      node = ht_add(ht, key, node);

      node = ht_lookup(ht, key);

      /* Remove by key. Remove by address needs to look up the key anyway, so
         why bother. Returns the node removed if it was there. */
      node = ht_remove(ht, key);

      /* List all nodes in unknown order. The node returned can be removed if
         you want. Returns NULL and resets when done. */
      node = ht_iterate(ht, &iterator);
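
      For the curious, one way this API could be filled in. This is a sketch written from the description above, not the code from the post: the callback typedef names, the fixed 64-bucket array, the struct page example and the djb2-style string hash are all assumptions. As in the post there is no re-hash loop, so the hash-node callback is stored but never called.

```c
#include <stdlib.h>
#include <string.h>

typedef void *key;
struct ht_node { struct ht_node *next; };
struct ht_iterator { int index; struct ht_node *next; };

/* Callback types; these names are made up for the sketch. */
typedef unsigned (*ht_hash_node_fn)(const struct ht_node *n);
typedef unsigned (*ht_hash_key_fn)(key k);
typedef int (*ht_match_fn)(const struct ht_node *n, key k);

struct ht_table {
    ht_hash_node_fn hash_node;  /* only needed by a re-hash loop, omitted here */
    ht_hash_key_fn hash_key;
    ht_match_fn match;
    size_t size;
    struct ht_node **buckets;
};

struct ht_table *ht_create(ht_hash_node_fn hn, ht_hash_key_fn hk, ht_match_fn m)
{
    struct ht_table *ht = malloc(sizeof(*ht));
    if (!ht) return NULL;
    ht->size = 64;  /* fixed bucket count: an assumption, the table never grows */
    ht->buckets = calloc(ht->size, sizeof(*ht->buckets));
    if (!ht->buckets) { free(ht); return NULL; }
    ht->hash_node = hn; ht->hash_key = hk; ht->match = m;
    return ht;
}

/* Frees only the table and its bucket array; the nodes belong to the caller. */
void ht_destroy(struct ht_table *ht)
{
    if (ht) { free(ht->buckets); free(ht); }
}

/* Returns the link pointing at the matching node, or the chain's final NULL link. */
static struct ht_node **ht_find(struct ht_table *ht, key k)
{
    struct ht_node **p = &ht->buckets[ht->hash_key(k) % ht->size];
    while (*p && !ht->match(*p, k))
        p = &(*p)->next;
    return p;
}

struct ht_node *ht_lookup(struct ht_table *ht, key k) { return *ht_find(ht, k); }

/* Returns the existing node untouched if the key is already present. */
struct ht_node *ht_add(struct ht_table *ht, key k, struct ht_node *n)
{
    struct ht_node **p = ht_find(ht, k);
    if (*p) return *p;
    n->next = NULL;
    *p = n;
    return n;
}

struct ht_node *ht_remove(struct ht_table *ht, key k)
{
    struct ht_node **p = ht_find(ht, k);
    struct ht_node *n = *p;
    if (n) *p = n->next;
    return n;
}

/* The successor is saved before returning, so the caller may remove the node. */
struct ht_node *ht_iterate(struct ht_table *ht, struct ht_iterator *it)
{
    struct ht_node *n = it->next;
    while (!n && (size_t)it->index < ht->size)
        n = ht->buckets[it->index++];
    it->next = n ? n->next : NULL;
    if (!n) it->index = 0;  /* reset so the iterator can be reused */
    return n;
}

/* Hypothetical user structure: the node is embedded first, the name is the key. */
struct page { struct ht_node link; const char *name; };

static unsigned hash_str(const char *s)
{
    unsigned h = 5381;
    while (*s) h = h * 33 + (unsigned char)*s++;
    return h;
}
static unsigned page_hash_node(const struct ht_node *n) { return hash_str(((const struct page *)n)->name); }
static unsigned page_hash_key(key k) { return hash_str(k); }
static int page_match(const struct ht_node *n, key k) { return strcmp(((const struct page *)n)->name, k) == 0; }
```

      Walking the chain through a pointer-to-link (ht_find) lets lookup, add and remove share one loop, which is most of why the whole thing stays short.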

      Why do I like this over the 30-odd-function monster in glib2, or even C-hash’s version?

      1. Memory management is simpler. The hash table only manages its base structure and the bucket array. You manage everything else. And you only manage one object, not two (key, data). No need for a special (and glib-unique?) ‘steal’ function either.
      2. Memory use is lower. No need for redundant ‘key’ OR ‘data’ pointers. Often the key is in the structure anyway so why duplicate it. You usually only store data in one table, so why not make that table’s data structure part of yours. And if you really want a separate tuple-object you can do that too.
      3. Should have better locality of reference for certain data. But with modern CPUs it’s hard to tell what will happen.
      4. The insert-existing key semantics are much simpler and clearer. With glib it isn’t obvious if the key or the data (or both) are replaced (without reading the documentation), leading to the rather unique ‘replace’ function. And c-hash throws an exception which is just a pita (it does it for lookups and removes too which is just silly, so you need an additional totally redundant ‘has key’ look up before looking it up again).
      5. Iterator mutability rules are clear and simple. In c-hash you often have to jump loops to do simple things which should be simple (e.g. remove some matching data), glib adds extra redundant api.
      6. Can be ‘sub-classed’ and extended. e.g. the base object could implement a ‘string’ hash table, and you could just embed that and use a different create function.
      7. In this case less is just more.

      What is bad about it though?

      1. Need at least 3 callbacks – hash key, hash node, compare key and node (could have compare node and node if you wanted to add ht_remove_node(ht, node)). And the node callbacks will have to be unique for that data type (although you could always extend and share that way).
      2. The node callbacks might need to do funny address arithmetic if the ht_node isn’t the first element or there is more than one.
      3. You must explicitly include details of the table it belongs to. e.g. one ht_node struct per table you might belong to. Of course you could create a basic tuple node-key-data table and use that too.
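
      The ‘funny address arithmetic’ is usually the offsetof trick below. This is a sketch under assumptions: struct user and the two helper functions are invented for illustration, and container_of is a commonly used macro name rather than anything from the code described above. It shows one structure carrying a separate ht_node for each table it belongs to.

```c
#include <stddef.h>

struct ht_node { struct ht_node *next; };

/* Recover the enclosing structure from a pointer to one of its members. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical record that lives in two tables at once, so it embeds
 * one chain link per table and neither link is the first member. */
struct user {
    int id;
    struct ht_node by_id;
    const char *name;
    struct ht_node by_name;
};

/* Each table's node callbacks would start by undoing its own offset. */
static struct user *user_from_id_node(struct ht_node *n)
{
    return container_of(n, struct user, by_id);
}

static struct user *user_from_name_node(struct ht_node *n)
{
    return container_of(n, struct user, by_name);
}
```

      This is also why the node callbacks end up specific to one data type and one table: each has to know which member it was handed.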
  • Some of the code is getting quite messy, and other bits are growing quite large. Most of the complication is actually in the database layer, but most of that code isn’t too bad. The texinfo parser is also generally pretty clean. The glue is all over the place at the moment, but i’m still feeling my way in the dark there a bit.
  • I had a little look at scheme. It seems quite a large language (compared to C, the language is much bigger, even if the syntax is much smaller). I think I will learn more of scheme when I have time, although because I think in terms of assembly language it will probably be tough going. One thing that convinced me that it is worth looking into is the way you can implement objects. I’ve implemented objects a few times in C, and although there’s nothing special about either (they both work more or less the same way once they’re going), in scheme it’s all done in a couple of dozen lines rather than a couple of hundred (admittedly the c objects had more features). It seems like it is simple and elegant, even if I don’t fully understand it yet (and hopefully that isn’t why it seems simple and elegant).

8 thoughts on “things and stuff”

  1. Hi,

    would you mind publishing the code of your hash table? It looks interesting …

  2. firefox has a default history time set to 180 days now. so it will build up quite a lot of data for the first 180 days, and then probably level out.

    you could set it back to 30 days, or whatever it was in firefox 2

  3. You seem to be pretty casual about letting people starve to death (and you call those vegetarian organizations cynical). Growth of the world population is a big problem, but not a hopeless one. Growth rate is declining, and there are means to bring it down even more, for example reliable livelihoods and education. If you had done your homework, you would have noticed that there are many less developed countries in the world where people aren’t starving, but where birth rate is already relatively well under control (not to mention the developed countries).

    It’s a fact that meat production is facing its limits. Pastures are widely overgrazed and lost to erosion. Fields used for growing livestock feed are taken away from producing food, and meat production is a very ineffective way of producing food.

    One simple theory just isn’t enough to cover a problem involving entire populations and their diverse actions.

    Good points about Silverlight, bikes and others though :-)

  4. i don’t see the issue about silverlight, it is a lot more open than flash and moonlight is very good (though it doesn’t work with most sites as they only check for the official version of silverlight, a userscript could probably fix that)

  5. Man, anyone would think you’re a Linux fanboy ;)

    Are you bitter about the Silverlight decision because it’s MS technology? Or do you feel that Flash is a better option?

  6. oj, they all suck yeah. silverlight is particularly innocuous at this point in time – because it coming to all broadcasters all of a sudden is clearly a marketing effort. it isn’t widely deployed yet so it is an attempt to make it widely deployed, and as a tax payer i have no reason to be happy to be directly funding it. and it’s there (presumably) for its drm abilities. Why would any sane person like drm?

    rob, probably at some point I will, along with some other adt’s I have. The details are quite unremarkable, but I like simple things being simple.

    ari, yeah i’m cynical and pessimistic, but i’m just crapping on in a blog. Time will tell I suppose.
