More musings about Smalltalk

Interesting discussion on pgo about embeddable scripting engines. Another option to experiment with is Syx Smalltalk, created by fellow Smalltalker Luca Bruno. It has full support for embedding and is quite portable as well.

Smalltalk can do closures as well. In Smalltalk, closures are known as blocks and are used for just about everything, from enumerating through collections to exception handling. Even logic control flow (if, while, etc) are implemented using blocks. For example, loops are implemented by sending the #whileTrue message (amongst others) to a block. The block is then executed repeatedly until it returns false. The beauty of this approach is that one can build custom user-defined control structures.

The only niggle is that classic Smalltalk-80 closures are not totally re-entrant, and piggy-back off their enclosing method activation. I intend to fix this in Panda Smalltalk, and I believe it has been fixed in other VMs as well.

Meta-Object Protocols

Thought a lot about Smalltalk’s meta-system recently, particularly in how its meta-level is explicitly separated from the base level. This is realized in the form of class objects which describe all objects in the system. Even classes themselves are instances of classes. This results in an elaborate unbounded metaclass hierarchy, of infinite regress. The Metaclass class is an instance of an instance of itself!

On the other hand, Self, Io and Javascript did away with the notion of classes. In these prototype-based languages, the only way to know an object is to send messages to it and see how it responds. One creates a new object by cloning another. Messages to a clone can optionally be delegated to its parent (the object from which it was cloned).

This is a monistic approach compared to Smalltalk’s dualistic form. Naturally, it makes the object system a bit more flexible and lightweight. No need to create a class just to represent a singleton instance. By manipulating the protos slot, one can even have multiple inheritance (as supported in Self and Io).

Eliminating classes provides some other useful benefits. Classes represent static (global) state, which raises thorny issues related to concurrent access and security.

Not everything about prototypes is rosy though. To compensate for the lack of classes (which are useful for describing shared behavior), Self and Io came up with traits objects. These are normal objects like any other, yet act as a repository for common state and behavior. Cloning these objects is akin to creating an instance of a class. The problem with trait objects is that they can respond to messages which are not meant for them (usually resulting in a VM exception).

Its not clear-cut whether prototypes are better than classes, or vice versa. At least one can emulate prototypes in Smalltalk with a bit of meta-object hackery.

Opinions on Concurrency

I am quite excited about Io‘s actor-based concurrency model. I think it would be great fit for Smalltalk, and I am keen to implement it in Panda Smalltalk. Io’s design seems much more conceptually sound than Smalltalk-80’s concurrency model (simple user-level threads). It really does fit in the notion of objects sending messages to other objects. With an actor scheme, objects will be able to send asynchronous messages, which return immediately and never block. The result of an async message is always a  future object. This kind of object is essentially a proxy for the real value of a method when it is invoked some time in the future by the scheduler. However, sending a message to a future will result in blocking until the real value becomes available.

Io’s actors are built upon lightweight threads, which is an aspect I also like. I would rather not make Panda’s VM fully multi-threaded, as it would add way too much complexity and non-determinism for my liking. I can get around the problems of blocking IO by using select() or some other IO multiplexer.

Multi-VMs seem to be all the rage now, having been implemented in Io and Rubinius. For VM laymen, multi-VMs are virtual machines which support multiple lightweight processes, each running in their own native thread. Each lightweight process has its own heap, garbage collector, and interpreter instance. They are thus distinct from lightweight threads.

They were experimented with in Java, as a means to reduce startup time by not having to spend time JIT-compiling that obscenely large class library multiple times. I believe Emacs has something similar. Erlang has them as well, but rather as a means of implementing scalable concurrency (think thousands of independent processes!).

I for one am not convinced of their benefit (at least in the context of Smalltalk). They have all the disadvantages of shared-nothing processes with few of the advantages. A crash in one of the lightweight processes will still bring down the entire VM. I think they make sense in Erlang, where a lot of static state is eliminated, such that forking a lightweight process is quite a cheap operation. In Smalltalk, all classes and other static state would have to be copied to the new process. This is a bit too expensive for my liking.

Hello planet-web

It seems I am now on planet gnome. Thanks Jeff! I’d better introduce myself. I am a 4th year CS student living in Cape Town, South Africa. I have been hacking on Glade 3 since 2006, and have committed bug fixes to various GNOME modules.

Since November, I have been working on a new project of mine, Panda, a Smalltalk system. Its a personal mastery project, aimed at improving my programming skills whilst producing a complex system. I didn’t think I was challenging myself with glade3, and realised that I no longer wanted to do application development using C and GObject. I was on the lookout for a new language, sampling Java, Ruby, and Objective-C. I finally made my bed with Smalltalk.

My long term goal is put a GNOME development environment on top of Panda. Smalltalk IDEs are very different to conventional ones such as Eclipse or Emacs. For one, there is no process boundary separating the IDE and the program being tested, allowing all sorts of rich refactoring and inspection. Gilad Bracha speaks more about this.

That’s going to be at least 18 months away though, since there’s still a lot of VM work to be done. We do have some pretty compelling VM technology in panda right now, including a bytecode interpreter, compacting garbage collector, and fast tagged integers. Of course this all normal stuff for most Smalltalk VMs out there. The next step for me is rewrite the compiler in Smalltalk, since the VM seems stable enough to support it.

Fibonacci Redux

Time for more microbenchmarks! Someone just pointed me out to this post which describes another fibonacci shootout. It gave the brand new Ruby 1.9 VM a test drive. Ruby 1.9’s execution model is quite different from 1.8, featuring a highly optimized bytecode interpreter (codenamed YARV). I must admit that I totally overlooked YARV. All the buzz around Rubinius distracted me.

YARV beats the pants off ruby 1.8. It also outperforms Panda Smalltalk, not by much though. I have prepared a fresh benchmark. This time I present the fibonacci code using images since wordpress refuses to let me format language code sanely. I also got a bit carried away with GIMP and made the code look all pretty.

fib-rb190.png

fib-st2.png

Progress

I am having great fun working on my Smalltalk VM. I spent the last few weeks mainly on implementing a compacting garbage collector. There have been some nice additions to the class library, including the Set, Dictionary, and System classes.What’s been particularly pleasing to me personally is that certain benchmarks have come out very fast, faster than Python or Ruby.

Fibonacci Benchmark

fibonacci
    self <= 2
        ifTrue: [^ 1]
        ifFalse: [^ (self - 2) fibonacci + (self - 1) fibonacci].

The naive recursive fibonacci function isn’t a very exemplary benchmark. Nevertheless, I am advertising the results here because I think its a good start. In my opinion, the fibonacci function does at least give a good benchmark for method invocation performance (i.e changing the current context or activation record of an interpreter). The benchmarks below are all based on calculating the 32nd fibonacci number. I assume that the Python and Ruby binaries are compiled using the optimal CFLAGS. If that’s not the case, then I guess Ubuntu has a bit of a problem.

Squeak Smalltalk: 0.427s

The winner, Squeak. I can’t reason about why its faster than Panda. I suspect squeak’s context management scheme is slightly more optimized at the moment.

Panda Smalltalk: 0.67s

Yes, that’s my VM! coming a respectable second. As I stated above, there’s still work to be done on context management. The executable was compiled using “-g -O3 -fomit-frame-pointer”.

Python: 1.18s

Python has a bytecode interpreter and uses the C stack to store activation records. So its context management is probably an order of magnitude faster than Panda. I suspect the python is slower as it allocates an actual object for each integer. In Panda, integers are just stuffed into pointers, which are then tagged in the two low-order bits so that the interpreter can differentiate between normal objects and integers.

Ruby: 4m0.2s

No excuse. This gives dynamic languages a bad name. I suspect the main performance killer here is that Ruby’s execution model is based upon an abstract syntax tree rather than bytecodes. The problem with AST’s is that the nodes are spread throughout the heap, reducing locality. A lot of indirect memory references are required during execution, which slows things down. Hopefully Rubinius or JRuby will improve matters since they include bytecode interpreters.

This work is licensed under a Attribution-ShareAlike 3.0.