Using LLVM to speed up function invocation in a dynamic language binding
With the recent release of PyGObject I decided to take a look into something I’ve been meaning to do for some time, using the excellent LLVM library to make things go faster.
But first some brief history. Back in the old PyGTK and pygtk-codegen days all functions were created at compile time generating all the Python<->C wrappers that you would ever need. This is problematic for various reasons:
- each binding needs to be generated before using it and thus you need a Python extension module for each library you use.
- the code generator generated straight-forward wrappers. Not a problem for a small library, but Gtk has almost 200 classes and more than 3000 functions/methods which all need to be registered when you import gtk, which takes time and memory.
and PyGI solves both of these problems. Typelibs are created in the upstream library and the classes and functions wrappers are created in runtime when you use them. One important thing changed though. Invokation of the native functions are no longer done through generated code, we’re using libffi
to do function invokation completely in runtime. This works pretty well but it’s much slower than the old generated functions.
This is where LLVM enters the picture. LLVM is a compiler infrastructure framework. It’s basically a full C/C++ compiler/assembler/linker split into reusable libraries. I decided to take a look into using LLVM to optimize the function invokation by generate native code for the functions lazily using the the LLVM JIT.
I can now announce that after experimenting with it I’ve been able to get it to a point where it’s usable enough to run benchmarks on. That doesn’t mean it’s feature complete, as only a few types such as double,int32,int64,object, so it’s far from supporting a complete application. Anyway, let me present some numbers. First function,
The first two ares an extreme examples as the functions does very little. But it’s great as it will *only* exercise the time it takes to convert an argument to/from native types.
The last function is not speed up considerably because the time spend most of its time inside the function itself, instantiating GObjects is far more computational expensive than converting Python wrappers back and forth.
This work in progress can be found in the llvm
branch of pygobject
This entry was posted in Blogroll
. Bookmark the permalink