Large number of XML Nodes and GXml performance

GXml's performance has improved since its initial releases.

The first implementation parsed everything into a libxml2 tree and then into a set of GObject classes, in order to provide a GObject serialization framework.

Over time, GXmlGom was added as a set of classes that avoid the libxml2 tree, improving both memory use and serialization performance.

GXml has been used in many applications, from parsing Electrical Substation Configuration Language files to handling the Mexican Tax Authority's XML invoice format, among others.

QRSVG Performance

For my private projects, I need to create QR codes of size 61×61 = 3721 squares. This means at least 2700 XML nodes. That is a large number of nodes, and because QRSVG depends on GSVG, which in turn depends on GXml, all of them rely on GXml’s implementation for performance.

Initial measurements suggest that, to no surprise, using a simple array of objects takes up to 0.5 seconds to add a single node. That was the maximum time measured.

So GXml’s implementation should be improved for large numbers of nodes. It currently uses Gee.ArrayList, which is clean and makes it easy to wrap a node list implementing the W3C DOM4 API. But now I’m considering using Gee.TreeMap, because it is designed for large collections of objects. From its documentation:

This implementation is especially well designed for large quantity of data. The (balanced) tree implementation insure that the set and get methods are in logarithmic complexity.

The problem is its Map interface: I need to implement the Gee.BidirList interface on top of it, in order to fit the W3C DOM4 API and still get the performance boost.
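
To make the idea concrete, here is a minimal sketch of a list view over a balanced tree map keyed by position. It is written in Java only because java.util.TreeMap and java.util.AbstractList are convenient stand-ins for Gee.TreeMap and Gee.BidirList; it is not GXml code. The class name TreeMapNodeList and the "rect-N" strings are hypothetical, and it assumes nodes are only appended.

```java
import java.util.AbstractList;
import java.util.ListIterator;
import java.util.TreeMap;

// Hypothetical name: a list view backed by a balanced tree keyed by position.
final class TreeMapNodeList<E> extends AbstractList<E> {
    // Keys are always the contiguous range 0..size-1.
    private final TreeMap<Integer, E> byIndex = new TreeMap<>();

    @Override
    public int size() {
        return byIndex.size();
    }

    @Override
    public E get(int index) {
        checkIndex(index);
        return byIndex.get(index); // logarithmic tree lookup
    }

    @Override
    public boolean add(E element) {
        byIndex.put(byIndex.size(), element); // append: logarithmic tree insert
        modCount++; // lets the inherited iterators detect concurrent changes
        return true;
    }

    @Override
    public E set(int index, E element) {
        checkIndex(index);
        return byIndex.put(index, element); // replaces and returns the old value
    }

    private void checkIndex(int index) {
        if (index < 0 || index >= byIndex.size()) {
            throw new IndexOutOfBoundsException("index " + index);
        }
    }

    public static void main(String[] args) {
        TreeMapNodeList<String> nodes = new TreeMapNodeList<>();
        for (int i = 0; i < 3721; i++) {
            nodes.add("rect-" + i); // one placeholder entry per QR square
        }
        System.out.println(nodes.size()); // prints 3721

        // AbstractList supplies a bidirectional iterator built on get().
        ListIterator<String> it = nodes.listIterator(nodes.size());
        System.out.println(it.previous()); // prints rect-3720
    }
}
```

Only size, get, add and set are written by hand; the bidirectional iteration comes from the base class. The sketch sidesteps the hard part, though: inserting or removing in the middle would require renumbering every later key, so a real implementation would need a different keying scheme for those operations.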

Let's see how this evolves. Any suggestions?

Author: despinosa

Full-time Linux and GNOME user since 2001. Current maintainer of GXml and contributor to other projects, mainly on GObject Introspection support.
