GXml 0.13.90 Released

With lot of work to do on XSD, but certainly happy to see GXml.Gom* classes taking shape, fixed lot of bugs since last 0.13.2 and starting to port some projects to this new version, I hope to soon release 0.14, just after most translation are in place.

This new version, will provide a better supported XML GObject wrapped, using DOM4 API and initials of other technologies like XPath and XSD.

Hope some one takes some time to implement XPath recent W3C specification and complete XSD support implementation. May be this will be a good projects for Google Summer of Code 2017.

On the process, I’ll try to implement W3C SVG API using GXml.Gom* classes, to provide a fully supported XML/SVG API using GObject libraries, and all other languages supported by GObject Introspection.

Future GXml versions, can include:

  • Cleanup more interfaces/classes no useful any more, like TNode derived classes and other non DOM4 compliant interfaces. This will reduce number of classes to choose from, while recommend way will be on GNode and GomNode derived classes.
  • Improve XSD support
  • Explore how to manage large XML files
  • Improve XPath support
  • Explore XQuery

May be other ideas, from GXml users.

GInterface and GXml

I love GInterface definitions, more in Vala, because they are clean an easy to describe API. Interfaces are the way W3C defines their specifications, like SVG and DOM4.

Vala interfaces definition are realy close to be a copy and paste from W3C’s specification definitions. With some, well a bit, work, you can transform them in usable GObject Interfaces definitions.

GXml take DOM4 interfaces and implement them, using a set of instantiable classes.

GXml provides a GObject to XML and back serialization framework, allowing you to define your own classes and how you want your data is represented in XML.

In order to read back your information, GXml needs to create, on-the-fly, instances of your classes, this means you need GObject ones no GInterface.

Starting to implement XSD support in GXml, I’ve created a set of interfaces to interpret W3C specification, this is really helpful, but unusable when you need to instantiate an object it is declared as a Interface. For example, if you have an interface A and it has a property of type B, but at the same time B is a GInterface, you can’t implement A and have an instantiable object from B: I mean, using g_object_new().

Because GXml engine, requires instantiable objects, to create new element nodes when found, using a GObject type to parse attributes to properties, for example, I ended creating a set of interfaces, to help me design an clean API, makes room for other implementations engines, but creating a new classes that will implement XSD interfaces having its own “mirror” properties.

This is, while GXml.XsdSchema have a GXml.XsdListSimpleTypes property to access to all simple type definitions, GXml.GomXsdSchema will have two properties with same purpose: a GXml.GomXsdListSimpleTypes property AND a property of type GXml.XsdListSimpleTypes to fully implement GXml.XsdSchema. Second one, will mirror the first. These is more work to implement an interface but keeps your classes’ properties instantiable, and your users can choose to use just GXml.XsdSchema interfaces API to access your class implementation, keeping open to use different implementations.

Best of all, with GXml implementation of XML to GObject, more clearly using GXml.Gom* classes, you will have access to *all* nodes, attributes and child ones found in an XML file, without loose them in the process of de-serializing back to your class instance.

GXml and XSD

While on the road to release GXml 0.14, I started to port some of my projects to new GXml.Gom* objects, in order to take advantage on speed and reduced memory footprint.

In the process, my library requires to define a large set of strings to select on for an element’s attribute. This is, PostalCode number. They are defined as XSD enumeration in a SimpleType.

At the begining, I started to define an array to add to GXml.GomArrayString[1], but found they are too many to maintain and error prone.

Then I desired to take a look at W3C Schema Specification and started to define a new GXml.Xsd* interfaces and a new GXml.GomXsd* classes to implement them. The result is: GXml.GomXsdArrayString [1], a new class taking an XSD file, to search a SimpleType definition and parse all enumerations, add them to an array of strings you can choose or validate your property value from.

These are the first steps in the way to get XSD support in GXml. While, GXml.Xsd* interfaces are unstable, and will continue this way after 0.14 release, will open new opportunities to any one consuming XSD definitions.

May be in the future, some one can use this API to create an XSD to Vala (or C) code GObject classes.

May other wants to help adding more object definitions from XSD specification in order to get patterns and other restriction from schema definitions, and improve data handling and validation.

[1] GXml.GomCollections definitions

[2] GXml.Schema definitions and GXml.GomXsd implementations

 

GomDocument: Providing best of two worlds

In upcoming GXml 0.14, there will be a new DOM4 implementation called GomDocument, it along with GomElement and GomObject, provides support to serialize or deserialize GObject object to or from XML files.

I’ve made initial measures about what is the performance of GomDocument against other implementations in GXml: GDocument and TDocument, first one is a libxml2 implementation with bidings on GObject and DOM4 while last one is a pure GObject implementation with no support (yet?) of DOM4.

In performance test, we use a 3.5 MB file with a lot of nodes, read it and then create an internal in memory XML tree and then fill out your GObject class properties, we measure required time to deserialize and then time to write it back to an XML file.

We can see GDocument taking a lot of time on deserialize, because it uses libxml2  to create a tree and GXml.SerializableObjectModel to fill out your GObject class. Serialize is very competitive, because all implementations, use almost same engine: direct access to libxml2 by xmlWriter.

GomDocument, is very competitive if compared to TDocument, but can perform much better than GDocument.

Please note that GomDocument time to write to disk is not available, because serialization and deserialization is make in one go, then may be we case reduce this time (should be the same from others  because the file is loaded in memory before to read) and then this makes GomDocument to perform even better!!

On memory usage, TDocument requires much more memory and GDocument is the one to defeat. GomDocument now requires almost same memory than GDocument.

Conclusion

GomDocument is the way to go on Serialization framework for GXml. These results will help LibreSCL, to provide a very competitive serialization/deserialization framework and consider to create a WEB based application using GObject Introspection and Python to access very large files, without requiring lot of memory resources and may be a good response time on reading.

I’m starting to stabilize GXml to release 0.14 as soon as I can found most bugs on parse XML files. GXml’s GomDocument have a set of errors detection, based on DOM4, not present in other GXml implementations, making it more sensitive to some files and may not affecting (or detected) by libxml2, for example.

GXml 0.13.1 Released

Now you can convert your GObject classes in XML nodes. This is, you can read and write XML trees directly to object classes’ properties, from basic types to complex like object properties, representing XML element’s attributes, to other child elements, while you can use collection of child nodes.

This has been easiest to implement than GXml.SerializableObjectModel, which requires you to read an XML tree and then translate to your object properties. This should be slower than new GOM implementation included in this release.

Next, should be to test it in a real project like GSVG and make some performance tests.

These are old results of GDocument vs TwDocument, used by SerializableObjectModel, to serialize and deserialize large files (3.1MB). In future articles, hope to attach this GomDocument performance and resources comparations with all existing implementations in GXml.

gvstw-memory

gvstw-time

GXml: Objects and Collections to XML and back

Today I’ve finished to push last implementation for GXml GOM, to allow write, to XML:

  • Object Properties, as child nodes of current class
  • Object Properties, as element’s attributes
  • Object Collection Properties, referencing XML child nodes

Object Properties

In GXml GOM, any GomElement is an XML Element node. If it has GomElement as properties, they will be added as child nodes.

Object Properties as Attributes

If your GObject class implements GomProperty interface, and is a property in your object, it will be translated to an Element attributes with a name and a text value.

For simple types, this means you can control if an attribute is written or not, depending if it is not null. Standard properties, not GObject classes implementing GomProperty, they will be always written with its default value. This is, for example, a boolean will always use false by default.

Using GomProperty, you can define default actions when a property is omitted in XML file.

Complex Object Properties

Some times, your string representation of attributes include more information than just values, like units. In GSVG, you have an attribute like: length=”3.8 cm. You can implement an object with properties for each value component like:

public class Length : Object {

public double value { get; set; }

public UnitsEnum units { get; set; }

}

With GXml GOM, is now possible to implement W3C SVG 1.1 specification interfaces, most objects will be complex properties to be translated to Element’s attributes. Once you implement a way to parse a string representation to your object’s properties and back, you can have GomProperty objects in your GomElement to be de/serialized to attributes.

Collections

GomElement objects are containers, by definition, in DOM4. It can have child nodes of different local names and namespaces.

Once you have a set of different nodes, may you want classify them by their node’s names and for, for example, its id attribute.

GXml GOM, have added a set of basic collection classes, implementing GomCollection interface. ArrayList and HashMap, are classes you can use to access child nodes. All references are to child node’s indexes, no copy or ref-counted objects.

Examples

If you want to see how implement different kind of classes and properties, you can checkout GXml repository Unit Tests.

GXml 0.14 and Serialization

Now with all in place for DOM4, GXmlGom is getting support to derive classes from GXml.GomElement, making easy to serialize your GObject classes to XML.

Now you just need to prefix your GObject’s property nicks with “::” and it will be used as XML Element attribute. Now Gom have support for strings, integers, unsigned integers, double and enumerations properties types.

I you have this class:

  public class Taxes : GomElement {
[Description (nick=”::monthRate“)]
public double month_rate { get; set; }
[Description (nick=”::TaxFree“)]
public bool tax_free { get; set; }
[Description (nick=”::Month“)]
public Month month { get; set; }
construct {
_local_name = “Taxes”;
}
public string to_string () {
return (_document as GomDocument).to_string ();
}
public enum Month {
JANUARY,
FEBRUARY
}

You can use:

var t = new Taxes ();

printf (t.to_string ());

and you’ll get:

<Taxes monthRate=”16.5Month=”februaryTaxFree=”true“/>

Next steps will be to implement reading XML documents back to your GObject’s properties.

You will find more “examples” and advances at GXml repository.

Rust and Vala

This post is based on my experience on not just using but creating and maintaining Vala libraries.

Rust is on the horizon and have voices to use it instead Vala. For Vala, we can say, is true to be niche oriented language, because it just create GObject based applications and libraries. For Rust, it has a more general purpose, with save concurrency and save programming: true again.

I have downloaded Rust to start using it, because seems to be very convenient for C development replacement, specially may be suitable to replace old libxml2 library, hopping some some have started to write an XML parser and writer, I can re-use in my own projects.

After taking a time to check at Rust documentation and write a post about GObject and Rust, I would like to share my thoughts about Vala and Rust.

Vala have a very specific target: GObject. We can use Vala to create GObject classes and define API interfaces using GInterface, but in a sugar and very productive syntax. Doing Object Oriented Programming using Vala and GObject is easy and natural.

I’ve managed to get W3C set of API interfaces based on specifications for DOM and SVG, because Vala provides a kindly close interface definition syntax to the ones from W3C. Implement them have been a matter of, you know, work. W3C interfaces are really complex in a way of their relations ships and dependencies, which should be very difficult to implement if doing in C and GObject/GInterface. This is not false for Rust, because it doesn’t have a GObject/GInterface mechanism, at least not jet.

In my opinion, Rust should develop its own, more powerful and secure, implementation of GObject/GInterface, based in their own internal types, like Traits.

I don’t know if Rust is planning to provide an Object Oriented like mechanisms, like GObject/GInteface, with properties, signals, introspection and inheritance, but is very convenient for other languages and implementing Object Oriented API, like the ones from W3C.

While I write this article, I’m studding Rust, to find equivalences and possibilities, thinking on how to port my work to Rust if possible. At least, for now, I can’t, but I have been just little time, I need to follow its development and road map.

Vala development can be used by Rust applications and libraries, because is C and GObject, and because it produce GIR files to create bindings easily for Rust, along with other languages.

If GObject is your choice to write your next library or Application, I can recommend to use Vala: you’ll get a very productive syntax and a easy bindable API.

No, I’ve don’t want to stop my self to Vala. Just is very convenient to produce GObject based libraries, use any C based library too, and is really productive. My work on GXml, can’t be just dropped and its capabilities to Serialize GObject classes, by introspection of its properties, are very useful and productive. I’ll share more in these later.

Rust and GObject

First all, I’m not a Rust programmer, this is just a point of view of my first impressions about Rust, from documentation of it, and how I see it to use with GObject.

From documentation Rust provides a low level and high level API to access common operations. Provides a set of assumptions to help its great features like automatic memory management, secure and concurrent data access. On high level side, Rust provides a rich set of common collection, iterators, tuples and others.

For GObject interoperability, there is a project , and this too, I found to allow you to use GObject based libraries in Rust, while they depends on other project, or directly on GObject Introspection generated XML files to introspect these C libraries.

I don’t see a GObject equivalent, not jet at least, into Rust. From GNOME developers site, I found an introduction to GObject:

  • A generic type system to register arbitrary single-inherited flat and deep derived types as well as interfaces for structured types. It takes care of creation, initialization and memory management of the assorted object and class structures, maintains parent/child relationships and deals with dynamic implementations of such types. That is, their type specific implementations are relocatable/unloadable during runtime.
  • A collection of fundamental type implementations, such as integers, doubles, enums and structured types, to name a few.
  • A sample fundamental type implementation to base object hierarchies upon – the GObject fundamental type.
  • A signal system that allows very flexible user customization of virtual/overridable object methods and can serve as a powerful notification mechanism.
  • An extensible parameter/value system, supporting all the provided fundamental types that can be used to generically handle object properties or otherwise parameterized types.

One of the most powerful features on GObject is C, but at the same time it its weakest one, because GObject through GObject Introspection makes easy to create bindings to any languages, including Rust. But is hard to write code for GObject classes and interfaces. GObject provides an Object Oriented programing paradigm to C.

I don’t think any one is thinking to rewrite GObject based libraries to Rust, because you can. Then lets put this option aside for a moment.

While Rust have great features, I would like to find a way to write a Rust library and share it through GObject Introspection GIR, making it available to other languages at day 0. Just remember GObject Introspection, is better suitable for GObject based libraries.

I don’t find in Rust a direct GInterface equivalent, no classes, no object properties and signals, no error reporting equivalent to GError. All of them are getting in, by bindings from GLib, GObject and GIO libraries, written in C, using GIR. I don’t think any one will re-write that libraries to Rust.

GObject bindings to Rust are not stable jet and may mature enough in a few years, depending on demand, resources and their use in new written code.

My personal conclusion
  • GNOME will relay on GObject for next 5 to 10 years.
  • GObject will be improved and maintained in same period.
  • GObject Introspection will be a vehicle to easy access C libraries from other languages, including Rust.
  • Rust will grow and hope they’ll add object oriented mechanisms equivalent to GObject/GInterface, in order to provide equivalent more secure and concurrent safe API to create new libraries.
  • C language, will be here for next 20 years, but may gradually delegated to raw operations.

 

GObject and SVG

GSVG is a project to provide a GObject API, using Vala. It has almost all, with some complementary, interfaces from W3C SVG 1.1 specification.

GSVG is LGPL library. It will use GXml as XML engine. SVG 1.1 DOM interfaces relays on W3C DOM, then using GXml is a natural choice.

SVG is XML and its DOM interfaces, requires to use Object’s properties and be able to add child DOM Elements; then, we need a new set  of classes.

GXml, have a Serialization framework, it can be used to provide GObject properties to XML Element properties and collection of child nodes as GObject. I’ve created some other projects, like LibreSCL, using it.

Serialization framework, requires to create an XML tree first, then fill out GObject properties. This could add some delays on large files.

Considering LibreSCL have to deal with files about 10 MB to 60 MB, with thousand of XML nodes, this process XML Tree -> GObject properties, could take 10 to 20 seconds.

A few time ago, I imagined to have a GObject class as a XML Node. This is, an XML Element node, represent a GObject, XML Element’s properties should be mapped directly to GObject’s ones and XML Element’s child nodes, should be a collection inside GObject’s properties.

Now with SVG and GXml supporting DOM4, I face the opportunity to create a GObject class you can derive from, to convert your classes in XML nodes, making serialization/deserialization faster and reducing memory footprint.

Let’s see what is coming and how they evolve. As always, any help is welcome.

PD. As a side note, I’ve able to copy/paste with little modifications, W3C’s interfaces definitions to Vala ones in a short time, because Vala’s syntax.