Reference counting in GtkTreeModel, rules to play by

A couple of rules must be played by when using reference counting of GtkTreeModel nodes. These rules very much make sense when you reason about them. Also, the rules are paramount to implementing something like GtkTreeModelFilter, as we will see later. Let us first discuss the rules:

Never take a reference on a node, without owning a reference on its parent. This means that all parent nodes of a referenced node must be referenced as well. Looking at this from a GtkTreeView perspective, this rule will never be violated. GtkTreeView will always take a reference on every node that is present in GtkTreeView’s RB tree, which contains all children of expanded parents. Naturally, GtkTreeView cannot expand a parent which is not visible itself, so parent nodes are always referenced.
When you are implementing your own GtkTreeModel, you are free to reference whatever you want. In this case, it becomes important to adhere to the rules. When GtkTreeModelSort or GtkTreeModelFilter is set as the child model of your custom model, things quickly go wrong if you do not play by the rules. The sort and filter models aggressively empty their internal data structures of nodes that are no longer needed. This process is based on the current reference counts of the nodes. So, when you reference child nodes without holding a reference on their parents, a parent might eventually end up at a reference count of zero while its children are still visible. A parent level might wrongly be pruned from the cache, corrupting the data structures. (Of course, the sort and filter models could enforce the rule for you by referencing all parents when a node is reference, but this significantly hampers performance because many more virtual ref/unref calls are done than necessary).
Outstanding references on a deleted node are not released. So, once you receive a row-deleted signal for a node in your model, do not expect any unref call for that node to follow. You can implicitly erase all references.

If you are not the model implementor, but instead have a couple of references on nodes in this model, then do not release references of a node once you have received the signal that that node has been deleted. (This actually implies you will have to listen for row-deleted signals if you have references on nodes, or you use a GtkTreeRowReference which will be set to NULL once the node has been deleted).

It is clear why this rule is there: by the time you receive a row-deleted signal for a node, the node no longer exists in the model so you cannot get an iter to this node to perform the unref call.
Models are not obligated to emit a signal on rows of which none of its siblings are referenced. To phrase this differently, signals are only required for levels in which nodes are referenced. For the root level however, signals must be emitted at all times (however the root level is always referenced when any view is attached).

What is this rule trying to say at all? Imagine you have a ref-counted model, some node in the root level has a couple of child nodes and only the nodes in the root level are referenced. This means that none of the nodes is currently expanded. The rule says that, as a result, any signals for the child nodes (e.g. row-changed if a child nodes changes) do not have to be emitted.

Makes sense — what view is interested in signals of nodes that are not displayed in the view anyway? In general, you are indeed not interested. But the for GtkTreeModelFilter this rule is very important in order to properly handle a sensible use case. Imagine that you have a filter on child nodes. You only want to show an expander arrow next to the parent node if any of the child nodes is visible. To handle the expander arrow correctly, we must receive events for the child nodes, also when these nodes are not referenced. GtkTreeModelFilter handles this by explicitly obtaining a reference on one of the child nodes if a parent is referenced. Due to this rule, a reference on a single node is sufficient to start receiving signals for this level.

In summary, we can say that good understanding of these rules, which seem so obvious, is crucial when implementing a model like GtkTreeModelFilter. The rules are used to determine when it is required to emit signals and when levels in the internal data structures can be cleaned up. I can go into more detail on what is involved in GtkTreeModelFilter, but that is probably not of that much interest. What is for example required is to for each node maintain two reference counts; it’s actual, real, reference count and it’s external reference count. The external reference count maintains the reference count based on the ref and unref calls received from the model’s observer. This determines a node’s visibility, so should not be mixed up with any reference on the node taken internally!

As we have seen, these rules also put limits on what can be done with a visible function in GtkTreeModelFilter. More on that in a later blog post.