How Much DocBook

2005-10-21

Following Federico’s suggestion, I whipped up a script to see how often we use which DocBook elements in our help files. The top four are para (10499), entry (3415), listitem (3114), title (1948). None of these came as a surprise to me. Here are some interesting data points:

The rundown of how often the basic sectioning elements are used: sect2 (1201), sect1 (502), sect3 (205), section (8), sect4 (2). We have very few documents using the section element. In general, I favor using section, but the numbered ones do provide more information with this script (not that it would be hard to write another depth-checking script). Since sect2 is used more than twice as often as sect1, it seems two-level sectioning is common. Deeper levels seem rather uncommon, although three-level sectioning isn’t rare.
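A depth-checking script along those lines could look something like the following. This is only a sketch, not the script used for the numbers above: the function name max_section_depth and the sample document are made up for illustration, and it assumes un-namespaced DocBook that parses without external entities.

```python
import xml.etree.ElementTree as ET

# Sectioning elements that count toward nesting depth.
SECTION_TAGS = {"section", "sect1", "sect2", "sect3", "sect4", "sect5"}


def max_section_depth(elem, depth=0):
    """Return the deepest nesting of sectioning elements at or below elem."""
    if elem.tag in SECTION_TAGS:
        depth += 1
    deepest = depth
    for child in elem:
        deepest = max(deepest, max_section_depth(child, depth))
    return deepest


# Hypothetical document that uses the unnumbered section element.
sample = ET.fromstring(
    "<article><section><title>A</title>"
    "<section><title>B</title><section><title>C</title></section></section>"
    "</section><section><title>D</title></section></article>"
)
print(max_section_depth(sample))
```

Because it counts nesting rather than element names, this treats section and the numbered sectN elements the same way.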

Articles (70) and books (4), right about what I expected.

On basic inline markup: guilabel (1858), application (1527), keycap (1032), guimenuitem (792), guibutton (744), filename (702), guimenu (647), menuchoice (605), literal (281), keycombo (214), phrase (206), replaceable (170), command (140), guisubmenu (134), userinput (109), and stuff that didn’t manage to hit 100. Those in the know will know that guilabel (1858) is used as a catch-all for most things on the screen. Its high usage amuses me, because it means DocBook’s various gui* elements can’t manage to catch everything. I think it should give up. Note that menuchoice (605) is used far more often than keycombo (214). I was actually surprised that userinput (109) was used as often as it was. I’ll have to take a look at where it’s being used.

On lists: listitem (3114), varlistentry (1135), itemizedlist (309), variablelist (276), orderedlist (267), simplelist (3). I didn’t expect a high turnout from simplelist (3). Since listitem (3114) is used in most lists (just as li is used in both ol and ul in HTML), its number wasn’t surprising. There’s not a huge difference in numbers between the three common list types.

The titleabbrev element was used only once. I’ll bet I’m the one that used it, too.

We used indexterm 242 times, a primary term 241 times (huh?), a secondary term 128 times, and a tertiary term only 17 times.

We used the general-purpose synopsis element 123 times. As expected, we didn’t use any of the special-purpose *synopsis elements at all.

Admonition breakdown: note (96), tip (26), caution (6), warning (5), important (1). I still don’t have a clear idea of the difference between caution and warning. I was surprised that important was used only once.

There were 232 imageobject elements, but only 204 textobject elements. That means we have images without accessible text.
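To find the offenders rather than just compare totals, one could look for media objects that have an imageobject but no textobject. The sketch below is an assumption about how such a check might be written. The function name media_missing_text and the file names in the sample are hypothetical, and it only inspects direct children of mediaobject and inlinemediaobject.

```python
import xml.etree.ElementTree as ET

MEDIA_TAGS = ("mediaobject", "inlinemediaobject")


def media_missing_text(root):
    """Return media objects that contain an image but no text alternative."""
    missing = []
    for media in root.iter():
        if media.tag not in MEDIA_TAGS:
            continue
        has_image = media.find("imageobject") is not None
        has_text = media.find("textobject") is not None
        if has_image and not has_text:
            missing.append(media)
    return missing


# Hypothetical document: the second image has no textobject.
sample = ET.fromstring(
    "<article>"
    "<mediaobject><imageobject><imagedata fileref='a.png'/></imageobject>"
    "<textobject><phrase>Main window</phrase></textobject></mediaobject>"
    "<mediaobject><imageobject><imagedata fileref='b.png'/></imageobject>"
    "</mediaobject>"
    "</article>"
)
missing = media_missing_text(sample)
for media in missing:
    print(media.find("imageobject/imagedata").get("fileref"))
```

A per-object check like this is more reliable than subtracting the two totals, since one media object can hold several imageobject elements that share a single textobject.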

Some block elements: figure (188), screenshot (187), informaltable (172), mediaobject (164), screen (68), table (64), literallayout (27), programlisting (13), highlights (13).

Finally, we used just 146 out of DocBook’s 411 elements.
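For anyone who wants to run the same kind of tally on their own documents, here is a minimal sketch of how the counting could be done. It is not the original script: the function names count_elements and tally and the two sample documents are invented for illustration. It also assumes each file parses as plain XML; real DocBook files that rely on a DTD for entities would likely need a parser that can resolve them.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def count_elements(xml_text):
    """Return a Counter mapping element name to number of occurrences."""
    root = ET.fromstring(xml_text)
    return Counter(elem.tag for elem in root.iter())


def tally(documents):
    """Sum the per-document counts over an iterable of XML strings."""
    total = Counter()
    for xml_text in documents:
        total += count_elements(xml_text)
    return total


# Hypothetical stand-ins for real help files.
docs = [
    "<article><title>One</title><para>Hello</para><para>World</para></article>",
    "<article><title>Two</title><sect1><title>S</title>"
    "<para>Hi</para></sect1></article>",
]
totals = tally(docs)
for name, count in totals.most_common():
    print(f"{name} ({count})")
print(len(totals), "distinct elements used")
```

The number of distinct keys in the final counter is the "elements used" figure; comparing it with the full element list from the DTD gives the fraction of DocBook actually in use.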