Understanding XInclude
2011-07-21
I’m often surprised when people don’t know about XInclude, but I suppose not everybody eats and breathes XML the way I do. XInclude is a way to include other files (or portions of other files) into a single XML file. We actually use it throughout GNOME documentation, though few people realize it. XInclude isn’t tied to any particular XML vocabulary like Mallard or DocBook. It’s an XML feature defined by the W3C, and you can use it in any XML file, as long as your processing tools support it.
If you’ve used SYSTEM entities in XML before, it’s important to understand a key difference. (If you haven’t, skip this paragraph.) SYSTEM entities are a pre-parse text slurp. The text of the included file is inserted, byte-for-byte, at the inclusion point, and the resultant run of characters is then parsed. With XInclude, the included file is parsed, and its infoset is merged into the inclusion point.
Basic XInclude
The simplest use of XInclude is to include the entirety of an external XML file. We use this in many of our Mallard and DocBook documents to include common legal information. In gnome-help, for example, we have a file called legal.xml that looks like this:
<license xmlns="http://projectmallard.org/1.0/">
  <p>Creative Commons Share Alike 3.0</p>
</license>
Then, in the info element of every page file, we use this:
<include href="legal.xml" xmlns="http://www.w3.org/2001/XInclude"/>
When the file is parsed, the entirety of the license element from legal.xml is inserted in place of the include element.
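To watch that merge happen outside a help build, here is a minimal sketch using Python’s standard xml.etree.ElementInclude module rather than the libxml2-based tools GNOME uses. The in-memory FILES table, the example page, and the loader function are stand-ins made up for illustration so the sketch runs without touching the disk.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

MAL = "http://projectmallard.org/1.0/"

# Stand-in for files on disk, so the sketch is self-contained.
FILES = {
    "legal.xml": (
        '<license xmlns="http://projectmallard.org/1.0/">'
        "<p>Creative Commons Share Alike 3.0</p>"
        "</license>"
    ),
}

# Hypothetical page that pulls legal.xml into its info element.
PAGE = """<page xmlns="http://projectmallard.org/1.0/" id="example">
  <info>
    <include href="legal.xml" xmlns="http://www.w3.org/2001/XInclude"/>
  </info>
  <title>Example</title>
</page>"""


def loader(href, parse, encoding=None):
    """Return a parsed element for parse="xml", or the raw string for parse="text"."""
    data = FILES[href]
    return ET.fromstring(data) if parse == "xml" else data


root = ET.fromstring(PAGE)
ElementInclude.include(root, loader=loader)  # replaces include elements in place

info = root.find(f"{{{MAL}}}info")
print([child.tag for child in info])
```

After the call, the info element holds the parsed license element itself, not a chunk of text that still needs parsing, which is the difference from SYSTEM entities described earlier.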
Text XInclude
By default, XInclude expects the included file to be well-formed XML. You can tell it to treat the file as text instead. This is useful if you want to show the text contents of a file, such as inside a Mallard code element.
Just add parse="text" to the XInclude element, like so:
<include href="somefile.txt" parse="text" xmlns="http://www.w3.org/2001/XInclude"/>
I use this on the source pages of the tutorials on projectmallard.org. Look at the Ten Minute Tour Source page, for example. This shows the entire XML source of the Ten Minute Tour inside a Mallard code block.
The XML markup for the Ten Minute Tour Source page uses a text XInclude. The nice thing about this is that you don’t have to worry about escaping characters in the included file. So if you’re writing a lot of code examples with angle brackets, text XIncludes can be a convenient alternative to escaping or using CDATA blocks.
It’s important to note that an XInclude processor does not care what the file extension or reported MIME type of the included file is. The file is either parsed as XML or as text, and this depends solely on the parse attribute.
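Here is a small sketch of a text XInclude, again using Python’s standard xml.etree.ElementInclude as a stand-in for whatever processor you actually use. The file contents and the loader are hypothetical. The point is that the included characters arrive as plain text, raw angle brackets and all, and are only escaped when the result is written back out.

```python
import xml.etree.ElementTree as ET
from xml.etree import ElementInclude

# Hypothetical file contents; note the raw angle brackets and ampersand.
FILES = {"somefile.txt": "<p>Ducks & geese</p>\n"}

PAGE = (
    '<code xmlns="http://projectmallard.org/1.0/">'
    '<include href="somefile.txt" parse="text" '
    'xmlns="http://www.w3.org/2001/XInclude"/>'
    "</code>"
)


def loader(href, parse, encoding=None):
    # For parse="text" the loader hands back a plain string, never a tree.
    data = FILES[href]
    return ET.fromstring(data) if parse == "xml" else data


code = ET.fromstring(PAGE)
ElementInclude.include(code, loader=loader)

print(repr(code.text))                        # the file's characters, unparsed
print(ET.tostring(code, encoding="unicode"))  # escaping happens on output
```

The file name ends in .txt here, but that plays no part: the loader is only told the href and the parse type.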
Parts of Documents
In the first example, we included a single standard element. You might wonder if you can include lots of boilerplate elements. If all the pages in a document share the same authors, you might want to put them all in one file and XInclude them in.
<credit><name>Shaun McCance</name></credit>
<credit><name>Phil Bull</name></credit>
<credit><name>Jim Campbell</name></credit>
If you put this into a file and try to XInclude it, you’ll get an error. Any file you XInclude (with the XML parse type) must be fully well-formed XML. Among other things, that means there must be a single root element. The example above has three root elements. So just wrap them with another element:
<info xmlns="http://projectmallard.org/1.0/">
  <credit><name>Shaun McCance</name></credit>
  <credit><name>Phil Bull</name></credit>
  <credit><name>Jim Campbell</name></credit>
</info>
Notice also that you do need the xmlns declaration to use namespaces. This will now parse, and XInclude works. But you’ll have an extra info element nested in the including document’s info element. That’s not right.
You can include only a portion of the included XML using XPointer. XPointer is a W3C syntax for pointing to pieces of documents. There are different schemes you can use with it to select data in different ways, but we’ll just stick to the xpointer() scheme, which uses XPath. The basic syntax looks like this:
<include href="credits.xml" xpointer="xpointer(/info/credit)" xmlns="http://www.w3.org/2001/XInclude"/>
This won’t work, however, because we’re using XML namespaces. You need to declare a namespace prefix and use it in your XPath. To do that, use the xmlns() XPointer scheme:
<include href="credits.xml" xpointer="xmlns(mal=http://projectmallard.org/1.0/)xpointer(/mal:info/mal:credit)" xmlns="http://www.w3.org/2001/XInclude"/>
This does exactly what we need. We wrap the credit elements with an info element to make the included file well-formed. (It also gives us a convenient single place to declare namespaces.) Then we select only the credit elements to XInclude with an XPath expression. I won’t go into all the details of what XPath can do, but for simple cases like this, it looks basically like a directory path.
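The namespace problem is easy to reproduce without any XInclude processor. The sketch below only mimics the selection step, using the limited XPath subset in Python’s xml.etree.ElementTree; as far as I know the standard library does not evaluate xpointer attributes, so this is not a substitute for libxml2. The prefix mal is arbitrary, just as it is in the xmlns() scheme.

```python
import xml.etree.ElementTree as ET

CREDITS = """<info xmlns="http://projectmallard.org/1.0/">
  <credit><name>Shaun McCance</name></credit>
  <credit><name>Phil Bull</name></credit>
  <credit><name>Jim Campbell</name></credit>
</info>"""

info = ET.fromstring(CREDITS)

# No prefix: this looks for "credit" in no namespace, so nothing matches.
unprefixed = info.findall("credit")

# With a prefix bound to the Mallard namespace, the same step matches.
ns = {"mal": "http://projectmallard.org/1.0/"}
prefixed = info.findall("mal:credit", ns)

names = [c.findtext("mal:name", namespaces=ns) for c in prefixed]
print(len(unprefixed), names)
```

The unprefixed query comes back empty for the same reason the first xpointer() attempt fails: the elements live in the Mallard namespace, and an unprefixed name in a path does not.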
This does require you to keep and distribute an extra file. Instead of doing that, you could keep the information in one of your page files, then XInclude portions of that file in every other page file. Since every Mallard document has an index page, you could do this for index.page:
<page xmlns="http://projectmallard.org/1.0/">
  <info>
    <credit><name>Shaun McCance</name></credit>
    <credit><name>Phil Bull</name></credit>
    <credit><name>Jim Campbell</name></credit>
  </info>
  <title>My Index Page</title>
</page>
Then in every page except index.page, use this in the info element:
<include href="index.page" xpointer="xmlns(mal=http://projectmallard.org/1.0/)xpointer(/mal:page/mal:info/mal:credit)" xmlns="http://www.w3.org/2001/XInclude"/>
You might wonder if you can use XPointer to include only a portion of a file included with parse="text", such as a certain range of lines. XPointer allows extension schemes to be defined. In fact, there are a couple dozen schemes registered with the W3C. One of them allows you to select a string range from an XML file, although there is no registered scheme to select a range from a text file.
You don’t need a registered scheme, though. All you need is for your XML processor to understand the scheme you’re using. Unfortunately, libxml2 only supports the xpointer(), xmlns(), and element() schemes at this time. But if you really need this kind of functionality, you can probably hire an expert to implement it for you.
Mallard Glossaries
2011-07-07
Mallard has been successful as a software help format in large part because it doesn’t include every feature under the sun. It provides a strong core language for dynamic, topic-oriented documents, and that’s what most people need most of the time. Sometimes you need some extra bells and whistles, though. So Mallard was designed to be extended, allowing you to add features without bloating the core language.
I’ve been working on a few extensions over the last few months. The one that seems to be in the most demand is the Mallard Glossaries extension. Right now, the story for glossaries is that you should use a term list on a dedicated page. And that really is enough for a simple, static list of terms. But there are disadvantages:
- Term lists are static, and that’s not a very Mallard thing to do.
- Term lists are manually sorted, which is a pain to begin with, but an even bigger pain for translations.
- You can’t link to individual terms. The payload of a page is basically opaque to the linking system.
- There’s no potential for more dynamic presentation, such as showing a short definition when you hover over a term on a topic page.
With the Glossaries extension, any page can declare a term and provide a definition in its info element. So to provide a definition for “Notifications”:
<gloss:term>
  <title>Notifications</title>
  <p><em>Notifications</em> are messages that pop up at the bottom of the screen, telling you that something just happened. For example, when someone chatting with you sends a message, a message will pop up to tell you. If you don't want to deal with a message right now, it is hidden in your messaging tray. Move your mouse to the bottom-right corner to see your messaging tray.</p>
</gloss:term>
This gets put in shell-notifications.page, which is the page that talks about notifications. The glossary page then collects terms from different pages and shows them, together with a link to the pages that defined them.
Since this automatically provides links to defining pages, it also serves as a sort of index. (Professional indexers might get upset with me right now. Relax, I said “sort of index”.) Pages can even declare glossary terms without providing definitions. Just don’t include any block content other than the title. Then the entry on the glossary page will link back to the right pages.
Multiple pages can even provide full definitions. The glossary page will then show all definitions, collating the links to keep them next to their definitions. Here’s a very contrived example:
Note that the first definition doesn’t have an associated link. That’s because I defined the term on the glossary page itself. There’s little point in having the glossary link to itself.
This is very basic right now. Plans and goals include:
- Linking to individual terms from anywhere in any page
- Showing short definitions of terms when hovering over those links
- A tag-based selection system, so you could have glossary pages that only display a subset of the terms (e.g. symbols that were new in 3.0)
Mallard Training
2011-06-21
Mallard makes it easy to create dynamic, topic-oriented help documents, but even the simplest technologies have some learning curve, best practices, and advanced topics. To help developers and technical writers make the most of Mallard, I’m offering professional Mallard training services.
Mallard training starts with the basics: outlining a document, creating topics, and writing pages. You’ll explore Mallard’s unique linking and navigation system and learn how to create navigational structures that reflect what your readers are looking for. You’ll learn best practices on writing topics culled from years of in-the-trenches experience with Mallard and other documentation formats. All of this is done hands-on, creating actual documents from start to finish.
Training can be customized to your needs. You can also learn about topics such as using and developing Mallard extensions, integrating status tracking into your workflow, and working collaboratively with multiple contributors.
If you’re interested in Mallard, contact Syllogist for more information.
Testing With Real Users
2011-06-16
Whenever possible, I try to test user interfaces with real users. This gives me a much better sense of what people don’t understand, which helps me write better help. I don’t generally have the resources to run concerted usability studies, but even observing a single user can be very enlightening.
After reading Jakub’s “Killing Mode Switch” post, I was concerned about how discoverable this would be. I decided to do a quick test of our current overview. My test subject was a college-educated but non-technical Windows/Office user. I sat her in front of an empty workspace and said “Open the System Monitor application from the Activities overview.” Note that I was very exact in my language, because that’s how we say it in the help. I’m testing our instructions as much as I’m testing the UI. I also intentionally chose an application that you need to scroll to access.
She immediately saw and clicked Activities. She didn’t know about the hot corner. I think that’s OK. A hot corner on a target you click anyway is easily accidentally discovered. She then scrubbed the icons on the dash, reading the tooltips. So there’s a +1 that users readily recognize the dash as where application launchers live. Of course, I didn’t give her an application that’s in the dash, so it wasn’t there.
I watched her mouse and eye/head movement as best I could. She did seem to look off to the right, where the workspace thumbnails live, but she didn’t activate them with the mouse. After looking for a couple seconds, she clicked on Applications. She scanned what was there for a moment, realized she had to scroll, and found and launched System Monitor. I didn’t time it, but it seemed like around ten seconds total.
She said afterwards that she was confused at first because she didn’t realize that Windows and Applications were “tabs” (her word) and that she could click on them. This seems to be a trend. At the Open Help doc sprint, a user didn’t realize she could click the “Account disabled” button next to Password in the Users settings panel. This is even in spite of the fact that she had just read the help instruction telling her to do so. It doesn’t look like a button. It doesn’t look clickable.
Pretty is good. But I fear that some of the prettiness is coming at the cost of discoverability. I realize I’m working with a very small sample size here, but the general notion of affordance of clickability is not new.
I don’t know how discoverable Jakub’s new design will be. The “…” button at least looks clickable, but I don’t know that its meaning is clear. (Phil and I will probably have a long argument about what the heck to call that thing in the help.) I really doubt people will grok the pager dots on the right. In fact, I’m not even sure how they work. But I can only speculate at this point.
I really encourage people to do these kinds of quick tests on real users. Just grab a random person and ask her to do a simple task. It takes a few minutes out of your day. You might be surprised at what people don’t see.
Yelp Tools
2011-05-26
Over the last year or so, I’ve been restructuring the various bits of code that have traditionally been Yelp and gnome-doc-utils. The reusable XSLT stylesheets used by Yelp now live in yelp-xsl. I’ve worked on itstool as a successor to xml2po. And now yelp-tools holds some really handy command-line tools for Mallard and DocBook, plus some simple build magic for autotools projects.
I think hardly anybody knows about the tools in yelp-tools. Let’s fix that.
yelp-new lets you create a new Mallard page file from a template. If you want to use the “task” template (shipped with yelp-tools), to create a page with the ID ducksinarow, do this:
yelp-new task ducksinarow
Or, give it a title straight away:
yelp-new task ducksinarow "Put your ducks in a row"
You can create your own templates by giving them the extension .page.tmpl. yelp-new picks up both installed templates and templates in the current directory. It substitutes variables surrounded by @. This should look familiar to people who’ve used autoconf. The easiest way to create a new template is by basing it on an existing template with the --tmpl option, like so:
yelp-new --tmpl task my-new-task
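If you haven’t seen autoconf-style substitution before, here is a rough Python sketch of the idea. It is not how yelp-new is implemented, and the template text and the variable names ID and TITLE are invented for illustration; check a real .page.tmpl file for the names yelp-new actually recognizes.

```python
import re

# Hypothetical template text and variable names, not copied from yelp-tools.
TEMPLATE = """<page xmlns="http://projectmallard.org/1.0/"
      type="task"
      id="@ID@">
  <title>@TITLE@</title>
</page>
"""


def fill_template(template, values):
    """Replace each @NAME@ with values[NAME]; leave unknown names untouched."""
    def substitute(match):
        return values.get(match.group(1), match.group(0))
    return re.sub(r"@([A-Z_]+)@", substitute, template)


page = fill_template(
    TEMPLATE,
    {"ID": "ducksinarow", "TITLE": "Put your ducks in a row"},
)
print(page)
```

Leaving unknown variables in place, rather than blanking them, makes a typo in a template easy to spot in the generated page.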
yelp-build allows you to build output formats from Mallard or DocBook files. Currently, it can create HTML or XHTML, Mallard cache files, and EPUB. EPUB only works for Mallard right now. Create HTML for some Mallard page files:
yelp-build html *.page
You can even create your own XSLT customizations. You’ll need to learn the templates and modes used by yelp-xsl. (It’s pretty well commented with a home-brewed XSLT documentation system I made.) yelp-build lets you pass a customization with the -x option. The nice part is that your XSLT doesn’t have to xsl:import anything, so you don’t have to worry about the correct file path. When you pass a file with -x, yelp-build automatically creates a new stylesheet that imports yelp-xsl and includes your customization.
yelp-build html -x my-customization.xsl *.page
Finally, yelp-check is full of handy routines that help you keep track of your work while you write. You can check to make sure all xref attributes point to valid IDs:
yelp-check links *.page
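To give a feel for what a link check involves, here is a simplified Python sketch. It is an illustration only, not the yelp-check implementation: the page sources are made up, and it checks just the page part of each xref, ignoring section IDs after the #.

```python
import xml.etree.ElementTree as ET

# Hypothetical page sources keyed by file name.
PAGES = {
    "index.page": '<page xmlns="http://projectmallard.org/1.0/" id="index">'
                  '<title>Index</title><p><link xref="ducks"/></p></page>',
    "ducks.page": '<page xmlns="http://projectmallard.org/1.0/" id="ducks">'
                  '<title>Ducks</title><p><link xref="geese"/> '
                  '<link xref="index#top"/></p></page>',
}


def broken_links(pages):
    """Return (file, xref) pairs whose page ID is not defined by any page."""
    trees = {name: ET.fromstring(src) for name, src in pages.items()}
    known_ids = {root.get("id") for root in trees.values()}
    broken = []
    for name, root in trees.items():
        for elem in root.iter():
            xref = elem.get("xref")
            if xref is None:
                continue
            page_id = xref.split("#", 1)[0]  # "page#section" -> "page"
            if page_id and page_id not in known_ids:
                broken.append((name, xref))
    return broken


print(broken_links(PAGES))
```

Here the only link reported is the one to geese, since no page in the set declares that ID.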
You can check which pages can’t be reached by topic links from the index page:
yelp-check orphans *.page
You can get a report of the revision status of all your page files:
yelp-check status *.page
The status subcommand takes options to let you specify version, docversion, and pkgversion attributes to match against, provide an upper or lower cutoff date, restrict the output to only a few status markers, or just print totals. See the --help output for details.
Finally, you can validate your document against a schema:
yelp-check validate *.page
The validate subcommand implements dynamic schemas with the Mallard version attribute, allowing you to do stricter validation when using extensions like conditional processing, faceted navigation, or glossary indexes.
What else would you find useful?
Mallard Conditional Processing
2011-04-29
We’ve talked about doing run-time conditional processing in our help for a very long time, nearly as long as I’ve been around. When you have applications that can run in multiple environments, you sometimes need to write different help text for different cases.
For example, take an instant messaging application like Empathy. In GNOME 3, the Telepathy integration means that users can chat directly from the tray. Hackers know it’s not actually Empathy, but users don’t care. It’s chat. In GNOME 2, you usually interact with Empathy through the notification tray. In Unity, it integrates with the messaging menu. These are all things we should write about in the help.
But it’s hard to write about things when you don’t know where your help is read. The only way to deal with this is to patch the help for different environments. My experience is that that rarely happens. Plus, you don’t always create different packages for different environments. The exact same Empathy package in Fedora is used in GNOME Shell, GNOME fallback, and XFCE.
Other XML formats have conditional processing, but their design isn’t suitable for our needs. They usually work by having a set of attributes that take some tags. The behavior isn’t usually specified, because it’s assumed that they’ll be handled by a vendor-specific build tool that strips things out before creating a deliverable. Our source format is our deliverable format, and we want to adapt it at run-time.
So I’ve just landed some changes in Yelp to support a Mallard extension format for run-time conditional processing. Any block element can have the if:test conditional attribute added. This is evaluated as an XPath expression with extension functions that can tell you things about the environment. This is much more flexible than a set of tagging attributes, because you can combine multiple tests with “and” and “or”.
<p if:test="if:env('gnome-shell')">This is only shown in GNOME Shell</p>
<p if:test="if:env('gnome-shell') or if:env('gnome-panel')">
This is shown under GNOME Shell or GNOME Panel
</p>
The if:test attribute is a convenient short-hand syntax. There are also conditional processing elements that allow you to do basic branching and fallback.
<if:choose>
  <if:if test="if:env('unity')">
    <p>You're using Unity</p>
  </if:if>
  <if:if test="if:env('xfce')">
    <p>You're using XFCE</p>
  </if:if>
  <if:else>
    <p>You're using something else</p>
  </if:else>
</if:choose>
The full syntax has another advantage. What happens if you hand a document with conditional processing to a Mallard processor that doesn’t support it? It turns out that Mallard has very well-defined rules for how unknown extensions are handled. In the first case, you just have an external attribute on a known element. The attribute is ignored and the p element is processed as normal.
In the second case, you have an external element in a block context. A Mallard implementation that doesn’t do conditional processing will visit the children in restricted block context. In restricted block context, unknown external elements are ignored.
Yelp 3.0 doesn’t have conditional processing support. The following two snippets do what they say in Yelp 3.0, but the two forms are exactly equivalent to each other in Yelp 3.2, which has conditional processing support.
<p if:test="if:env('foo')">
This will be displayed in Yelp 3.0
</p>
<if:choose>
  <if:if test="if:env('foo')">
    <p>This will not be displayed in Yelp 3.0</p>
  </if:if>
</if:choose>
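The fallback rules described above can be modeled in a few lines. This Python sketch is a toy, not Yelp’s code: it knows only the p element, the wrapping section element is just a container for the example, and the namespace URI bound to the if prefix is a placeholder because the experimental namespace isn’t given here.

```python
import xml.etree.ElementTree as ET

MAL = "http://projectmallard.org/1.0/"
# Placeholder URI: the post does not give the experimental namespace.
IF = "http://example.org/placeholder-if/"

DOC = f"""<section xmlns="{MAL}" xmlns:if="{IF}">
  <p if:test="if:env('foo')">This will be displayed in Yelp 3.0</p>
  <if:choose>
    <if:if test="if:env('foo')">
      <p>This will not be displayed in Yelp 3.0</p>
    </if:if>
  </if:choose>
</section>"""


def render_blocks(parent, restricted=False):
    """Collect paragraph text the way a processor without the extension would."""
    shown = []
    for child in parent:
        if child.tag == f"{{{MAL}}}p":
            # Known element: unknown attributes such as if:test are ignored.
            shown.append(child.text.strip())
        elif not restricted:
            # Unknown element in block context: visit its children,
            # but now in restricted block context.
            shown.extend(render_blocks(child, restricted=True))
        # Unknown element in restricted block context: ignored entirely.
    return shown


print(render_blocks(ET.fromstring(DOC)))
```

Only the first paragraph survives: if:choose is visited because it sits in normal block context, but its if:if child is met in restricted block context and dropped along with everything inside it.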
Details are all subject to change, of course. This does basically follow a proposal I made to the Mallard mailing list last August. But for now, it’s in an experimental namespace. I just need to iron out the kinks and write up a specification on projectmallard.org.
Passive Voice Day
2011-04-26
It has been decided that tomorrow, April 27, is Passive Voice Day. (It might be asked, “It has been decided by whom?” Exactly.) Passive voice should be written and spoken in by everybody. For one day only, active voice will be frowned upon. It might be considered silly by you, but it will be found to be fun by many.
The #passivevoiceday hashtag should not be forgotten when tweets are written, although it is doubted that passive voice sentences will be able to be fit into 140 characters.
ITS Tool Released
2011-04-26
Last October, I blogged about itstool, a tool I developed to translate XML documents with PO files using ITS rules. Today, I released version 1.0.0 of ITS Tool on the new ITS Tool web site. If you’ve used xml2po before, you’re familiar with the basic idea: PO messages are extracted from an XML file, and translated messages are merged with the source to produce localized XML files. If you’re not already translating your documents using a message-based format, you need to start. Your translators will thank you.
ITS Tool takes the same idea as tools like xml2po, but the implementation is done entirely in terms of rules from the W3C Internationalization Tag Set. You don’t have to patch it to create a mode for a new XML format. You just need to provide a standard ITS file. Better still, if you mix XML vocabularies in a single file, ITS Tool can apply the rules for all matching formats.
Translators will be happy to know that we can now mark things as untranslatable using the standard its:translate attribute, or using custom its:translateRule elements. This is a long-requested feature that will help cut down the amount of unnecessary cruft that translators have to look at.
In addition to the features we get from standard ITS data categories, ITS Tool provides some custom extension rules to support features like translator credits and external file tracking. There are a few more features I’d like to provide as well, such as adding extra Mallard link titles and specifying transliteration-only messages.
I’ll be working on the GNOME build tools to switch GNOME’s documentation over to itstool for 3.2. Most messages in the PO files will be the same as with xml2po, so it won’t introduce much extra work for translators.
But ITS Tool is not just a GNOME project. It’s free software, under the GPL 3. It’s built on Python and libxml2, and can be used by any project for their XML documents. If you use an XML format that isn’t handled by the built-in ITS rules, you can pass your own custom ITS rules. Or if it’s a common format, submit those rules upstream. I encourage everybody working with XML documents to try ITS Tool and let me know how well it works and what can be done to improve it.
GNOME 3
2011-04-07
I’ve never been more excited to be a GNOME developer. After years of hard work and planning, GNOME 3 was released yesterday. Check out the Introduction to GNOME from our brand-new help to learn all about it.
GNOME 3 shows the innovation that open source communities can bring. Hundreds of developers, designers, writers, testers, and translators worked hard to deliver an amazing new user experience. Among them are the fantastic people who helped create an all-new help system that rivals anything I’ve seen elsewhere. My release notes list Phil Bull, Jim Campbell, Tiffany Antapolski, Natalia Ruz Leiva, Shaun McCance, Paul Frields, Mike Hill, Aline Bessa, Marina Zhurakhinskaya, and Kelly Sinnott. If I missed anyone, I’m really sorry. It’s hard to keep track of so much awesome.
We tossed out the old manual (who reads those?) and started fresh with topic-oriented help, building on the dynamic Mallard language. The results are amazing. The initial release has 214 pages, carefully organized and cross-linked to help you find the information you need quickly and get back to your life. What is a workspace? Select files by pattern. Enter special characters.
Frederic Peters did amazing work on library.gnome.org so all the new help is available online. But remember that all of this is available from the Help application on your GNOME 3 desktop. The help viewer was completely rewritten for GNOME 3, and I think you’ll really like what we’ve done.
By the way, if you want to meet up and learn about GNOME’s fresh approach to help, you should come to the Open Help Conference this June. At least Phil, Jim, and I will be there, and we’ll be joined by documentation teams from other great open source projects.
We’re not done yet.
Help needs to constantly improve and evolve as we learn more about what our users need. The GNOME documentation team is already hard at work on more pages and revisions to existing pages. We’ll be pushing updates to the help weekly. If you want to get involved, subscribe to our mailing list and send an email to gnome-doc-list@gnome.org.
We also welcome drive-by contributions. One of the really nice things about topic-oriented documents with Mallard is that it’s easy to just write up a page about something without worrying about making revisions to an entire book. If you love using GNOME, a great way to contribute is by writing a short page about an awesome time-saving trick. We’ll put these under Tips & Tricks, and users worldwide will learn something cool because of you.
Air Canada is on my shit list
2011-03-29
Air Canada is on my shit list, or as I like to call it, my “Continental list”. (Guess which airline I hate the most.)
At this point, I’m seriously contemplating cancelling my upcoming conferences. I’m tired of the airline industry. I’m tired of their complete disregard for their customers. And I’m tired of shelling out over $200 because they can’t do their job. I can’t afford to do this anymore.
I was supposed to fly back from the GNOME documentation sprint last Wednesday. I had a 5:00 flight on Air Canada, direct from Toronto to Cincinnati. I got through customs and security without much incident. (I was misinformed about the customs procedure by an Air Canada employee, and I was selected for additional screening, but that’s all minor.) I got an overpriced chicken sandwich and sat at my gate. I pulled out my laptop and worked on the help.
It snowed in Toronto Wednesday. There was a dusting of snow on the ground when I woke up, and it really picked up as I headed to the airport. I grew up near Chicago. I understand crazy weather and lake-effect snow. I was fully expecting delays. So when they announced a delay, I sighed and kept working. When they changed gates, I sighed and walked to the new gate. When they announced they were down to one runway and there would be further delays, I sighed again. You have to deal with what nature hands you. As long as I could get home, I didn’t mind waiting.
Then they cancelled the flight. This was maybe two hours after when our flight was supposed to have left, which struck me as very early to cancel flights. They still had hours and hours to try to push flights out. Not only did they cancel the flight, they told us that there was no guarantee we’d be rebooked. Really. We all paid for a service, and they had no obligation to provide that service.
They sent us to a customer service counter behind security. Disheartened, we all shuffled down the concourse and stood in line. We waited. Then we were told that they couldn’t rebook us at that counter, and that we’d have to leave security and go through customs to rebook.
When you fly to the US from Toronto, you actually go through US customs in Canada. You’re in a special part of the airport that is, effectively, the USA. It’s usually very convenient. You land in a domestic gate and don’t have to deal with customs at your destination. But if you need to go back out of security, as we did, you have to go through Canadian customs. And they ask you questions like “Where are you coming from?” Um, “Gate 161”. “When’s the next flight?” Oh, hey, good question.
What this means is that, before you can get to the ticketing counter to rebook, you have to collect your bags. Another thing we had to wait for. When we got to the baggage carousel, there weren’t many people there. I think we were one of the first flights cancelled. But our bags didn’t come. We waited for nearly an hour and a half for our bags. Meanwhile, other flights were cancelled. Those passengers arrived, their bags came, and they went through customs. They were already out there filling up the rebooking line, while we were still waiting for our bags.
When we finally got through, we headed to the customer service desk, the one we were told to go to. There were surprisingly only about 20 people there. We waited. Airline employees told us we were in the right place to be rebooked. Then another employee told us we were in the wrong line. We had to go downstairs. More wasted time.
We went down to a mob of 300 people. I saw three employees working. If they can each process a customer in five minutes (and that’s being very generous, in my experience), that’s eight hours in line. And you just know they’re going to close when you get near the front of the line. (Yes, it’s happened. See paragraph 1 about my least favorite airline.) Eight hours to not even talk to anybody about the flight that they’ve already said they probably won’t even rebook? No thanks.
So I found a couple of guys who decided to drive the next morning. They offered to let me tag along. (Dean and Tony, if you happen across this blog, you helped a stranger in need, and I thank you.) But we were driving at 5:00 in the morning, so I had to get a hotel room. The only hotel near the airport with rooms available was the Crowne Plaza. Five hours of sleep at the Crowne Plaza: $172 USD. (This blog post is already too long, so I won’t go into how badly they screwed up when I tried to get a quick bite to eat at the bar.)
Here’s the kicker: While we drove to the hotel, there was no snow. It had stopped. This was not a surprise. Every TV in the airport was showing CP24. All the passengers were looking at the radar on their smart phones. We all knew the snow was going to stop. I have a hard time believing Air Canada didn’t know as well. And we could see two planes coming in for a landing at the same time, which meant they had more runways open.
Flights were getting out. My friend Phil got home to the UK, and he flew later than I did. The snow stopped. The runways opened. Why did they cancel our flights? Because we were a small flight without enough passengers to care about. It wasn’t profitable to try to get us home. We didn’t matter. And that’s what pisses me off the most.




