More text rendering updates

There is a Pango 1.44 release now. It contains all the changes I outlined recently. We also managed to sneak in a few features and fixes for longstanding bugs. That is the topic of this post.

Line breaking

One area for improvements in this release is line breaking.

Hyphenation

We don’t have TeX-style automatic hyphenation yet (although it may happen eventually). But at least, Pango inserts hyphens now when it breaks a line in the middle of a word (for example, at a soft hyphen character).

Example with soft hyphens

This is something I have wanted to do for a very long time, so I am quite happy that switching to HarfBuzz for shaping on all platforms has finally enabled us to do this without too much effort.
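To make the soft-hyphen behaviour concrete, here is a toy sketch in plain C. It is not how Pango does it internally (Pango works on shaped glyphs and their real widths), and break_at_soft_hyphen is a hypothetical helper name, not Pango API. It assumes ASCII text in which every byte other than a soft hyphen takes up one column.

```c
#include <stddef.h>
#include <string.h>

/* U+00AD SOFT HYPHEN is encoded in UTF-8 as the two bytes 0xC2 0xAD. */
static int is_soft_hyphen(const char *p)
{
    return (unsigned char)p[0] == 0xC2 && (unsigned char)p[1] == 0xAD;
}

/*
 * Toy breaker (hypothetical, not part of Pango).
 * Splits `word` at the last soft hyphen for which the text before it,
 * plus a visible '-', still fits into max_cols columns.
 * Returns 1 and fills `first` and `rest` on success, 0 if no break fits.
 * Both output buffers must hold at least strlen(word) + 2 bytes.
 */
int break_at_soft_hyphen(const char *word, size_t max_cols,
                         char *first, char *rest)
{
    const char *best = NULL;  /* soft hyphen chosen as the break point */
    const char *p;
    char *out;
    size_t cols = 0;          /* visible columns seen so far */

    for (p = word; *p; ) {
        if (is_soft_hyphen(p)) {
            if (cols + 1 <= max_cols)  /* room for the text and the '-' */
                best = p;
            p += 2;
        } else {
            cols++;
            p++;
        }
    }
    if (best == NULL)
        return 0;

    /* Copy the visible text before the break, dropping unused soft hyphens. */
    out = first;
    for (p = word; p < best; ) {
        if (is_soft_hyphen(p))
            p += 2;
        else
            *out++ = *p++;
    }
    *out++ = '-';             /* the hyphen that becomes visible at the break */
    *out = '\0';

    /* The remainder keeps its soft hyphens so it can be broken again later. */
    strcpy(rest, best + 2);
    return 1;
}
```

With the word im, soft hyphen, peach, soft hyphen, ment and eight columns available, this yields "impeach-" and "ment"; with five columns it falls back to "im-".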

Better line breaks

Pango follows Unicode UAX #14 and UAX #29 for finding word boundaries and line break opportunities. The algorithms described there are language-independent, but allow for language-specific tweaks. The Unicode standard calls this tailoring.

While Pango has had implementations for both the language-independent and -dependent parts before, we didn’t have them clearly separated in the API, until now.

In 1.44, we introduce a new pango_tailor_break() function which applies language-specific tweaks to a segment of text that has a uniform language. It is meant to be called after pango_default_break().

Line break control

Since my focus was on line-breaking already, I’ve added support for a text attribute to control line breaking. You can now say:

Don't break <span allow_break="false">here!</span>

in Pango markup, and Pango will obey.

In the hyphenation example above, the words showing possible hyphenation points (like im‧peachment) are marked up in this way.

Placement

Another area with significant changes is placement, both of lines and of individual glyphs.

Line height

Up to now, Pango has been placing the lines of a paragraph directly below each other, possibly with a fixed amount of spacing between them. While this works ok most of the time, a more typographically correct way to go about this is to control the baseline-to-baseline distance between lines.

Fonts contain a recommended value for this distance, so the first step was to make this value available with a new pango_font_metrics_get_height() API.

To make use of it, we added a new parameter to PangoLayout that tells it to place lines according to baseline-to-baseline distance. Once we had this, it was very easy to turn the parameter into a floating point number and allow things like double-spaced lines, by saying

pango_layout_set_line_spacing (layout, 2.0)

Line spacing 1, 1.5, and 2

You can still use the old way of spacing if you set line-spacing to 0.
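The arithmetic behind baseline-to-baseline placement can be sketched in a few lines of plain C. This is only an illustration under the assumption that the distance between consecutive baselines is the line-spacing factor times the font's recommended height; baseline_y is a hypothetical helper, not Pango API, and all values are in Pango units (1024 per device unit).

```c
/*
 * Hypothetical helper (not part of Pango): baseline of line `index`
 * (0-based) in Pango units, assuming consecutive baselines are
 * factor * font_height apart. font_height stands in for the value
 * that pango_font_metrics_get_height() reports for the font.
 */
int baseline_y(int first_baseline, int font_height, double factor, int index)
{
    /* Round the accumulated offset to the nearest Pango unit. */
    return first_baseline + (int)(index * factor * font_height + 0.5);
}
```

For a font height of 20 pixels (20480 units) and a first baseline at 15 pixels (15360 units), a factor of 2.0 puts the third line's baseline at 15360 + 2 * 2.0 * 20480 = 97280 units, which is 95 pixels.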

Subpixel positions

Pango no longer rounds glyph positions and font metrics to integral pixel numbers. This lets consumers of the formatted glyphs (basically, implementations of PangoRenderer) decide for themselves if they want to place glyphs at subpixel positions or pixel-aligned.

Non-integral extents
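To see why this matters, here is a small plain-C sketch of the two choices a renderer has. The constant mirrors Pango's PANGO_SCALE (1024 Pango units per device unit); the two function names are hypothetical and only imitate what the PANGO_PIXELS macro and pango_units_to_double() do, and the rounding helper assumes non-negative positions.

```c
/* Mirrors Pango's PANGO_SCALE: one device unit is 1024 Pango units. */
#define UNITS_PER_PIXEL 1024

/*
 * Pixel-aligned placement: round a non-negative position in Pango units
 * to the nearest whole pixel (similar in spirit to PANGO_PIXELS).
 */
int to_whole_pixels(int units)
{
    return (units + UNITS_PER_PIXEL / 2) / UNITS_PER_PIXEL;
}

/*
 * Subpixel placement: keep the fractional part
 * (similar in spirit to pango_units_to_double).
 */
double to_fractional_pixels(int units)
{
    return (double)units / UNITS_PER_PIXEL;
}
```

Take four glyphs whose advance is 7.25 pixels (7424 units) each. Their exact positions are 0, 7.25, 14.5 and 21.75 pixels. Rounded to whole pixels they land at 0, 7, 15 and 22, so the gaps come out as 7, 8 and 7 pixels even though every glyph has the same advance.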

The cairo renderer in libpangocairo will do subpixel positioning, but you need cairo master for best results. GTK master will soon have the necessary changes to take advantage of it for its GL and Vulkan renderers too.

This is likely one of the more controversial changes in this release: any change to font rendering causes strong reactions. One of the reasons for doing the release now is that it gives us enough time to make sure it works ok for all users of Pango before going out in the next round of upstream and distro releases in the fall.

Visualization

Finally, I spent some time implementing some long-requested features around missing glyphs and their rendering as hex boxes. These are also known as tofu (which is the origin of the name for the Noto fonts – ‘no tofu’).

Invisible space

Some fonts don’t have a glyph for the space character – after all, there is nothing to draw. In the past, Pango would sometimes draw a hex box in this case. This is entirely unnecessary – we can just leave a gap of the right size and pretend that nothing happened.  Pango 1.44 will do just that: no more hex boxes for space.

Visible space

On the other hand, sometimes you do want to see where spaces and other whitespace characters, such as tabs, are. We’ve added an attribute that lets you request visible rendering of whitespace:

<span show="spaces">Some space here</span>

Visible space

This is implemented in the cairo backend, so you will need to use pangocairo to see it.

Special characters

In the same vein, sometimes it is helpful to see special characters such as left-to-right controls in the output.  Unicode calls these characters default-ignorable.

The show attribute also lets you make default-ignorables visible:

<span show="ignorables">Hidden treasures</span>

Visible default-ignorable characters

As you can see, we use nicknames for ignorables.

Font information

Pango has been shipping a simple tool called pango-list for a while. It produces a list of all the fonts Pango can find.  This can be very helpful in tracking down changes between systems that are caused by differences in the available fonts.

In 1.44, pango-list can optionally show font metrics and variation axes as well. This may be a little obscure, but it has helped me fix the CI tests for Pango.

Summary

This release contains a significant amount of change; I’ve closed a good number of ‘teenage’ bugs while working on it. Please let us know if you see problems or unexpected changes with it!

8 thoughts on “More text rendering updates”

  1. Dropping bitmap font support is a no-no as far as I am concerned…

    So, I guess I’ll create my own private fork, to be still able to enjoy Adobe Helvetica on my Linux systems… :-(

  2. Lovely work, thanks! I searched for “Pango text markup” to learn more, and Google produced links to the “wrong” Pango doc: obsolete pygtk, broken link on developer.gnome, and information for gtk 2.6.

  3. That kind of small (but difficult) improvements that make me happy. Good job!

  4. Is there a way to disable the “new way” of glyph placement through configuration alone? I’m using [autohinter=off, hintslight, 96dpi, subpixel rendering, lcddefault filter] and the update to 1.44 caused gnome native apps to have letter spacing issues that are so bad that I had to revert to 1.43. With 1.44 majority of letters in words are too close together and some are too far apart. Some letter combinations look like ligatures, like in the word “Terminal” the ‘e’ is tucked under the crest of letter ‘T’ and the remaining ‘rminal’ is too far from the first two letters, which looks really awkward. Also it seems that there are “verticality” issues as well, for example in the word “All” the two ‘L’s look like they are slightly above the baseline of letter ‘A’ and those two ‘L’s are not of the same width. Not all parts of gnome exhibit this problem, Nautilus file list is OK but Nautilus menus and preferences pages are a mess.
