One area of improvement in this release is line breaking.
We don’t have TeX-style automatic hyphenation yet (although it may happen eventually). But at least Pango now inserts hyphens when it breaks a line in the middle of a word (for example, at a soft hyphen character).
This is something I have wanted to do for a very long time, so I am quite happy that switching to HarfBuzz for shaping on all platforms has finally enabled us to do this without too much effort.
Better line breaks
Pango follows Unicode UAX14 and UAX29 for finding word boundaries and line break opportunities. The algorithm described there is language-independent, but allows for language-specific tweaks. The Unicode standard calls this tailoring.
While Pango has had implementations for both the language-independent and -dependent parts before, we didn’t have them clearly separated in the API, until now.
In 1.44, we introduce a new pango_tailor_break() function which applies language-specific tweaks to a segment of text that has a uniform language. It is meant to be called after pango_default_break().
Line break control
Since my focus was on line-breaking already, I’ve added support for a text attribute to control line breaking. You can now say:
Don't break <span allow_breaks="false">here!</span>
in Pango markup, and Pango will obey.
In the hyphenation example above, the words showing possible hyphenation points (like im‧peachment) are marked up in this way.
Another area with significant changes is placement, both of lines and of individual glyphs.
Up to now, Pango has been placing the lines of a paragraph directly below each other, possibly with a fixed amount of spacing between them. While this works ok most of the time, a more typographically correct way to go about this is to control the baseline-to-baseline distance between lines.
Fonts contain a recommended value for this distance, so the first step was to make this value available with a new pango_font_metrics_get_height() API.
To make use of it, we added a new parameter to PangoLayout that tells it to place lines according to baseline-to-baseline distance. Once we had this, it was very easy to turn the parameter into a floating point number and allow things like double-spaced lines, by saying
pango_layout_set_line_spacing (layout, 2.0)
You can still use the old way of spacing if you set line-spacing to 0.
Pango no longer rounds glyph positions and font metrics to integral pixel numbers. This lets consumers of the formatted glyphs (basically, implementations of PangoRenderer) decide for themselves if they want to place glyphs at subpixel positions or pixel-aligned.
The cairo renderer in libpangocairo will do subpixel positioning, but you need cairo master for best results. GTK master will soon have the necessary changes to take advantage of it for its GL and Vulkan renderers too.
This is likely one of the more controversial changes in this release, since any change to font rendering causes strong reactions. One of the reasons for doing the release now is that it gives us enough time to make sure it works ok for all users of Pango before going out in the next round of upstream and distro releases in the fall.
Finally, I spent some time implementing some long-requested features around missing glyphs, and their rendering as hex boxes. These are also known as tofu (which is the origin of the name for the Noto fonts – ‘no tofu’).
Some fonts don’t have a glyph for the space character – after all, there is nothing to draw. In the past, Pango would sometimes draw a hex box in this case. This is entirely unnecessary – we can just leave a gap of the right size and pretend that nothing happened. Pango 1.44 will do just that: no more hex boxes for space.
On the other hand, sometimes you do want to see where spaces and other whitespace characters, such as tabs, are. We’ve added an attribute that lets you request visible rendering of whitespace:
<span show="spaces">Some space here</span>
This is implemented in the cairo backend, so you will need to use pangocairo to see it.
In the same vein, sometimes it is helpful to see special characters such as left-to-right controls in the output. Unicode calls these characters default-ignorable.
The show attribute also lets you make default-ignorables visible:
<span show="ignorables">Hidden treasures</span>
As you can see, we use nicknames for ignorables.
Pango has been shipping a simple tool called pango-list for a while. It produces a list of all the fonts Pango can find. This can be very helpful in tracking down changes between systems that are caused by differences in the available fonts.
In 1.44, pango-list can optionally show font metrics and variation axes as well. This may be a little obscure, but it has helped me fix the CI tests for Pango.
This release contains a significant amount of change; I’ve closed a good number of ‘teenage’ bugs while working on it. Please let us know if you see problems or unexpected changes with it!