in two ways, really.
apparently rendering of kanji on ubuntu is really awful. each character looks different from the next. a dirty secret of pango comes out.
i attended behdad’s talk at guadec about how all of this works. pango decides which glyph will be used for a specific character on a character-at-a-time basis. this means that you can easily have different fonts chosen for different characters in the same string. what’s more, pango seems to like to pick from fonts with fewer glyphs (thinking that they’re more specific and therefore better): “less is more”.
the problem is that this is often an evil policy.
these fonts look quite different from each other. if you look at the screenshot, the right radical (“bird”) in each shot should look identical. they are very different.
sure enough, if you put these characters side by side in a string, you get one drawn in each font.
in this case, arial unicode (which i installed myself) covers all (or at least nearly all) kanji. however, on a default ubuntu install you have kochi gothic, which is used in preference. remove kochi gothic and you get baekmuk dotum instead.
if you remove both of these fonts, however, you’re left with just arial unicode (if you’re blessed enough to have this font). now things are beautiful. less is more.
i’m not up on all of this font stuff but why is it that we don’t have a big free font with all of the glyphs for every language in it? if it does exist then why isn’t ubuntu shipping it?
also, why can’t pango do a sort of prescan on the string, look at all of the characters in it, and try its hardest to pick a single font in which all of them exist, then use that font for every character? even if another font has higher quality glyphs for some of the characters, it seems more important that the string is displayed in a consistent font. would this be entirely too expensive?
of course, on the other side of this argument, you have the case of a single (for example) kanji character appearing inside a huge block of latin text and causing the latin to be rendered in a lower quality junk font that just happened to be included in the same file as the kanji…