How to put Pinyin with tone marks on Web pages.

Displaying Pinyin without tone marks needs no explanation, because Pinyin uses no letters not found in English other than ü, which is coded as &uuml;. (Moreover, ü is seldom needed in Pinyin.) But if you want to display tone marks, and in many cases you should, then Unicode is the way to go.

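In order to have a Web site with tonal Pinyin work across as many operating systems and browsers as possible, you need to use all of the following: a proper "charset" declaration, a CSS font-family declaration for the Pinyin text, and optimal coding for individual characters.

In addition, I have two recommendations: avoid bold in Pinyin text, and avoid italic in Pinyin text.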

proper "charset" declaration

Pages with tonal Pinyin need to be in Unicode, not in Big5 (used primarily in Taiwan) or GuoBiao (used primarily in China). Using any "charset" other than utf-8 (Unicode) is asking for trouble.

The only way in which utf-8 could be a problem is if your page uses rare characters that appear in Big5 or GB but do not yet appear in Unicode.

Here's what you should have in your code in the "head" of your Web page's HTML:

<meta http-equiv="Content-Type"
      content="text/html; charset=utf-8" />

Note: If your page contains Chinese characters and has a charset other than utf-8, simply changing the declaration (for example, from "big5" to "utf-8") is not enough to solve your problem; you will also need to convert the Chinese characters themselves to Unicode or to Unicode numeric character references (NCRs).
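As a rough illustration of the note above, here is a small page declared as utf-8 in which the same two Chinese characters appear once as literal Unicode text and once as decimal NCRs. The HTML5-style doctype, the title, and the sample characters are assumptions made for this example only.

```html
<!DOCTYPE html>
<html>
<head>
  <!-- Declare the page as Unicode (utf-8), not big5 or a GB charset -->
  <meta http-equiv="Content-Type"
        content="text/html; charset=utf-8" />
  <title>Charset example</title>
</head>
<body>
  <!-- Literal characters: these display correctly only if the file
       itself is saved as UTF-8 -->
  <p>漢語</p>
  <!-- The same two characters written as decimal NCRs, which are
       plain ASCII in the source file -->
  <p>&#28450;&#35486;</p>
</body>
</html>
```

If the first paragraph shows garbage while the second looks right, the file was most likely saved in Big5 or GB rather than UTF-8, which is exactly the mismatch the note describes.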

Tonal Pinyin and CSS

CSS is the best thing to hit the Web in years. If you're a webmaster and don't know CSS, it's time to learn. The basics are simple, and even these alone will change for the better the way you approach making Web sites.

Because not all fonts have the necessary characters, if you want to put tonal Pinyin on a Web page you should include a font-family declaration in your style sheet. Here's what I use:

.py {
   font-family: "arial unicode ms", "lucida sans unicode",
         sans-serif !important;
   /* This line and the one above are an IE hack: since IE doesn't
      recognize 'important', it uses this second declaration instead. */
   font-family: serif;
}

Actually, I can probably come up with a better hack than that, since Firefox now behaves largely as IE does (at least in the good ways).

Accordingly, Pinyin text in your Web page needs to be assigned to a class. I use "py". Thus, to have "Hànyǔ Pīnyīn" in the middle of a phrase, you would use:

to have "<span class="py">H&agrave;ny&#468; P&#299;ny&#299;n</span>" in the middle of a phrase....
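Putting the pieces together, a minimal page might look something like the sketch below. It is only an illustration: the doctype, the title, and the sample sentence are assumptions, and the "py" rule is a simplified version of the one above, without the IE hack.

```html
<!DOCTYPE html>
<html>
<head>
  <meta http-equiv="Content-Type"
        content="text/html; charset=utf-8" />
  <title>Tonal Pinyin example</title>
  <style type="text/css">
    /* Simplified "py" class: fonts that include the tone-marked letters */
    .py {
      font-family: "arial unicode ms", "lucida sans unicode", sans-serif;
    }
  </style>
</head>
<body>
  <!-- Only the Pinyin itself is wrapped in the span; the rest of the
       sentence keeps the page's normal font -->
  <p>This phrase has
     <span class="py">H&agrave;ny&#468; P&#299;ny&#299;n</span>
     in the middle of it.</p>
</body>
</html>
```

Keeping the class on a span rather than on the whole paragraph means the Unicode fonts are requested only where they are actually needed.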

optimal coding for individual characters

This part can get a little tricky. See my test charts for tonal Pinyin in Unicode Web pages.

Guidelines:

Additional recommendations

Avoid bold in Pinyin text

Another tip: Avoid using bold, strong, or a CSS font-weight above 500 if your text has any umlauts (ü), because in most of the fonts needed for tonal Pinyin the dots will merge together (compare standard weight, ü ǖ ǘ ǚ ǜ, and bold, ü ǖ ǘ ǚ ǜ). For bold-looking text, make the relatively heavy-looking Lucida Sans Unicode, rather than the thinner Arial Unicode MS, the first choice in your font-family declaration.
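One way this might look in practice is sketched below. The class name "py-heavy" is hypothetical, invented for this example; the idea is simply to swap the order of the two fonts and keep the weight normal.

```html
<style type="text/css">
  /* Hypothetical class for Pinyin that should look heavier without bold */
  .py-heavy {
    /* Lucida Sans Unicode listed first because it looks heavier */
    font-family: "lucida sans unicode", "arial unicode ms", sans-serif;
    /* Stay at normal weight so the dots on the umlauts do not merge */
    font-weight: normal;
  }
</style>

<p>Heavier-looking Pinyin with umlauts:
   <span class="py-heavy">n&#474;, l&#476;</span></p>
```

Setting font-weight explicitly also keeps the text at normal weight if the span ends up inside a bold or strong element.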

Avoid italic in Pinyin text

Italic works a little better than bold; but it is still best avoided, especially for long runs of text.

Tonal Pinyin in italic:

ĀÁǍÀ ĒÉĚÈ ĪÍǏÌ ŌÓǑÒ ŪÚǓÙ ǕǗǙǛ NĀNÁNǍNÀN NĒNÉNĚNÈN NĪNÍNǏNÌN NŌNÓNǑNÒN NŪNÚNǓNÙN NǕNǗNǙNǛN NNĀNNÁNNǍNNÀNN NNĒNNÉNNĚNNÈNN NNĪNNÍNNǏNNÌNN NNŌNNÓNNǑNNÒNN NNŪNNÚNNǓNNÙNN NNǕNNǗNNǙNNǛNN āáǎà ēéěè īíǐì ōóǒò ūúǔù ǖǘǚǜ nānánǎnàn nēnéněnèn nīnínǐnìn nōnónǒnòn nūnúnǔnùn nǖnǘnǚnǜn nnānnánnǎnnànn nnēnnénněnnènn nnīnnínnǐnnìnn nnōnnónnǒnnònn nnūnnúnnǔnnùnn nnǖnnǘnnǚnnǜnn

Last updated: March 13, 2005