
Unicode for the Universe:

How To Put Thøse Weïrd Chæracters
on Your Web Page

"There are more things in heaven and earth, Qwerty, than are dreamt of in your keyboard."

Today's lesson: getting more mileage out of your character set; how to stop typing "I <3 something"; fün wïth ümläüts.


Once upon a time, computers came with a mere 256 characters. That was more than enough to cover the alphabet (upper- and lowercase), punctuation, and the accent marks used by freedom-loving people like the Belgians. It didn't have all the weird letters used by the Poles or Hungarians, but screw them, because they were godless Communists.

Soon, however, our Eastern European brethren threw off their oppressive yoke and embraced Pepsi and American PCs, and the weird brown and yellow people seemed to want to be able to type their stuff, too, and so we got ISO Latin and then Unicode on our computers.

Sadly, many folks still seem stuck with the old-school 256. They resort to the colon for an umlaut (Fra:ulein) or an apostrophe for an accent mark (resume'). Tragic, because the common fonts like Times New Roman, Arial, and Courier (that is, the ones you find on the Web) all support Unicode nowadays.

Well, maybe not all of Unicode; it's really long, and the folks who made your browser and fonts didn't feel it necessary to implement the weird upside-down J used by the Upper West Basques or whatever. But they've supported it enough to cover your needs. And hey, if your browser can't handle the character, it'll usually display it as a harmless box or question mark.


Web shortcuts: the lovely gang at the W3C have defined a bunch of easy shortcuts for wacky characters. For instance, to get an "a" with an umlaut over it, you type &auml; into your text. That's an ampersand to say "this is a special character", then an "a", then "uml" for umlaut, and then a semicolon to say "my special character is done now".

If you want a capital "A", use a capital A so that you get &Auml; instead. This works for most vowels. (Sadly, it does not work for the letter "n", so you cannot yet spell "Spinal Tap" correctly.)

For acute and grave accent marks (as in French and Spanish) like á or à, use &aacute; or &agrave;. Again, capitalize the letter to get a capital. Or perhaps you've been bitten by a møøse using &oslash;.
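If you'd like to check what an entity turns into before pasting it into a page, here is a minimal sketch using Python's standard `html` module. The sample list is just the entities mentioned above, and the variable names are made up for illustration. Note how the case of the letter in the entity name picks the case of the result.

```python
import html

# Entities from the text above; names are case-sensitive.
samples = ["&auml;", "&Auml;", "&aacute;", "&agrave;", "&oslash;"]

# html.unescape turns each entity into the character a browser would show.
decoded = [html.unescape(s) for s in samples]
print(decoded)  # ['ä', 'Ä', 'á', 'à', 'ø']

# There is no "&numl;" entity, so it passes through untouched.
spinal = html.unescape("Spi&numl;al Tap")
print(spinal)  # Spi&numl;al Tap
```

This only shows what the entities decode to; a browser does the same job for you when it renders the page.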

There are a crapload of handy "character entities", as they are known, like ® (&reg;), © (&copy;), † (&dagger;) and € (&euro;). There's also the full Greek alphabet, so that I can properly type ΑΔΦ quite easily as &Alpha;&Delta;&Phi;. And I can say "I ♥ you" with &hearts;. Or "I ♦ you", "I ♠ you", or "I ♣ you" if you're the type who prefers &diams;, &spades; or &clubs;.
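Going the other way, you can let a script pick the entity names for you. The sketch below assumes Python's standard `html.entities.codepoint2name` table (which covers the HTML 4 names, including the card suits and Greek letters); the function name `to_named_entities` is hypothetical, not part of any library.

```python
from html.entities import codepoint2name

def to_named_entities(text):
    """Replace each non-ASCII character with its named entity, if one exists."""
    out = []
    for ch in text:
        name = codepoint2name.get(ord(ch))
        if ord(ch) > 127 and name:
            out.append(f"&{name};")
        else:
            # Plain ASCII, or a character with no HTML 4 name: leave it alone.
            out.append(ch)
    return "".join(out)

print(to_named_entities("I \u2665 you"))         # I &hearts; you
print(to_named_entities("\u0391\u0394\u03a6"))   # &Alpha;&Delta;&Phi;
```

Characters without a name in that table are left as they are, which is where the numeric form comes in handy.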

Stop typing "--" and use en dashes (–) and em dashes (—) instead with &ndash; and &mdash;.

Find out more with these quick lists of HTML entities.


Unicode: Unicode is a massively comprehensive list of every letter you'd ever want, like a Scrabble set designed by an obsessive-compulsive. Characters are indicated with a number. On the web, you type these as &#NNNN; where NNNN is the number in decimal, or as &#xHHHH; where HHHH is the same number in hexadecimal.
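To find the number for a character you already have on hand, a rough sketch in Python follows. The helper name `to_numeric_reference` is invented for this example; `ord` and `html.unescape` are standard Python.

```python
import html

def to_numeric_reference(ch):
    """Build the decimal &#NNNN; reference for a single character."""
    return f"&#{ord(ch)};"

note = "\u266b"  # the beamed eighth notes character
print(to_numeric_reference(note))  # &#9835;

# Decimal and hexadecimal references decode to the same character.
from_decimal = html.unescape("&#9835;")
from_hex = html.unescape("&#x266B;")
print(from_decimal == from_hex == note)  # True
```

The decimal form is what the lists of numbers usually show; the hex form matches the U+266B style used in Unicode charts.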

Check out this massive list of what numbers generate what. Boggle at the number of Korean, Chinese and Japanese characters therein; rejoice that you speak a language with 26 letters and no diacriticals.

Observe, my friends, how your favorite fonts contain characters for drawing boxes, chess pieces, musical notes and zodiac signs (if you have no faith in science or reason), along with a lot of what used to exist as a separate font called "Wingdings" or "Dingbats". (IE users may find these unsupported; yet another reason to use Firefox.)

So your blog can now say "♫ Happy birthday to me! ♬" if you are so inclined. Or "Happy ", whatever the hell that thing is. You can ask for gifts from Toys Я Us.

Unicode does not support Klingon or Tolkien's Elvish, because there is a God.

So cut loose! Discuss poker (♥ ♦ ♠ ♣) or even Lucky Charms ( ). Blog in Polish using the słashes ŧhrough ŧhe consonanŧs. I'm sure you'll ♥ it.