Monday, October 13, 2003

No, I don't want a T-shirt of this. It's incomplete. It omits the secret characters for codes 0x07-0x0A and 0x0D. It is apparently somewhat difficult to get Windows to display these, which probably explains their absence. Notepad, for instance, will display 7 and 8 if you use the Terminal font, but the rest are interpreted as literal control characters, which is of course what you would expect. If a program emits any of these characters to stdout, all of them are interpreted as controls and no characters are displayed. Which leads me to believe that the aforementioned GIF is a screenshot of such a program running in a console window.

But they exist, and they have official Unicode mappings. See this document, which identifies the missing characters as:

• 2022 07 -- # BULLET
◘ 25D8 08 -- # INVERSE BULLET
○ 25CB 09 -- # WHITE CIRCLE
◙ 25D9 0A -- # INVERSE WHITE CIRCLE
♪ 266A 0D 02 # EIGHTH NOTE
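
For anyone who wants to check the list above on their own machine, here is a minimal sketch in Python. The code points are copied from the list; the control names (BEL, BS, HT, LF, CR) are the usual ASCII ones, and `SECRET_GLYPHS` and `describe` are names made up for this illustration. The table is spelled out by hand because Python's built-in `cp437` codec, as far as I can tell, decodes these bytes to the control characters rather than to the glyphs.

```python
import unicodedata

# Byte value -> (Unicode code point from the list above, usual ASCII control name).
# SECRET_GLYPHS is a hypothetical name chosen for this sketch.
SECRET_GLYPHS = {
    0x07: (0x2022, "BEL"),
    0x08: (0x25D8, "BS"),
    0x09: (0x25CB, "HT"),
    0x0A: (0x25D9, "LF"),
    0x0D: (0x266A, "CR"),
}


def describe(code):
    """Return one line describing the glyph mapped to a control code."""
    cp, control = SECRET_GLYPHS[code]
    glyph = chr(cp)
    return f"0x{code:02X} {control:<3} -> U+{cp:04X} {glyph} {unicodedata.name(glyph)}"


if __name__ == "__main__":
    for code in sorted(SECRET_GLYPHS):
        print(describe(code))
```

Whether the glyphs render properly still depends on the font your terminal uses.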


The characters displayed above may or may not look like what was actually used on the IBM PC. I'm sitting here looking at pictures of the real deal in my Pink Shirt Book, and they are vaguely similar. On Windows, at least, the "Courier New" font does them better justice than our old friend Fixedsys.

Note that although the official Unicode name for U+266A is "eighth note", the actual glyph shown in the Pink Shirt Book looks more like a sixteenth note. Is this a genuine error in the mapping file supplied by Unicode.org?

The original way to display these characters was to poke them directly into video memory with BASIC or assembly language, skipping stdout altogether. Mr. Price obviously didn't have time to make a program to do that, and so far neither have I. I have decided to cheat by consulting the Unicode specs.
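
To illustrate the idea of poking characters into video memory without touching real hardware, here is a rough simulation in Python. It assumes the standard 80x25 text-mode layout, where each screen cell is a character byte followed by an attribute byte. The `vram` buffer is just a `bytearray`, and `poke_char`, `render_row`, and `GLYPHS` are hypothetical names for this sketch. It is not the BASIC or assembly program described above, only a model of what such a program would write.

```python
COLS, ROWS = 80, 25  # assumed standard text-mode dimensions

# Glyphs for the five "secret" codes, taken from the Unicode mapping listed earlier.
GLYPHS = {0x07: "\u2022", 0x08: "\u25D8", 0x09: "\u25CB", 0x0A: "\u25D9", 0x0D: "\u266A"}


def poke_char(vram, row, col, code, attr=0x07):
    """Write a character byte and its attribute byte into the simulated buffer."""
    offset = (row * COLS + col) * 2
    vram[offset] = code       # character byte
    vram[offset + 1] = attr   # attribute byte (0x07 = light gray on black)


def render_row(vram, row):
    """Turn one row of the simulated buffer into a printable string."""
    start = row * COLS * 2
    cells = vram[start:start + COLS * 2:2]  # every other byte is a character
    text = "".join(
        GLYPHS.get(b, chr(b) if 0x20 <= b < 0x7F else " ") for b in cells
    )
    return text.rstrip()


vram = bytearray(COLS * ROWS * 2)  # simulated screen, initially blank
for col, code in enumerate([0x07, 0x08, 0x09, 0x0A, 0x0D]):
    poke_char(vram, 0, col, code)

print(render_row(vram, 0))  # prints •◘○◙♪
```

Because the bytes go straight into the buffer and never pass through stdout, nothing gets a chance to interpret them as a bell, backspace, tab, line feed, or carriage return.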

Comments: