* <<G12.0914>>

Unicode sucks. And so does ASCII. And everything in-between.

The problem with plain-text is that it isn't really an encoding for ~written language~, is it? What we call "plain-text" is a sequence of graphemes, numerals, punctuation (written human-language elements), and miscellaneous graphic symbols, interleaved with control codes for the operation of a teletypewriter.

CARRIAGE RETURN and LINE-FEED are, of course, not things you do with a pen or pencil – you may be willing to concede that your arm is a carriage, and that you mentally "feed" paper away from you as you move down the page, but I am not – nor are they things a compositor does with his composing stick, galleys, and formes.

***
-a family of devices that have their own characteristics quite outside those of the stylus, brush, or printing press.
***

Some of the ASCII control codes that should be used/thought-of as written-language codes instead of teletype codes:

01  SOH    Start-of-Heading
    Actually a useful semantic code. Headings are generally indicated by placement of text within a page. If we want to decouple the formatting from the language, we need markers like this to indicate when text is a heading, and when it is body text.

10  LF     Line Feed; should be New Line
    The UNIX \n "newline" character. That's what this should represent: a new line, not a 1-line paper feed operation.

12  FF     Form Feed; should be New Page

32  space; should be Word Separator

--
Excerpted from:

PUBLIC NOTES (G)
http://alph.laemeur.com/txt/PUBNOTES-G
©2016 Adam C. Moore (LÆMEUR) <adam@laemeur.com>