Recently, I've been thinking that domain-specific one-byte code pages are nicer than troublesome UTF-8 decoding or dealing with the atrocious mess of Unicode: see homoglyph attacks, zalgo text, ZWJ-combined pictographs, etc. The only real upside I see to Unicode is that it's nearly universally supported and allows uniformly encoded, multi-language text.
ASCII was designed at a time when combining characters were a useful hack and out-of-band communication with the teletype wasn't an option. Does anyone still use the ASCII separator, device control, or vertical tab characters for what they were designed for?
# This is written sort of like a Unicode mapping.
# '#' signs are comments, the left hex number is the source byte,
# and the right hex number is the Unicode equivalent of that character.

# Shifted subset of ASCII
0x00 0x0020 # Space
0x01 0x0021 # Exclamation mark
0x02 0x0022 # Double quotation mark
0x03 0x0023 # Number sign
0x04 0x00A4 # Currency sign
0x05 0x0025 # Percent sign
0x06 0x0026 # Ampersand
0x07 0x0027 # Apostrophe, Right single quotation mark
0x08 0x0028 # Left parenthesis
0x09 0x0029 # Right parenthesis
0x0A 0x002A # Asterisk
0x0B 0x002B # Plus sign
0x0C 0x002C # Comma
0x0D 0x002D # Hyphen, Minus
0x0E 0x002E # Full stop
0x0F 0x002F # Solidus, Forward slash
0x10 0x0030 # Digit Zero
0x11 0x0031 # Digit One
0x12 0x0032 # Digit Two
0x13 0x0033 # Digit Three
0x14 0x0034 # Digit Four
0x15 0x0035 # Digit Five
0x16 0x0036 # Digit Six
0x17 0x0037 # Digit Seven
0x18 0x0038 # Digit Eight
0x19 0x0039 # Digit Nine
0x1A 0x003A # Colon
0x1B 0x003B # Semicolon
0x1C 0x003C # Less-than sign, Left angle-bracket
0x1D 0x003D # Equals sign
0x1E 0x003E # Greater-than sign, Right angle-bracket
0x1F 0x003F # Question Mark
0x20 0x0040 # 'At' sign
0x21 0x0041 # Latin uppercase 'A'
0x22 0x0042 # Latin uppercase 'B'
0x23 0x0043 # Latin uppercase 'C'
0x24 0x0044 # Latin uppercase 'D'
0x25 0x0045 # Latin uppercase 'E'
0x26 0x0046 # Latin uppercase 'F'
0x27 0x0047 # Latin uppercase 'G'
0x28 0x0048 # Latin uppercase 'H'
0x29 0x0049 # Latin uppercase 'I'
0x2A 0x004A # Latin uppercase 'J'
0x2B 0x004B # Latin uppercase 'K'
0x2C 0x004C # Latin uppercase 'L'
0x2D 0x004D # Latin uppercase 'M'
0x2E 0x004E # Latin uppercase 'N'
0x2F 0x004F # Latin uppercase 'O'
0x30 0x0050 # Latin uppercase 'P'
0x31 0x0051 # Latin uppercase 'Q'
0x32 0x0052 # Latin uppercase 'R'
0x33 0x0053 # Latin uppercase 'S'
0x34 0x0054 # Latin uppercase 'T'
0x35 0x0055 # Latin uppercase 'U'
0x36 0x0056 # Latin uppercase 'V'
0x37 0x0057 # Latin uppercase 'W'
0x38 0x0058 # Latin uppercase 'X'
0x39 0x0059 # Latin uppercase 'Y'
0x3A 0x005A # Latin uppercase 'Z'
0x3B 0x005B # Left square bracket
0x3C 0x005C # Reverse solidus, Backslash
0x3D 0x005D # Right square bracket
0x3E 0x005E # Caret
0x3F 0x005F # Underscore
0x40 0x0060 # Backtick, Left single quotation mark
0x41 0x0061 # Latin lowercase 'a'
0x42 0x0062 # Latin lowercase 'b'
0x43 0x0063 # Latin lowercase 'c'
0x44 0x0064 # Latin lowercase 'd'
0x45 0x0065 # Latin lowercase 'e'
0x46 0x0066 # Latin lowercase 'f'
0x47 0x0067 # Latin lowercase 'g'
0x48 0x0068 # Latin lowercase 'h'
0x49 0x0069 # Latin lowercase 'i'
0x4A 0x006A # Latin lowercase 'j'
0x4B 0x006B # Latin lowercase 'k'
0x4C 0x006C # Latin lowercase 'l'
0x4D 0x006D # Latin lowercase 'm'
0x4E 0x006E # Latin lowercase 'n'
0x4F 0x006F # Latin lowercase 'o'
0x50 0x0070 # Latin lowercase 'p'
0x51 0x0071 # Latin lowercase 'q'
0x52 0x0072 # Latin lowercase 'r'
0x53 0x0073 # Latin lowercase 's'
0x54 0x0074 # Latin lowercase 't'
0x55 0x0075 # Latin lowercase 'u'
0x56 0x0076 # Latin lowercase 'v'
0x57 0x0077 # Latin lowercase 'w'
0x58 0x0078 # Latin lowercase 'x'
0x59 0x0079 # Latin lowercase 'y'
0x5A 0x007A # Latin lowercase 'z'
0x5B 0x007B # Left curly bracket
0x5C 0x007C # Vertical line, Pipe
0x5D 0x007D # Right curly bracket
0x5E 0x007E # Tilde
0x5F 0x000A # Line feed, Newline

# Greek letters, intended for mathematical notation
0x60 0x03B1 # Greek lowercase Alpha
0x61 0x03B2 # Greek lowercase Beta
0x62 0x03B3 # Greek lowercase Gamma
0x63 0x03B4 # Greek lowercase Delta
0x64 0x03B8 # Greek lowercase Theta
0x65 0x03BB # Greek lowercase Lamda
0x66 0x03BC # Greek lowercase Mu
0x67 0x03C0 # Greek lowercase Pi
0x68 0x03C4 # Greek lowercase Tau
0x69 0x03C6 # Greek lowercase Phi
0x6A 0x03C8 # Greek lowercase Psi
0x6B 0x03C9 # Greek lowercase Omega
0x6C 0x0394 # Greek uppercase Delta
0x6D 0x03A0 # Greek uppercase Pi
0x6E 0x03A3 # Greek uppercase Sigma
0x6F 0x03A9 # Greek uppercase Omega
0x70 0x00A1 # Inverted exclamation mark
0x71 0x00BF # Inverted question mark
0x72 0x2022 # Black bullet
0x73 0x25E6 # White bullet
0x74 0x00D7 # Multiplication sign
0x75 0x00F7 # Division sign
0x76 0x221A # Square root
0x77 0x221E # Infinity sign
0x78 0x263A # Outlined smiley face
0x79 0x263B # Filled smiley face
0x7A 0x2665 # Heart suit
0x7B 0x2666 # Diamond suit
0x7C 0x2663 # Club suit
0x7D 0x2660 # Spade suit
0x7E 0x00A7 # Section sign
0x7F 0x2588 # Full block
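
The mapping format above (a source byte, then a Unicode code point, with '#' comments) is simple enough to load with one regular expression. Here is a rough Python sketch of how a decoder and encoder for it might look. The names `SAMPLE_TABLE`, `parse_table`, `decode`, and `encode` are my own placeholders, the embedded table is only a short excerpt, and the parser assumes that no comment ever contains a "0xNN 0xNNNN" pair of its own.

```python
import re

# A short excerpt of the table above; a real program would load the full file.
SAMPLE_TABLE = """\
# Shifted subset of ASCII
0x00 0x0020 # Space
0x01 0x0021 # Exclamation mark
0x28 0x0048 # Latin uppercase 'H'
0x49 0x0069 # Latin lowercase 'i'
0x5F 0x000A # Line feed, Newline
0x67 0x03C0 # Greek lowercase Pi
"""

# Matches "0xNN 0xNNNN": a source byte followed by its Unicode code point.
# Assumption: comment text never contains such a pair, so comments are
# skipped implicitly instead of being stripped first.
PAIR = re.compile(r"0x([0-9A-Fa-f]{2})\s+0x([0-9A-Fa-f]{4})\b")


def parse_table(text):
    """Return {source byte: Unicode character} for every pair found in text."""
    return {int(src, 16): chr(int(dst, 16)) for src, dst in PAIR.findall(text)}


def decode(data, table):
    """Decode bytes to str; bytes missing from the table become U+FFFD."""
    return "".join(table.get(b, "\ufffd") for b in data)


def encode(text, table):
    """Encode str to bytes; raises KeyError for characters not in the table."""
    reverse = {ch: b for b, ch in table.items()}
    return bytes(reverse[ch] for ch in text)


table = parse_table(SAMPLE_TABLE)
print(decode(bytes([0x28, 0x49, 0x01]), table))  # prints "Hi!"
```

Since every byte maps to exactly one character, decoding is a plain table lookup with no multi-byte state to track, which is the main appeal over UTF-8 here.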