ASCII: Difference between revisions
No edit summary |
(Prepared the page for translation) |
Line 1:
<languages/>
{{Short description|Character encoding standard}}
{{lipu pona
|blurb='''[[Special:MyLanguage/ASCII|ASCII]]''' is a character encoding standard that features 95 printable characters, limited to the basic Latin alphabet and symbols on the United States keyboard layout. This character set is the basis of several [[Special:MyLanguage/Toki Pona|Toki Pona]] projects, due to its availability compared to specialized keyboards and [[Special:MyLanguage/input method|input method]]s.
}}
'''ASCII''' ('''American Standard Code for Information Interchange''') is a character encoding standard updated from 1963 to 1986, and a precursor to [[Special:MyLanguage/Unicode|Unicode]].
ASCII features only 128 codepoints, many of which are {{w|control character}}s, and its 95 {{w|printable character}}s are limited to the {{w|ISO basic Latin alphabet|basic Latin alphabet}} (the 26 letters also comprising the {{w|English alphabet}}) and {{w|Typographical symbol|symbols}} on the {{w|British and American keyboards|United States keyboard layout}}. This character set is the basis of several [[Special:MyLanguage/Toki Pona|Toki Pona]] projects, due to its availability compared to specialized keyboards and [[Special:MyLanguage/input method|input method]]s.
</translate>
{{Printable ASCII}}
<translate>
==In writing systems==
Multiple Toki Pona [[writing system]]s have been based on ASCII.
Multiple Toki Pona [[Special:MyLanguage/writing system|writing system]]s have been based on ASCII.
{{tp|[[sitelen akesi]]}} is an ASCII-compatible adaptation of {{tp|[[sitelen pona]]}}. Glyphs may be represented as one or more characters.
</translate>
{{tok|[[jan Misali]]}}'s [[toki pona ASCII syllabary]] assigns each [[phonotactic]]ally allowed [[syllable]] in Toki Pona to a single ASCII character.
{{tp|[[Special:MyLanguage/sitelen akesi|sitelen akesi]]}}
</translate>
{{tok|[[Special:MyLanguage/jan Misali|jan Misali]]}}
==In fonts==
===ASCII transcription===
<div style="float:right;margin-left:1em;">
{|class="wikitable" style="width:180px;"
|+style="font-size:smaller;"|Sample output of a [[Special:MyLanguage/sitelen pona font|{{tp|sitelen pona}} font]] with ASCII transcription
|-
!Rich text
Line 38 ⟶ 49:
|}
</div>
Many [[Special:MyLanguage/font|font]]s for original Toki Pona writing systems include a feature called '''ASCII transcription''',<ref>{{cite web|url=//docs.google.com/spreadsheets/d/1xwgTAxwgn4ZAc4DBnHte0cqta1aaxe112Wh1rv9w5Yk/preview|title={{tok|nimi Linku}}|author=|username=|date=|website=Google Sheets|publisher=|access-date=2024-01-17|quote=}}</ref> where a run of characters typed in {{tp|[[Special:MyLanguage/sitelen Lasina|sitelen Lasina]]}} is visually substituted with a single glyph. This simulates the effects of an {{abbr|IME|input method editor}} without the need to install [[Special:MyLanguage/software|software]] beyond the font, but unlike an IME, it does not actually replace the underlying text. Other ASCII characters are commonly used for additional features in the writing system or font, such as the hyphen (-) being used for {{tp|sitelen pona}} [[Special:MyLanguage/combined glyph|combined glyph]]s.
</translate>
[[File:Word default no ligatures.tiff|thumb|180px|<translate>''ASCII transcription'' works like automatic Latin ligatures, where multiple glyphs are visually substituted with one.</translate>]]
<translate>
This feature uses {{w|OpenType}} {{w|Ligature (writing)|ligatures}}. It works in the same way that some Latin-script fonts convert the letters <code>ffi</code> into a ligature that <em>looks</em> like the single glyph <code>ﬃ</code>, but is actually still three underlying characters.
Line 55 ⟶ 68:
|{{tok|<code>toki-pona</code>}}
|
*Falls back to legible {{tp|[[Special:MyLanguage/sitelen Lasina|sitelen Lasina]]}}
*Writing system not specified
*Less standardization for font features
|-
![[Special:MyLanguage/UCSUR|UCSUR]] encoding
|{{tok|1=<span class="sitelen-pona">󱥬‍󱥔</span>}}
|{{tok|<code>󱥬‍󱥔</code>}}
|
*Falls back to illegible {{w|.notdef|small rectangles}} (colloquially "tofu")
*Writing system specified as {{tp|[[Special:MyLanguage/sitelen pona|sitelen pona]]}}
*More standardization for font features
|}
Line 71 ⟶ 84:
==References==
<references />
</translate>
{{SP nav}}
{{Software}}
{{Fonts|collapsed=yes}}
[[Category:Encodings{{#translation:}}]]
Revision as of 11:35, 4 May 2024
ASCII (American Standard Code for Information Interchange) is a character encoding standard updated from 1963 to 1986, and a precursor to Unicode.
ASCII features only 128 codepoints, many of which are control characters, and its 95 printable characters are limited to the basic Latin alphabet (the 26 letters also comprising the English alphabet) and symbols on the United States keyboard layout. This character set is the basis of several Toki Pona projects, due to its availability compared to specialized keyboards and input methods.
| | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2x | SP | ! | " | # | $ | % | & | ' | ( | ) | * | + | , | - | . | / |
| 3x | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | : | ; | < | = | > | ? |
| 4x | @ | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O |
| 5x | P | Q | R | S | T | U | V | W | X | Y | Z | [ | \ | ] | ^ | _ |
| 6x | ` | a | b | c | d | e | f | g | h | i | j | k | l | m | n | o |
| 7x | p | q | r | s | t | u | v | w | x | y | z | { | \| | } | ~ | DEL |
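The layout of the table above can be reproduced from the code points alone. The following is a minimal Python sketch, not part of the original page: printable ASCII runs from 0x20 (space) to 0x7E (~), which gives 95 characters when the space is counted and 94 without it. The names `printable`, `control`, and `table_rows` are illustrative.

```python
# Printable ASCII occupies code points 0x20 (space) through 0x7E (~).
printable = [chr(cp) for cp in range(0x20, 0x7F)]

# Everything below 0x20, plus DEL (0x7F), is a control character.
control = [cp for cp in range(128) if cp < 0x20 or cp == 0x7F]


def table_rows():
    """Group code points 0x20-0x7F into rows of 16, keyed like the table ('2x'..'7x')."""
    labels = {0x20: "SP", 0x7F: "DEL"}  # cells shown by name rather than by glyph
    rows = {}
    for high in range(0x2, 0x8):
        cells = []
        for low in range(16):
            cp = high * 16 + low
            cells.append(labels.get(cp, chr(cp)))
        rows[f"{high:X}x"] = cells
    return rows


if __name__ == "__main__":
    print(len(printable), "printable,", len(control), "control")
    for label, cells in table_rows().items():
        print(label, " ".join(cells))
```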
In writing systems
Multiple Toki Pona writing systems have been based on ASCII.
sitelen akesi is an ASCII-compatible adaptation of sitelen pona. Glyphs may be represented as one or more characters.
jan Misali's toki pona ASCII syllabary assigns each phonotactically allowed syllable in Toki Pona to a single ASCII character.
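A one-character-per-syllable scheme is only possible if the syllable inventory is small enough. The Python sketch below counts the syllables under the phonotactic rules as they are commonly described (an assumption, not taken from this page): an optional onset consonant, one of five vowels, an optional final n, and the onset-vowel pairs ji, ti, wo, and wu excluded. It does not reproduce jan Misali's actual character assignments.

```python
from itertools import product

# Assumed phonotactics: (C)V(n), with four forbidden onset+vowel pairs.
ONSETS = ["", "j", "k", "l", "m", "n", "p", "s", "t", "w"]
VOWELS = "aeiou"
CODAS = ["", "n"]
FORBIDDEN = {"ji", "ti", "wo", "wu"}

syllables = [
    onset + vowel + coda
    for onset, vowel, coda in product(ONSETS, VOWELS, CODAS)
    if onset + vowel not in FORBIDDEN
]

# Printable ASCII characters other than the space (0x21 through 0x7E).
graphic_ascii = [chr(cp) for cp in range(0x21, 0x7F)]

if __name__ == "__main__":
    print(len(syllables), "syllables,", len(graphic_ascii), "non-space printable characters")
```

Under these assumptions the syllable count stays below the number of non-space printable characters, so each syllable can receive its own character.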
In fonts
ASCII transcription
Rich text | Plain text |
---|---|
m | m |
mu | mu |
mut | mut |
mute | mute |
Many fonts for original Toki Pona writing systems include a feature called ASCII transcription,[1] where a run of characters typed in sitelen Lasina is visually substituted with a single glyph. This simulates the effects of an IME without the need to install software beyond the font, but unlike an IME, it does not actually replace the underlying text. Other ASCII characters are commonly used for additional features in the writing system or font, such as the hyphen (-) being used for sitelen pona combined glyphs.
This feature uses OpenType ligatures. It works in the same way that some Latin-script fonts convert the letters ffi into a ligature that looks like the single glyph ﬃ, but is actually still three underlying characters.
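The substitution can be imitated in a few lines of Python. This is a rough sketch only: real fonts express these rules as OpenType ligature lookups, and the table `LIGATURES`, the bracketed glyph names, and the function `render` are hypothetical stand-ins. It assumes greedy longest-match substitution, which is one plausible way to get from `m` to `mu` to `mute` as the word is typed.

```python
# Hypothetical glyph table; a bracketed name stands in for a drawn glyph.
LIGATURES = {
    "mu": "[mu]",
    "mute": "[mute]",
    "toki": "[toki]",
    "pona": "[pona]",
}


def render(text):
    """Return the displayed glyphs for `text`; the text itself is never modified."""
    longest = max(len(key) for key in LIGATURES)
    shown = []
    i = 0
    while i < len(text):
        # Try the longest candidate run first, then progressively shorter ones.
        for size in range(min(longest, len(text) - i), 0, -1):
            chunk = text[i:i + size]
            if chunk in LIGATURES:
                shown.append(LIGATURES[chunk])
                i += size
                break
        else:
            # No ligature starts here: show the plain character.
            shown.append(text[i])
            i += 1
    return shown


if __name__ == "__main__":
    for typed in ["m", "mu", "mut", "mute"]:
        print(repr(typed), "->", render(typed))
```

In this sketch a misspelled word such as `mite` is shown letter by letter, since no entry matches it.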
As a result, when ASCII transcription is used, the underlying characters are still ASCII-compatible instead of being converted to other codepoints. This may or may not be desired. You can confirm the difference by copying the rich text below, and pasting it into a plain text environment such as Windows Notepad or a search bar.
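| | Rich text | Plain text | Comparison |
|---|---|---|---|
| ASCII transcription | toki-pona | toki-pona | Falls back to legible sitelen Lasina; writing system not specified; less standardization for font features |
| UCSUR encoding | 󱥬‍󱥔 | 󱥬‍󱥔 | Falls back to illegible small rectangles (colloquially "tofu"); writing system specified as sitelen pona; more standardization for font features |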
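The same comparison can be made programmatically by inspecting code points. The Python sketch below assumes that the UCSUR text consists of U+F196C and U+F1954 joined by a zero-width joiner (U+200D); these values are an assumption about the encoding and are not stated on this page. The helper `code_points` is illustrative.

```python
ascii_text = "toki-pona"
# Assumed UCSUR code points for the two glyphs, joined by U+200D.
ucsur_text = "\U000F196C\u200D\U000F1954"


def code_points(text):
    """List each character of `text` in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


if __name__ == "__main__":
    for label, text in [("ASCII transcription", ascii_text), ("UCSUR encoding", ucsur_text)]:
        print(label, "| ASCII only:", text.isascii(), "|", " ".join(code_points(text)))
```

Text typed with ASCII transcription stays within the 128 ASCII code points, while the UCSUR-encoded text lies outside them, which is why it shows as "tofu" when no suitable font is available.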
Fonts with ASCII transcription may help with memorizing words, as the substitution will only occur when the word is spelled correctly, and in logographies (such as sitelen pona) the resulting glyph often relates to the word's meaning.
References
1. "nimi Linku". Google Sheets. Retrieved 17 January 2024.