Extended glyphs

From sona pona, the Toki Pona wiki

In sitelen pona, extended glyphs or long glyphs have a low horizontal line that extends under adjacent words.

Long pi

While not universal, an extended form of the word pi (pi) is very common. The low line continues under all glyphs in the pi phrase: pi(ijo ijo)

ma-pona pi(toki-pona)

ma pona pi toki pona

Nested pi

Caution: The subject of this section is nonstandard and will not be understood by most speakers.
If you are a learner, this information will not help you speak the language. It is recommended to familiarize yourself with the standard style, and to be informed and selective about which nonstandard styles you adopt.

Long pi makes it possible to visually represent nested pi phrases, a contentious grammatical structure that is ambiguous in the Latin script and in other modalities.

The extension of the outer pi would be wrapped under that of the inner pi. Because current sitelen pona fonts do not support nested extensions, here is an approximation with the inner pi unextended:

pi(ijo ijopiijo ijo)

This is not commonly used. Proficient speakers generally avoid phrasings that use multiple pi, and when they do use them, they do not necessarily distinguish between nested and unnested versions of extended pi.

Other extended glyphs

More recently, some speakers extend other characters:

  • a, typically without putting other characters on the extension line: a( ). This is meant to distinguish stretched a (aaaaa) from repeated a (a a a), although other solutions have been proposed.
    • n, similarly used across extended hesitations in speech: n( )
  • ala (ala) across the question-marking pattern X ala X: {ijo}ala(ijo)
  • anu (anu)
  • Most prepositions across their phrases, by analogy with pi phrases
    • kepeken (kepeken): kepeken(ijo)
    • lon (lon): lon(ijo)
    • tawa (tawa): tawa(ijo)
  • awen (awen) and kama (kama), by visual similarity to tawa: {ijo}awen(ijo) {ijo}kama

Long lon

Some speakers also use an underline on its own to mark a prepositional phrase with lon. This can be analyzed as omitting the dot from its glyph (lon), or merging it with the glyph above.

For example, mi lon tomo sona, normally written mi lon tomo-sona, would become mi (tomo-sona).

This style may stem from fonts that require pi to be manually extended under each character, making it easy to insert a low line not attached to any pi.

Text encoding

In the UCSUR, the following codepoints are assigned to extended glyph control characters:

  • 󱦓 U+F1993 SITELEN PONA START OF LONG PI
  • 󱦔 U+F1994 SITELEN PONA COMBINING LONG PI EXTENSION
  • 󱦗 U+F1997 SITELEN PONA START OF LONG GLYPH
  • 󱦘 U+F1998 SITELEN PONA END OF LONG GLYPH
  • 󱦙 U+F1999 SITELEN PONA COMBINING LONG GLYPH EXTENSION
  • 󱦚 U+F199A SITELEN PONA START OF REVERSE LONG GLYPH
  • 󱦛 U+F199B SITELEN PONA END OF REVERSE LONG GLYPH
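As a minimal sketch, the codepoints listed above can be put in a lookup table to locate extended glyph control characters in a string. The function name find_controls is hypothetical, and the sample uses Latin-script placeholders for the word glyphs, since their codepoints are not listed in this article.

```python
# Extended-glyph control characters, copied from the UCSUR list above.
LONG_GLYPH_CONTROLS = {
    0xF1993: "SITELEN PONA START OF LONG PI",
    0xF1994: "SITELEN PONA COMBINING LONG PI EXTENSION",
    0xF1997: "SITELEN PONA START OF LONG GLYPH",
    0xF1998: "SITELEN PONA END OF LONG GLYPH",
    0xF1999: "SITELEN PONA COMBINING LONG GLYPH EXTENSION",
    0xF199A: "SITELEN PONA START OF REVERSE LONG GLYPH",
    0xF199B: "SITELEN PONA END OF REVERSE LONG GLYPH",
}


def find_controls(text):
    """Return (index, 'U+XXXXX', name) for each control character in text.

    Hypothetical helper for illustration; indices count code points.
    """
    return [
        (i, f"U+{ord(ch):05X}", LONG_GLYPH_CONTROLS[ord(ch)])
        for i, ch in enumerate(text)
        if ord(ch) in LONG_GLYPH_CONTROLS
    ]


# Placeholder text: "pi" and "toki pona" stand in for the actual word glyphs.
sample = "pi" + chr(0xF1997) + "toki pona" + chr(0xF1998)
for index, codepoint, name in find_controls(sample):
    print(index, codepoint, name)
```

Because these codepoints lie in a private use area, whether they display as anything depends entirely on the font in use.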

Discouraged encodings

The "start of long pi", "combining long pi extension", and "combining long glyph extension" (as well as the "combining cartouche extension" that precedes the list above) are present for compatibility with fonts and environments without OpenType support. Such fonts require special, character-by-character input to produce an extension, such as typing an underscore before every glyph that should have an extension line below it. These encodings are now discouraged.[1]

It is expected that proper Unicode support would deprecate these discouraged encodings in favor of the "start/end of (reverse) long glyph", with automatic OpenType extensions generated across all of the glyphs between.

Reverse extensions

The "reverse" versions exist to disambiguate whether the extension line should connect to the glyph before it, the glyph after it, or both, when all three are possible:

tawa(ma) ala: tawa, start of long glyph, ma, end of long glyph, ala
tawa {ma}ala: tawa, start of reverse long glyph, ma, end of reverse long glyph, ala
tawa(ma}ala: tawa, start of long glyph, ma, end of reverse long glyph, ala
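The three sequences above could be assembled as in the following sketch. The helper names are hypothetical, and the words tawa, ma, and ala are Latin-script placeholders for the actual glyph characters; only the four control codepoints come from the list under Text encoding.

```python
# Control characters from the UCSUR list under "Text encoding".
START_LONG = chr(0xF1997)     # START OF LONG GLYPH
END_LONG = chr(0xF1998)       # END OF LONG GLYPH
START_REVERSE = chr(0xF199A)  # START OF REVERSE LONG GLYPH
END_REVERSE = chr(0xF199B)    # END OF REVERSE LONG GLYPH


def attach_to_previous(inner):
    """Extension line connects to the glyph before it: tawa(ma) ala."""
    return START_LONG + inner + END_LONG


def attach_to_next(inner):
    """Extension line connects to the glyph after it: tawa {ma}ala."""
    return START_REVERSE + inner + END_REVERSE


def attach_to_both(inner):
    """Extension line connects to both neighbours: tawa(ma}ala."""
    return START_LONG + inner + END_REVERSE


# Placeholders standing in for the sitelen pona glyph characters.
TAWA, MA, ALA = "tawa", "ma", "ala"

forward = TAWA + attach_to_previous(MA) + ALA
reverse = TAWA + attach_to_next(MA) + ALA
both = TAWA + attach_to_both(MA) + ALA
```

In every case the glyph sequence is the same; only the pair of control characters around ma changes, which is what lets a font decide where to draw the connection.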

Zero-width extensions

Some fonts, such as sitelen seli kiwen, allow an extension line to be set up that spans no glyphs, ending immediately after it starts. This can be used to write implied extended glyphs, where a preposition is adjusted so that it would connect to an extension line, as in tawa() mi (implying tawa(mi)). Much like explicit extended glyphs, these variants can clarify a word being used as a preposition rather than a modifier.

Forward–reverse zero-width extensions can join multiple consecutive extendable glyphs at the bottom, as in ala(}ala(}ala, tawa(}awen(}kama, or tawa(}mi.
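A zero-width extension might be encoded as in the sketch below. It assumes that "ending immediately after it starts" means a start control followed directly by an end control, and that the "(}" notation corresponds to START OF LONG GLYPH followed by END OF REVERSE LONG GLYPH, by analogy with tawa(ma}ala. Both points are inferences, the helper names are hypothetical, and rendering depends on the font. Latin-script placeholders again stand in for the glyph characters.

```python
START_LONG = chr(0xF1997)   # START OF LONG GLYPH
END_LONG = chr(0xF1998)     # END OF LONG GLYPH
END_REVERSE = chr(0xF199B)  # END OF REVERSE LONG GLYPH

# Assumption: a zero-width extension is a start immediately followed by an end.
ZERO_WIDTH_FORWARD = START_LONG + END_LONG  # written "()" in this article
ZERO_WIDTH_JOIN = START_LONG + END_REVERSE  # written "(}" in this article


def implied_extension(head):
    """Mark head as extended without spanning any glyphs, as in tawa() mi."""
    return head + ZERO_WIDTH_FORWARD


def join_glyphs(glyphs):
    """Join consecutive extendable glyphs at the bottom, as in ala(}ala(}ala."""
    return ZERO_WIDTH_JOIN.join(glyphs)


# Placeholders standing in for the sitelen pona glyph characters.
implied = implied_extension("tawa") + "mi"
joined = join_glyphs(["tawa", "awen", "kama"])
```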

References

  1. jan Lepeka, jan Tepo. "Sitelen Pona: U+F1900 - U+F19FF". Kreative Korp. Retrieved 8 November 2023.