Text compression

From sona pona, the Toki Pona wiki
Revision as of 23:57, 10 January 2024 by Menasewi

Toki Pona's small size has attracted interest in text compression techniques. Some writing systems are created expressly for this purpose.

In March 2010, wanting to fit Toki Pona text into fewer characters on Twitter, jan Mato collated several potential lossy and lossless compression schemes. Of the options presented, Toki Pona Script, which is lossless, was noted as having the best compression ratio.[1] Owing to poor Unicode support for Toki Pona Script at the time, jan Josan and jan Mato created a sitelen Kansi character set in July of that year.[2] Later equivalents to Toki Pona Script include the Sitelen Pona UCSUR block and the sitelen Emosi writing systems, which likewise use only one Unicode character per word.
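
The one-character-per-word idea can be illustrated with a minimal Python sketch. The code points below are arbitrary Private Use Area slots chosen for illustration; they are not the actual Sitelen Pona UCSUR or sitelen Emosi assignments, and the function names are hypothetical. The sketch assumes lowercase words separated by single spaces, with no punctuation.

```python
# Hypothetical codebook: arbitrary Private Use Area characters,
# NOT the real Sitelen Pona UCSUR or sitelen Emosi code points.
PUA_START = 0xE000


def build_codebook(words):
    """Assign each distinct word exactly one Private Use Area character."""
    vocab = sorted(set(words))
    return {word: chr(PUA_START + i) for i, word in enumerate(vocab)}


def compress(text, codebook):
    """Replace every space-separated word with its single character."""
    return "".join(codebook[word] for word in text.split())


def decompress(packed, codebook):
    """Invert the codebook and rejoin the words with single spaces."""
    reverse = {char: word for word, char in codebook.items()}
    return " ".join(reverse[char] for char in packed)


text = "toki pona li pona tawa mi"
codebook = build_codebook(text.split())
packed = compress(text, codebook)

assert decompress(packed, codebook) == text  # lossless for this input
print(len(text), len(packed))  # 25 6
```

Here the count is in characters, which is what mattered for the Twitter limit; the size in bytes depends on the encoding used.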

jan Misali's ASCII syllabary allows each syllable to be reduced to 7 bits. Any punctuation would be lost upon conversion into this system, and there is no recommendation for how to mark proper names.
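
Why 7 bits are enough can be checked by enumerating the syllables. The following Python sketch assumes the usual Toki Pona syllable shape (optional consonant, vowel, optional final n) with ji, wu, wo and ti excluded. The numbering it assigns is hypothetical and is not jan Misali's actual character assignment.

```python
import re

ONSETS = ["", "p", "t", "k", "s", "m", "n", "l", "j", "w"]
VOWELS = "aeiou"
FORBIDDEN = {"ji", "wu", "wo", "ti"}  # disallowed consonant-vowel pairs

# Every legal syllable, with and without the final nasal.
SYLLABLES = [
    onset + vowel + coda
    for onset in ONSETS
    for vowel in VOWELS
    if onset + vowel not in FORBIDDEN
    for coda in ("", "n")
]
# Hypothetical numbering; the real syllabary maps to ASCII characters.
INDEX = {syllable: i for i, syllable in enumerate(SYLLABLES)}

# An "n" closes a syllable only when no vowel follows it.
SYLLABLE_RE = re.compile(r"[ptksmnljw]?[aeiou](?:n(?![aeiou]))?")


def syllabify(word):
    """Split one lowercase word into its syllables."""
    return SYLLABLE_RE.findall(word)


def encode(text):
    """Return one 7-bit code per syllable; punctuation and case are dropped."""
    codes = []
    for word in text.lower().split():
        letters = re.sub(r"[^a-z]", "", word)
        codes.extend(INDEX[s] for s in syllabify(letters))
    return codes


print(len(SYLLABLES))  # 92
assert max(INDEX.values()) < 2 ** 7
```

With 92 possible syllables, every code fits below 128. As in the description above, the sketch discards punctuation and capitalization, so a proper name such as "Sonja" becomes indistinguishable from an ordinary word; it also drops word boundaries, which is a simplification of this sketch only.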

References

  1. [janMato (original poster), zeme]. (19 March 2010). "Best compression for toki lili?". Toki Pona Forums. Retrieved 10 January 2024.
  2. [janMato (original poster), janKipo, jan Josan, et al.]. (18 July 2010). "toki pona in chinese/kanji?". Toki Pona Forums. Retrieved 10 January 2024.