Tatoeba

From sona pona, the Toki Pona wiki

Tatoeba is a multilingual database of sentences and their translations. All sentences are released under the CC BY 2.0 FR license, which allows reuse with attribution. Support for Toki Pona was first added on 13 November 2010,[1] initially under the unofficial code toki; the site switched to the ISO identifier (tok) on 31 January 2022.[2] As of March 2024, Tatoeba has over 56 000 sentences in Toki Pona, making Toki Pona the second-largest conlang on the site (trailing only Esperanto) and the 28th-largest language overall (above, for example, Modern Greek, Bulgarian, Indonesian, and Hindi).[3]

Functionality

Under construction: This section is empty. You can help us by adding to it.

Moderation

Sentences are published without prior approval, so some may be incorrect or may not follow commonly accepted Toki Pona usage. A sentence can be edited by its current owner or by a corpus maintainer; as of 22 March 2024, the sole corpus maintainer for Toki Pona is jan Tepan.[4] Advanced users can link sentences to, and unlink them from, equivalent sentences in other languages. A review feature is currently in beta and can be enabled in user settings.

Usage

Sentences from Tatoeba can be downloaded in bulk as TSV files, updated weekly.[5] The following Toki Pona tools are known to use the Tatoeba corpus:
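
As a rough illustration of working with the bulk download mentioned in this section, the sketch below filters Toki Pona sentences out of tab-separated text. It assumes each row holds a sentence ID, a language code, and the sentence text, in that order, and that Toki Pona rows carry the code tok; check both assumptions against the actual export. The function name toki_pona_sentences is hypothetical.

```python
import csv
import io


def toki_pona_sentences(tsv_text, lang_code="tok"):
    """Yield (sentence_id, text) for rows whose language column equals lang_code.

    Assumed row layout (not confirmed by this article): id <TAB> lang <TAB> text.
    """
    # QUOTE_NONE keeps quotation marks inside sentences as literal characters.
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 3:
            continue  # skip rows that do not match the assumed layout
        sentence_id, lang, text = row[0], row[1], row[2]
        if lang == lang_code:
            yield int(sentence_id), text


# Small in-memory sample standing in for a downloaded file.
sample = "1\teng\tHello.\n2\ttok\ttoki!\n3\ttok\tmi moku e kili.\n"
print(list(toki_pona_sentences(sample)))
```

For a real export, the same function could be fed the contents of the downloaded file read as UTF-8 text; for very large files, passing the open file object to csv.reader directly would avoid loading everything into memory.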

External links

References

  1. (13 November 2010). "re #225. · Tatoeba/tatoeba2@243aecd". GitHub. Retrieved 22 March 2024.
  2. (31 January 2022). "Rename references to Toki Pona in database · Tatoeba/tatoeba2@26b38e3". GitHub. Retrieved 22 March 2024.
  3. "Number of sentences by language". Tatoeba. Retrieved 22 March 2024.
  4. "Tepan (Stephan Schneider)". Tatoeba. Retrieved 22 March 2024.
  5. "Download sentences". Tatoeba. Retrieved 22 March 2024.