Linku

From sona pona, the Toki Pona wiki
Revision as of 10:43, 10 September 2023 by Menasewi

ijo Linku is the collection of data and tools built around the Linku dictionary project.

Tools using Linku data

Linku provides its data set as a public JSON file, jasima Linku. The following tools use it:

  • lipu Linku, the main dictionary page
  • nimi.li, an interactive dictionary that extends the Linku data with additional words that are too obscure or unused to be considered for Linku yet.

Word usage surveys

The Linku team runs annual word usage surveys, both to keep the database up to date and to let users filter the dictionary by their preferred usage cutoff point. kala Asi discussed surveying words in a segment for suno pi toki pona 2023.

Usage categories

Based on those surveys, Linku has assigned words to a few broad categories since 2022. These are a more granular, more frequently updated replacement for the book presence categories.

In the following tables, a bold line represents the cutoff for the categories that are selected by default.

Numbers are rounded to the nearest percentage point (0.5% rounds to 1%).

2023
Category         Users (n = 868)
Core             [90%, 100%]
Widespread       [70%, 90%)
Common           [50%, 70%)
Uncommon         [20%, 50%)
Rare             [10%, 20%)
Obscure[a][b]    [2%, 10%)

2022
Category         Users (n = 345)
Core             [90%, 100%]
Widespread       [70%, 90%)
Common           [50%, 70%)
Uncommon         [20%, 50%)
Rare             [10%, 20%)
Obscure          [1%, 10%)

  a. In the 2023 results post, the obscure category is split into a high end [5%, 10%) and a low end [2%, 5%) purely for readability.
  b. New words below 2% usage are not considered notable enough for inclusion in the dictionary. Words below this threshold that are already included are planned to be moved into a separate sandbox resource. As of the publication of the 2023 results, this has not yet been done.
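
The interval notation in the tables can be read as a simple lookup: a square bracket includes the endpoint and a round bracket excludes it, so a word at exactly 70% is Widespread, not Common. The following is a minimal sketch of that lookup in Python, not Linku's own code. The names round_percent, usage_category, CATEGORIES_2023 and CATEGORIES_2022 are made up for illustration, and the page does not say whether categories are assigned from rounded or unrounded figures, so that choice is left to the caller.

```python
from decimal import Decimal, ROUND_HALF_UP

# Inclusive lower bounds in percentage points, copied from the tables above.
# These structures are illustrative only; they are not part of the Linku data.
CATEGORIES_2023 = [
    (90, "Core"),
    (70, "Widespread"),
    (50, "Common"),
    (20, "Uncommon"),
    (10, "Rare"),
    (2, "Obscure"),
]
# The 2022 table differs only in the lower bound of the Obscure category.
CATEGORIES_2022 = CATEGORIES_2023[:-1] + [(1, "Obscure")]


def round_percent(value):
    """Round to the nearest percentage point, with 0.5 rounding up."""
    return int(Decimal(str(value)).quantize(Decimal("1"), rounding=ROUND_HALF_UP))


def usage_category(percent, thresholds=CATEGORIES_2023):
    """Return the category name, or None if below the lowest cutoff."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    # Thresholds are ordered from highest to lowest, so the first match wins.
    for lower_bound, name in thresholds:
        if percent >= lower_bound:
            return name
    return None
```

Decimal with ROUND_HALF_UP is used because Python's built-in round() rounds halves to the nearest even number, so round(0.5) gives 0 rather than the 1 described above. Under the 2023 cutoffs, usage_category(1.9) returns None, while the same value with thresholds=CATEGORIES_2022 returns "Obscure".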

Survey results