User:.hecko/on nimi ku

From sona pona, the Toki Pona wiki

tl;dr jan sin o stop treating the damb book like gospel

nimi ku is not a checklist

i've seen a few people (jan sin, granted) proclaim that they use "all nimi ku", which i find very strange

nimi ku was never meant to be a coherent word set (see e.g. the inclusion of both san and tuli)

if you have 4 words that are each used by 1 in 10 people, then (assuming independence) all 4 of them will be used together by 1 in 10000 people, and 10000 is more people than there are tokiponists

of course the probabilities aren't truly independent, as people who use nimisin tend to use multiple at a time, but even adjusting for that i highly doubt there's anyone who uses all nimi ku for any reason other than that they're in a book
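a minimal sketch of the arithmetic above, under the same independence assumption. the function name `joint_usage` and the community size of 5000 are hypothetical, picked only for illustration; only the "4 words at 1 in 10 each" figure comes from the text

```python
from fractions import Fraction


def joint_usage(rates):
    """Chance that one person uses every word, assuming the rates are independent."""
    result = Fraction(1)
    for rate in rates:
        result *= rate
    return result


# Four hypothetical words, each used by 1 in 10 people (the figure from the text).
rates = [Fraction(1, 10)] * 4
p_all = joint_usage(rates)
print(p_all)  # 1/10000

# Hypothetical community size, NOT a real headcount of tokiponists.
community_size = 5000
print(float(p_all * community_size))  # 0.5 expected people using all four
```

since real usage is correlated, the true number would be higher than this, so treat the result as a lower bound rather than an estimate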

nimi ku lili is a low standard

out of the 40 nimi ku lili listed in the free version of ku, a whopping 18 are used by fewer than 10% of people

and given that the surveys usually had around 15 people each, that means many of those might've only been written once, by one person, in one survey
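to spell out that step: with around 15 respondents, "under 10%" leaves room for exactly one person. a rough sketch, assuming a word's share is counted within a single survey (a simplification); `max_count_below` is a made-up helper name

```python
def max_count_below(percent, respondents):
    """Largest whole number of people strictly under `percent` percent of
    `respondents`. Uses integer math to avoid floating-point rounding."""
    return max((percent * respondents - 1) // 100, 0)


# ~15 respondents per survey and the 10% cutoff are the figures from the text.
print(max_count_below(10, 15))  # 1

# One person out of 15 is already about 6.7%, two would be over the cutoff.
print(round(100 * 1 / 15, 1))  # 6.7
```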

pray tell, why would kalamARR (a word made up on the spot by one person during survey time) be "more legitimate" than jetuno (a word made up on the spot by me just now), other than the fact that people started using it after it was included in the book

[tom scott voice] if you accept the definition that a toki pona word is some letters written in a dictionary, then tsii, aAANUSEMEmailMahjong, and poak23087[i23ej&(^!(#!@&$_(HEQq0j[1] are all words

note that "all nimi ku lili are valid unless they break phonotactics" isn't much better; i've seen more proficient speakers use nja (2+) than pomotolo (0)

  1. canonically that should be AAAAAAAAAAAAAAAAA but i typed it here minutes before that message so not gonna let it ruin my meme format; also symbols aren't letters but see previous

ku is biased

this one's a three-parter, hold on to your trousers

community bias

the ku surveys were done entirely and solely in ma pona, which, while large, isn't representative of the entire tp community, so kokosila was included but konsi wasn't (ok to be fair a separate korean poll might've been too difficult but you get the point)

time bias

puwa was invented just a bit too late to be included in the surveys, but according to lipu Linku, as of 2022-08 it's twice as popular as e.g. kan

doesn't help that ma pona was particularly nimisinful at that time, which might've led to the em-ku·suli-ment of epiku

prompt bias

ok so here's how the surveys worked: every so often jan Sonja would post a google form with a few dozen english words/phrases, people would write them out in toki pona, rinse and repeat

most of the prompts came from a list of some thousand common words but some were injected manually by jan Sonja, and Gee I Wonder if including "speak another language in a Toki Pona only environment" as a prompt might add enough bias to scoot kokosila up to the 50% threshold (yes i'm gonna clown on kokosila all i want)

popularity ≠ goodness

a word being commonly used doesn't mean it's good: epiku is a mark of jan sin, oko is controversial, kokosila is kokosila (oh baby a triple), even the pu gender words are starting to become passé

what popularity does imply is understandability, but even that's kinda iffy due to the aforementioned biases

personally i like the lipu tenpo approach of allowing writers to use nimi ku suli, but then explaining them in footnotes using nimi pu taso (which does call into question the point of using non-pu words at all)