Lexicalization

From sona pona, the Toki Pona wiki

Lexicalization occurs when a phrase solidifies into a unit with a fixed meaning. An English example is "high school", which refers only to a type of secondary school; it cannot refer to a school that is physically high up. Even though some phrases are in danger of becoming lexicalized through common use, Toki Pona tries to avoid lexicalization for several reasons.

Philosophy

The goal of Toki Pona is to break complicated concepts down into their important aspects, from the speaker's own perspective. This is a dynamic process, as different features will be important at different times, in different contexts, and from different perspectives. The lack of lexicalization is by design.

The Toki Pona Dictionary confirms this with a "Warning Against Lexicalization!" in its "About the Dictionary" section:

[...] the whole point of Toki Pona is to meditate about what things mean to you personally, paying attention to the unique context around them, and to construct your own phrases using the building blocks provided by Toki Pona. Don’t think of the translations listed in this dictionary as the answers [...]

Toki Pona: The Language of Good presents a car as an example against lexicalization. To a passenger, a car might be tomo tawa ("moving room")[a]. To its driver, it might be ilo tawa ("going tool"). To a pedestrian that the car hit, it might be kiwen tawa ("hard moving thing") or kiwen utala ("hard hitting thing").[1] Beyond these examples, a parked car might not be tawa at all, but awen ("staying, unmoving"). Any phrase can refer to a car as long as there is appropriate context.

Avoid trying to find "the phrase" for whatever concept you're trying to express. Think about it deeply. What is important about it to you? What is important to mention?

Many concepts also come with cultural baggage that does not fit Toki Pona's perspective. Lexicalizing such a concept would either lose a lot of nuance or import meaning that depends on a language not spoken by all listeners. It would also defeat the insight that Toki Pona is meant to provide.

For example, friendship means different things in different cultures. But what if jan pona ("good person") were lexicalized and always meant "friend"? Even if you think dogs aren't jan, you would call "man's best friend" jan pona instead of soweli pona ("good animal"). You would also lose the insight that a bad friend, jan pona ike, is a contradiction. This is because jan pona would be read as a unit, without thinking about what the individual words mean.

Size constraints

Because Toki Pona's vocabulary is so small, there are only so many phrases of convenient length to go around. In other words, Toki Pona has limited space for lexicalized compounds.

Let's estimate that there are about 120 content words[b]. The number of 2-word phrases would then be about 120² = 14 400. That might sound like a lot, but many of these would be used up by single words from other languages, like tomo tawa for "car" or jan pona for "friend". English alone has over ten times as many words in current use, according to the Oxford English Dictionary.

If enough head–modifier phrases were reserved in this way, modifiers would become much less useful. For example, you could not translate "red ball" as sike loje, because that would refer to a fixed, more specific concept.

Considering the millions of concepts and phrases from all cultures, languages, fields, and subcultures with dedicated jargon, all 120³ = 1 728 000 of the 3-word phrases could conceivably be filled up as well. Modifiers and simple prepositional phrases would become nearly useless: you could not translate "big car" as tomo tawa suli, because it would mean something else, like "truck", instead.
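The counts in this section can be checked with a short Python sketch. It assumes the article's rough estimate of 120 content words and treats every ordered sequence of words as a distinct phrase, repeated words included; this is a simplification for illustration, not a claim about Toki Pona grammar. The names CONTENT_WORDS and phrase_count, and the four-word sample list, are made up for this example.

```python
from itertools import product

# The article's rough estimate of content words; the exact number varies.
CONTENT_WORDS = 120


def phrase_count(vocab_size: int, length: int) -> int:
    """Count ordered word sequences of a given length, repetition allowed."""
    return vocab_size ** length


# Sanity check by brute force on a tiny hypothetical vocabulary:
# enumerating every 2-word sequence should match the formula.
sample = ["tomo", "tawa", "jan", "pona"]
assert len(list(product(sample, repeat=2))) == phrase_count(len(sample), 2)

two_word = phrase_count(CONTENT_WORDS, 2)
three_word = phrase_count(CONTENT_WORDS, 3)

print(f"2-word phrases: {two_word:,}")    # 2-word phrases: 14,400
print(f"3-word phrases: {three_word:,}")  # 3-word phrases: 1,728,000
```

Each extra word multiplies the space by only 120, so even fairly long phrases form a finite pool that fixed meanings could exhaust.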

Learning

Every phrase lexicalized is another thing for everyone to memorize.

Much of Toki Pona's popularity and charm comes from its small lexicon. There are only 134 commonly accepted words as of 2022. Even if you include the 16 multi-word phrases in pu's Phrase Book[c], and a couple dozen other lexicalized phrases, this would all still be well under 200 lexemes to learn.

If Toki Pona were more eager to lexicalize, that count would almost certainly enter the thousands. This would make the language far more difficult to learn, while costing it the appeal of its simplicity. The phrases would also be quite arbitrarily assigned, creating even more rote memorization.

Saying "the term for 'car' is tomo tawa" asserts it as the only recognizable term, rather than one possibility. In natural languages, if you don't use established phrases like these, you will sound weird and unnatural. Calling mittens "handcoats", or the Sun "the spacelight", will raise eyebrows, even though these are accurate descriptions; coining your own description is not an accepted circumlocution tactic there. By avoiding lexicalization, Toki Pona has no register of "natural-sounding speech" beyond following its very few grammar rules.

Notes

  a. Ironically, tomo tawa has become semi-lexicalized anyway.
  b. The exact number of content words would vary.
  c. However, several of the Phrase Book entries can be interpreted as the literal sum of their words. Phrases such as ike a and mi olin e sina are completely transparent in their given meanings.

References

  1. Roc Morin (15 July 2015). "How to Say Everything in a Hundred-Word Language". The Atlantic.

    “What is a car?” Lang mused recently via phone from her home in Toronto.

    “You might say that a car is a space that's used for movement,” she proposed. “That would be tomo tawa. If you’re struck by a car though, it might be a hard object that’s hitting me. That’s kiwen utala.”