rayiner 19 hours ago

The article’s title is misleading: “The Man Who Created a Written Language for the Cherokee Did It So Efficiently and Elegantly, His Peers Thought It Was Magic.”

His peers thought it was magic because they were unfamiliar with the concept of writing, not because his writing system was so efficient. He was put on trial for witchcraft because people thought he was communicating via magic. https://education.nationalgeographic.org/resource/sequoyah-a....

  • Modified3019 19 hours ago

    For those just encountering this like me, the man in question was Sequoyah, a monolingual Cherokee. His own tribe put him on trial, being overseen by his Chief.

    Slightly different from what I’d normally assume had happened from just reading the above comment.

    Really impressive on his part: he basically saw it was possible, looked at some examples of what others had done, then got to work.

    • rayiner 18 hours ago

      The notion that Sequoyah was a monolingual Cherokee is dubious. He had a European father (though he was raised with his mother) and worked as a trader and served in the U.S. Army. His cousin, to whom he presented his syllabary, was also half European, “George Lowery.” He had extensive contact with Europeans. Moreover, his syllabary includes adaptations of Latin, Greek, and Cyrillic letters. Part of the story is that he copied some character shapes from his wife’s family’s Bible. (Presumably they could read English if they had a Bible.) He was obviously exposed to a variety of European writing. He completed his syllabary in 1821, many years after his military service. It seems highly unlikely that someone who was so linguistically gifted to be able to invent a syllabary would not have picked up some familiarity with spoken and written English through that exposure.

      This article does a good job of reviewing the conflicting narratives of his history: https://www.jstor.org/stable/26467045. It’s all very uncertain, and there’s a lot of mythology.

      • TedDoesntTalk 15 hours ago

        I found it interesting that you used the term European several times, but never once the term American. He served in the American military, lived in America, had an American father (according to the article).

        So you consider 19th century America to be Europe, or is there another reason for your choice of words?

        • eesmith 13 hours ago

          I find it interesting that you think "served in the U.S. Army" isn't American enough for you.

          Foreigners can serve in the US Army. Native Americans weren't automatically US citizens until 1924, but were considered citizens of their sovereign tribe.

          European here clearly means both "from Europe" (eg, Latin, Greek, and Cyrillic letters are European, not American), as well as "European Americans" (ie, Americans of European ancestry, and often with cultural ties to Europe.) Just like how "Asian" doesn't always mean "born in Asia", or how "Anglo" can refer to non-Hispanic white Americans rather than being specifically related to England.

          Trading with the Spanish in Florida, English ships, or French trappers would all count as "contact with Europeans", and not simply "Americans".

          Finally, recall that at the time "American" was a state of mind. A Loyalist at the time would not consider themselves "American", and a Patriot considered a Loyalist to be "inimical to the liberties of America". How do you know if Sequoyah’s father was an American or a Loyalist?

          • TedDoesntTalk 6 hours ago

            Loyalists were gone from America by the mid-1780s.

            • bcjdjsndon 5 hours ago

              He's owned you there Ted, made you look a right tool

            • karavelov 2 hours ago

              The article says he was born in the 1770s, so his missing father could have been a Loyalist.

        • simiones 7 hours ago

          "American" would be ambiguous in this context, right? Both the Cherokee and the English-speaking residents of the USA are American, but the specific point here was about whether Sequoyah was only a Cherokee speaker or if he had any knowledge of English, Spanish, French, Latin or any other European language. In this context, saying that Sequoyah's father was "American" would not make any point - it wouldn't tell anyone reading the comment that he was not Cherokee nor from any other native population; whereas European makes that point succinctly.

    • IIAOPSW 17 hours ago

      It's a real shame we don't have any transcripts or other court records from that hearing...for obvious reasons.

  • dang 15 hours ago

    Ok, we've changed the title using more representative language from the article.

    It's plenty interesting without superfluous claims!

    • rayiner 15 hours ago

      I didn’t mean to criticize the HN title—it accurately reflected the title on the linked page. I just thought the article’s choice of title was interesting given the rest of the story.

      • dang 15 hours ago

        On the contrary, your point was a great one and we want HN titles to be accurate! This is implicit in the ancient PG lore: "Please use the original title, unless it is misleading or linkbait" (https://news.ycombinator.com/newsguidelines.html).

        It's helpful when HN readers do the actual work of understanding for us because we can't read even a tiny fraction of what gets posted here (and my capacity for even that is declining monotonically). But we're always happy to swap a title when someone posts an apt observation.

        • mrandish 14 hours ago

          > we can't read even a tiny fraction of what gets posted here

          I'll bet it's exhausting but your note did make me ponder: If a soul was condemned to the eternal torment of reading nothing but all the user posts of one social media site for all eternity, HN would be a pretty excellent choice. I shudder to think of the alternatives.

  • onlypassingthru 13 hours ago

    There's a 1991 film (and earlier novel) called Black Robe that fictionalizes what it might've been like when the first Jesuit missionaries introduced this powerful black magic to the North American natives in the 17th century.[0]

    [0]https://www.youtube.com/watch?v=7cj_bSkuKVA

steve-atx-7600 16 hours ago

Not even an example of the glyphs??? Smithsonian must be another repository of clickbait like Forbes.

torben-friis 19 hours ago

>The syllabary was widely lauded, as its phonetic accuracy and simplicity made it far easier to grasp than English.

I mean, that feels like it's bound to happen when an alphabet is built to represent current language or pronunciation. English is notoriously awful for not doing that.

  • colechristensen 19 hours ago

    English is three* languages in a trenchcoat. All languages borrow, but English in particular is a cobbled-together mess. Like a sailors' pidgin language except, instead of sailors, driven by the ruling class of Britain at the boundary of several language families who kept conquering each other.

    *(or 7 or whatever number makes you feel best)

    • dataflow 19 hours ago

      Might be a mess linguistically, but it's sure nice to have only 26 letters with no accents on a keyboard.

      • colechristensen 19 hours ago

        >only 26 letters with no accents on a keyboard

        This was caused by the printing press and the typewriter (keyboard) both of which forced simplifications in the written English language.

        • lmm 17 hours ago

          And yet other languages have managed to resist those simplifications. So it's clearly not 100% forced.

          • colechristensen 3 hours ago

            Who says other languages haven't undergone significant changes?

        • ummonk 16 hours ago

          You just press backspace and hit the accent mark key or for a printing press stack the accent mark on top of the letter. People ditched accents because they were rarely used in English writing (only really being used for some loanwords), not because simplifications were forced by typewriters or the printing press (which handle non-English languages just fine).

          • colechristensen 14 hours ago

            For printing presses we're talking about the influence of the first printing presses, hundreds of years before industrialization, which were imported from Germany. Even when they started making their own in England, they were more like clones and used imported designs and parts. The early machines had a heavy influence on the written language, particularly at times when under 1 in 10 people could write, and with the advent of movable type the people who learned to write were heavily influenced by what they read... books printed on German-design machines. You really only need one generation in a situation like that to dramatically change the language, losing þ, æ, and ð.

      • pocksuppet 18 hours ago

        long s and thorn would like to have a word with you, but they can't because they were removed from the keyboard

        In Unicode, that's ſ and þ. Both historical English letters that are no longer used.
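        For anyone who wants to check the code points, here is a minimal Python sketch using only the standard `unicodedata` module. The helper name `describe` is made up for illustration.

```python
import unicodedata

def describe(ch: str) -> str:
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

# Long s and thorn, the two historical letters mentioned above.
for ch in ["ſ", "þ"]:
    print(ch, describe(ch))
```

        Both letters sit in ordinary Latin blocks, so typing them is a keyboard-layout problem rather than an encoding one.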

        • colechristensen 17 hours ago

          "Ye Olde Mill" or whatever archaic silliness you'll find at fairs and whatnot was the result of the printing press dropping þ (as in þe, þ is just th-) and was never supposed to be pronounced with a "y" sound.

          "Ye Olde" ye was not the same word as "Hear ye, hear ye!", that ye is a plural 'you' basically the same word as "y'all" and never had a thorn.

          • cguess 13 hours ago

            Just to expand on this:

            "ye" in "ye Olde mill" is actually just "the" but originally "þe"/"þee". The first printing presses to England were imported from Germany, which never used þ, so printers used something that looked sorta similar, thus "y".

            "Ye" was a different word, the 2nd person non-formal version of "you" (which was historically formal: see Shakespeare and how he played with "ye" and "you"). Thorn was on its way out along with "ð", both of which were in Middle English. The sounds didn't leave English, but we merged them into one letter cluster "th" (think "that" and "the", which have different th sounds).

          • tengwar2 5 hours ago

            This happened with more than one letter. For instance the Scots language had a letter yogh (https://en.wikipedia.org/wiki/Yogh), which was written somewhat like a rounded "3" but lower on the line. Early printers had only the characters of the English language, and since this character looked like a hand-written z, that is what they used in its place. Hence the name "Menzies" is pronounced "Ming-is", since that isn't actually a z.

            Welsh suffered more: it used to be full of "k"s. When the first Welsh Bible was printed, the English printer did not have enough "k"s, and substituted "c", and the language now does not use "k" at all. Apparently the printer's note on the matter still exists.

      • mootothemax 17 hours ago

        It’s great compression: Y sometimes a vowel, sometimes a consonant.

        And while not encoded on a keyboard, it still blows my mind that English has a crazy number of past tenses - and such a bad hack of a future tense that it’s hard to classify as such.

        Linguistics is fun. The accents are alright.

        • tengwar2 5 hours ago

          Or English has only two tenses (present and past perfect) and everything else is done with modifiers.

      • Dylan16807 15 hours ago

        The pronunciation is so bad though. The consonants are mostly fine, but the way we write vowels is a total mess. We'd need at least a dozen vowel letters to sanely represent English. And we could cut a couple consonant letters to help make room, for maybe 30 letters total, still no accents.

        • krapp 15 hours ago

          Come now. English can be understood well enough through tough thorough thought.

          • nneonneo 3 hours ago

            Just today the NYT Strands puzzle gave a great example: you can find one set of prefixes that make each of the following rhyme, and a different set of prefixes that make them all sound different:

            -ooze -oose -ews -ues -use -oes -uise

            You can do this purely with prefixes ending in consonants, i.e. not by turning -use into -ouse, for example.

          • nneonneo 3 hours ago

            (spoilers for the little -ooze puzzle: for rhymes, booze choose brews blues ruse shoes cruise; for non-rhymes, snooze loose pews plagues obtuse toes guise; many others are possible, and rhyming or lack thereof may depend on accent).

        • mcswell 4 hours ago

          Blame that on Latin, which had only five vowels (not counting the long and short vowels separately).

    • yellowapple 16 hours ago

      Good languages borrow, great languages steal?

      • colechristensen 15 hours ago

        More like the repressed underclasses who kept getting conquered by foreign powers didn't overthrow their new masters but assimilated them and part of their language instead. Many times. Romans, early German-ish people, early Norwegians, early French, early French who had been conquered by early Norwegians, etc. (historical sticklers give me a break, it's two sentences not a doctoral thesis)

    • ianburrell 16 hours ago

      English is a West Germanic language with vocabulary from other languages, primarily French and Latin. But most of the core words are Germanic. It is not a pidgin whose defining feature is simplified grammar.

      • mcswell 4 hours ago

        English has sometimes been called a creole, i.e. what was once a pidgin language but has since been spoken by several generations of native speakers. One thing it lost some time around the Norman Conquest was the case marking phonology (apart from some pronouns).

        • zhengyi13 4 hours ago

          I think your point about the loss of case generally stands, but surely the genitive isn't lost?

  • Animats 19 hours ago

    There's an International Phonetic Alphabet for transcribing speech literally.[1] Automation is now available. Languages to IPA, IPA to various languages, text to speech, speech to text, evaluation of pronunciation.

    [1] https://easypronunciation.com/en/english-phonetic-transcript...

    • alex0015 19 hours ago

      The IPA still relies on convention to transcribe sounds. There's plenty of academic papers out there describing lesser studied languages and, if those conventions don't yet exist, the papers often contradict each other.

      A writing system that used strict phonetic transcription for everything would be unusably bad. Everyone pronounces words differently than the writing system prescribes, in every language. Words are shortened and blended together constantly in connected speech.

      • retroflexzy 18 hours ago

        > A writing system that used strict phonetic transcription for everything would be unusably bad.

        This is, for better or worse, what is being done to incorporate aboriginal names into things like streets and bridges in places like Vancouver.

        - [stal̕əw̓asəm Bridge](https://en.wikipedia.org/wiki/Stal%CC%95%C9%99w%CC%93as%C9%9...) - [šxʷməθkʷəy̓əmasəm Street](https://vancouver.ca/news-calendar/musqueamview-street-signs...)

        I see the practicalities of adopting this IPA-lite form, but it's a struggle to use, even though I've previously been trained in IPA.

        • alex0015 16 hours ago

          That's not quite what I meant by unusably bad, though that does have its own set of challenges for sure. I was just in Toronto for the first time and appreciated the designers of the Ojibwe Latin alphabet for pulling it off without diacritics.

          What's happening with your example is just that the symbols chosen for the phonemic transcription are non-Latin so they're unfamiliar to read aloud and harder to type for non-speakers. What I meant was if we all wrote with all of our individual idiosyncrasies of speech without converging on a prescribed standard (a writing system separate from speech transcription).

          "Amnu ge sum'm frum upsterz, gimmi u sek" but even more so, with IPA characters for all the 40-odd individual sounds of my dialect of English - then you write your response in the same level of phonetic detail. Exactly what a writing system shouldn't do.

        • mcswell 4 hours ago

          I'm not the person you're responding to, but I think what he meant when he said that a "strict phonetic transcription" would be bad is phonetic vs. phonemic. Most writing systems (apart from things like Chinese) represent (some of) the phonemes of the language, not the phones (not phonetic). For example, in English we have two kinds of p-sounds: one is found in words like 'pill', the other in words like 'spill'. We write them both the same, because which sound the letter should take is determined by the environment: after an /s/, it's pronounced without a puff of air, elsewhere (or mostly elsewhere) it's pronounced followed by a puff of air. It's actually hard for most native speakers of English to tell the difference, although speakers of languages like Thai, where the two sounds can appear in the same environment and can be used to distinguish different words, can hear the difference just fine.

          Bottom line: writing systems that are easy for native speakers to use, usually represent the phonemes of the language, not each phone.

  • reissbaker 19 hours ago

    Fun fact: all (non-Cherokee?) alphabets in use today stem from an ancient Canaanite alphabet called the proto-Sinaitic script [1]. This is why Hebrew's alphabet near-perfectly phonetically represents the spoken language: Hebrew is just a dialect of Canaanite, and all Canaanite dialects are mutually intelligible, and alphabets were invented to represent spoken Canaanite. As the alphabet was cribbed by the Greeks (who were taught a simplified version by seafaring Canaanites — the Phoenicians — and termed it the "Phoenician alphabet" [2] despite the Phoenicians not specifically inventing it), significant alterations had to be made and it's been an imperfect match for most Western languages ever since.

    1: https://en.wikipedia.org/wiki/Proto-Sinaitic_script

    2: https://en.wikipedia.org/wiki/Phoenician_alphabet

    • nvader 19 hours ago

      At least one counter-example: https://en.wikipedia.org/wiki/Hangul is technically an alphabet, and is non-Canaanite derived.

      • amluto 19 hours ago

        It's not quite in the same category, but there's also Zhuyin Fuhao:

        https://en.wikipedia.org/wiki/Bopomofo

        • komali2 15 hours ago

          I think the idea is that since the inventors of bopomofo were exposed to other alphabets, it's still considered a descendant alphabet. I usually think of a descendant as something that visibly manifests its ancestry, so for example modern traditional characters look somewhat like the earliest Chinese characters, or all Romance languages share some sounds or even words. So maybe we need a different way to describe things like wheels and alphabets.

      • reissbaker 19 hours ago

        It wasn't directly cribbed (unlike Western alphabets), but given that Hangul was invented in the 1400s after exposure to Western alphabets, most scholars still consider alphabets to have only been invented once [1] and then copied, much like the wheel. Although I suppose that's true of Cherokee too!

        1: https://en.wikipedia.org/wiki/History_of_the_alphabet

    • rayiner 19 hours ago

      Egyptian hieroglyphics already had alphabetic elements, and the Canaanites borrowed those: https://en.wikipedia.org/wiki/Egyptian_hieroglyphs (“Egyptian hieroglyphs are the ultimate ancestor of the Phoenician alphabet, the first widely adopted phonetic writing system”).

      • reissbaker 19 hours ago

        Egyptian hieroglyphs were not an alphabet, even if they had alphabetic elements (in addition to pictographic ones). Scholars generally agree that proto-Sinaitic was the first alphabet, and all subsequent alphabets used today are either direct descendants or directly inspired by it. https://en.wikipedia.org/wiki/History_of_the_alphabet

        • ummonk 17 hours ago

          Proto-Sinaitic was an abjad, not an alphabet.

          • reissbaker 12 hours ago

            As per the Wikipedia links, it's generally considered by scholars to be the origin of all alphabets and an early alphabetic script. Abjad is a term invented in 1990 to distinguish early alphabetic scripts without vowels from later scripts with them. Effectively every scholar agrees that Canaanite/Aramaic/Hebrew/Arabic are alphabetic systems (while also acknowledging them as abjads).

    • fnordpiglet 19 hours ago

      My understanding is it’s the earliest known alphabet but not the ancestor to all alphabetic languages, as there are Asian and other alphabetic languages that are not derived from western or Arabic alphabets. Specifically, Greek and Latin alphabets and their descendants are based on it. Japanese Hiragana and Katakana, by contrast, are syllabic alphabets derived from kanji (and Chinese pictograms) as a simplification of the pictographic language and not derived from proto-Sinaitic. Others are possibly linked, like Thai, Khmer, etc. through an Aramaic -> Brahmi -> Pallava -> Khmer linkage, but the Brahmi link is not fully established to be true.

    • andsoitis 18 hours ago

      Technically, the proto-Sinaitic script is an abjad, with the Greek alphabet being the first true alphabet (symbols for both consonants and vowels).

      Proto-Sinaitic/Phoenician can be described as the “first alphabetic system,” Greek the “first true alphabet.”

      Fun fact: Greek is the world’s oldest recorded living language.

      The Greek alphabet has been in use for approximately 2,800 years; previously, Greek was recorded in writing systems such as Linear B and the Cypriot syllabary.

      • applicative 17 hours ago

        Canaanite and its abjad have been in continuous use, in various versions, for more than 2,800 years. It's true there's no Linear B.

    • austin-cheney 18 hours ago

      Another counter-example is Phags Pa Script.

      https://en.wikipedia.org/wiki/%CA%BCPhags-pa_script

      • buildsjets 17 hours ago

        Explain. The wiki you linked to specifically states that it is descended from Tibetan script, which is in turn descended from Proto-Sinaitic script.

        • austin-cheney 16 hours ago

          The article said nothing like that. It was an original script invented by a Tibetan monk at the paid direction of Kublai Khan.

          > Descending from Tibetan script, it is part of the Brahmic family of scripts, which includes Devanagari and scripts used throughout Southeast Asia and Central Asia.[5] It is unique among Brahmic scripts in that it is written from top to bottom,[5] as how classical Chinese used to be written, and as the Mongolian alphabet or later Manchu alphabet is still written.

          https://en.wikipedia.org/wiki/%CA%BCPhags-pa_script

          > The origin of the script is still much debated, with most scholars stating that Brahmi was derived from or at least influenced by one or more contemporary Semitic scripts. Some scholars favour the idea of an indigenous origin,[19] or connection to the much older and as yet undeciphered Indus script[20][21] but the evidence is insufficient at best.

          https://en.wikipedia.org/wiki/Brahmi_script

          So maybe, but probably not, and this particular script, though it has roots elsewhere of debated origin, was an original spontaneous creation.

          • cwnyth 15 hours ago

            "but probably not": Actually, probably so. The scholars who favor the indigenous explanation are a small minority outside of India. It's possible it was independent, but very, very doubtful, and none can explain the enormous gap in time between the Indus script and later Brahmic scripts.

            • austin-cheney 14 hours ago

              What does it matter if some scholars are from outside of India? All I am seeing are conclusions from unstated assumptions that appear to be drawn from a bias.

              My conclusions are coming directly from the Wikipedia articles that I linked to. If I am that wrong then edit the Wikipedia articles.

              • reissbaker 12 hours ago

                The Wikipedia articles say the majority of scholars believe it's based on Aramaic, while a minority of people (primarily non-linguistic-specialists in India) disagree. I think you're the one drawing from bias.

    • ummonk 17 hours ago

      "This is why Hebrew's alphabet near-perfectly phonetically represents the spoken language" - nonsense. That's just because modern Hebrew is based on the written language and thus reflects spelling pronunciation rather than historical pronunciation.

      Also, proto-Sinaitic is not an alphabet. That's why Persian writing became harder to read when they switched from the nearly alphabetic Old Persian cuneiform to Aramaic abjad descended from proto-Sinaitic.

      • dang 15 hours ago

        > nonsense

        Can you please make your substantive points without directing pejoratives at the other? This is covered in the site guidelines (https://news.ycombinator.com/newsguidelines.html):

        "When disagreeing, please reply to the argument instead of calling names. 'That is idiotic; 1 + 1 is 2, not 3' can be shortened to '1 + 1 is 2, not 3.'"

        Your comment would be just fine (indeed, excellent) without that bit.

      • reissbaker 12 hours ago

        No, modern Hebrew and ancient Hebrew mapped similarly well to the written script — the primary difference between the two is just consonant drift. Both used the same structure of triconsonant roots with affixed patterns, and modern Hebrew morphology is identical to ancient Hebrew (phonemes changed primarily due to consonant drift, but not its structure). Arabic, for example, is similar and similarly well-mapped to its script, as are other Semitic languages that are closely related to ancient Canaanite.

    • QuiDortDine 16 hours ago

      "and all Canaanite dialects are mutually intelligible": That is the definition of a dialect.

      Also, I don't know how you can claim Hebrew is phonetically represented by its alphabet rather than the other way around, as a revived language the pronunciations are largely a matter of convention based on Yiddish. It would be more accurate to say that modern Hebrew uses an ancient writing system, which happens to be closely related to the ancestor of modern European alphabets.

      See https://en.wikipedia.org/wiki/Revival_of_the_Hebrew_language

      • yellowapple 16 hours ago

        > That is the definition of a dialect.

        I dunno, some English dialects don't seem particularly intelligible to me, and I'm a natively fluent speaker of it.

        • QuiDortDine 16 hours ago

          This is like speciation but for languages: there's no "ah-ha!" moment, but we know a lemur can't produce viable offspring with a zebra. Likewise we know Italian isn't French even though some words are kinda similar. If you want to be technical about it, it's a spectrum: I understand British people and people from the American deep South, but it's far from certain they will understand each other. Hard to be precise with social sciences.

          That said, two people who understand each other are, by any reasonable definition, speaking dialects of the same tongue (if not, obviously, the very same dialect).

      • reissbaker 12 hours ago

        Hebrew is not based on Yiddish, lol; only Ashkenazi Hebrew pronunciation was influenced by Yiddish. Modern Israeli Hebrew uses primarily Sephardi pronunciation, and Ashkenazi is mocked (i.e. Shabbat is Sephardi, Shabbos is Ashkenazi; modern Israeli Hebrew uses Shabbat). I grew up around Ashkenazi pronunciation in America, and had to unlearn it when I spent time in Israel. Nonetheless, Yemenite, Sephardi, and Ashkenazi Hebrew — the three major extant pronunciations, only one of which was ever influenced by Yiddish (Ashkenazi) — are all extremely similar and mutually intelligible, and thus all of them are extremely well mapped to the alphabet. Yemenite is most likely closest to the original spoken language, specifically the ע, but there are very few differences. And a modern Hebrew speaker can easily understand Biblical Hebrew — they're closer than even Modern English and Shakespearean.

        Also, not all colloquial dialects are mutually intelligible. Different Chinese dialects are still often referred to as "dialects," despite not being mutually intelligible (e.g. Cantonese vs Mandarin). While that's typically mostly the case for Western languages, there's a spectrum even there.

        • simiones 7 hours ago

          > And a modern Hebrew speaker can easily understand Biblical Hebrew — they're closer than even Modern English and Shakespearean.

          Of course, because modern Hebrew was constructed based on (the modern understanding of) Biblical Hebrew around the 1920s or slightly earlier, whereas Modern English naturally evolved for ~400 years from Shakespearean English and other forms of English.

          • rafram 5 hours ago

            That’s simply incorrect. Most of the innovations in Modern Hebrew (relative to Biblical Hebrew) came in the Mishnaic period, early CE. Hebrew continued to be used as a liturgical language, and occasionally a business language, both in its Biblical and Mishnaic forms, until the 1880s (not 1920s), when the Zionist movement brought it back into use for casual speech. The Hebrew used in the Mishnah is quite close to the modern written language, though it lacks modern words and some very recent innovations like topic-first sentences.

      • rafram 5 hours ago

        No, there is no linguistic definition of a dialect. It’s a purely political term. Hindi and Urdu are “languages” despite being nearly identical in their spoken forms; Moroccan Arabic is a “dialect” even though Lebanese Arabic speakers can’t understand it; Galician and Portuguese are separate “languages,” with a mysteriously precise dividing line right at the Portuguese border!

        Linguists sidestep the whole thing by using the term “language variety.”

    • rafram 7 hours ago

      > This is why Hebrew's alphabet near-perfectly phonetically represents the spoken language

      It most certainly does not!

      I think you could more or less accurately make that claim about Standard Arabic, which has preserved a distinct sound for each letter and only rarely does things that you wouldn’t expect (tanween…).

      Modern Hebrew, by contrast, has merged many consonant sounds without merging their letters (sin and samekh, tav and tet), dropped the ayin sound and left the letter as a pseudo-vowel, and decoupled long vowel sounds from their long vowel carrier letters to the point that they’re essentially arbitrary (for each letter, you can find an example of it representing every single vowel sound).

      To your main point, though, the main commonality between Semitic scripts and western Latin/Greek-derived scripts is the rough order of the letters and some of the shapes. Latin alphabet isn’t an abjad, it has lots of letters that have no equivalent in Semitic… and it actually represents many languages very faithfully! English is an outlier. So I am not convinced by your argument.

    • kkkqkqkqkqlqlql 6 hours ago

      > This is why Hebrew's alphabet near-perfectly phonetically represents the spoken language

      Wasn't Hebrew dead for like 2000 years or something until the Israeli state was set up? Not hard to have a faithful alphabet when your spoken language is frozen in time. Hell, even evolving languages, like Spanish, can have somewhat phonetically accurate alphabets. As said in the other comment, English is more of an exception in that regard.

CPLX 20 hours ago
  • paleotrope 19 hours ago

    Amazing "By 1825, the majority of Cherokees could read and write in their newly developed orthography.[5]". It even has a reference so it must be true.

    • paleotrope 19 hours ago

      Anyway I put in a request to get a copy at my local library so I will update here in a few months when I have a copy of the book.

    • mcswell 4 hours ago

      Around the same time, Christian missionaries introduced writing (using an adapted Latin alphabet) to Hawai`i. Within ten years nearly the entire population (I would guess with the exception of older people) was literate. Mark Twain remarked on Hawai`ian literacy a few decades later.

  • tjmc 16 hours ago

    Thank you. A big omission from the original article.

HoldOnAMinute 17 hours ago

Now you have me wondering what is theoretically the most compact and efficient language, without using compression

  • sometimelurker 17 hours ago

    and now this reminds me of Kolmogorov complexity

  • Wowfunhappy 17 hours ago

    I feel like you're going to run up against the definitions of "efficient" and "compression".

    For example, a language with a larger alphabet will be able to express more in fewer characters. Is that more efficient?

    Similarly, you could think of each word as a sort of lookup table for information in the mind of the reader. We don't define words as we're writing, we expect the speaker to know them already. If a language has more words, each word is more precise, and fewer words can be used to express an idea—but is that efficiency? You're just relying on the reader having more preexisting knowledge.

    • mcswell 4 hours ago

      > a language with a larger alphabet will be able to express more in fewer characters.

      True, although it's not really the alphabet that determines this, it's the number of phonemes (distinctive sounds) in the language. For example, writing /s/ (the sound) sometimes with 's' and sometimes with 'c' does nothing to shorten words in English or Spanish.

      But in general, languages with fewer phonemes tend to have longer words (and tone languages often have very short words---in a sense, they have more phonemes than non-tone languages). Morphology (particularly compounding) often obscures this.

  • zhoBEENG 17 hours ago

    Claude Shannon talks about this in A Mathematical Theory of Communication. He defines redundancy as one minus relative entropy, where relative entropy is the ratio of the language's actual average uncertainty per symbol to the maximum possible uncertainty if all alphabet symbols were completely random and equally likely.

    He gives some rather cute examples, like the language of Finnegans Wake by Joyce being very low redundancy (high efficiency in your words). He also argues that crossword puzzles depend on redundancy: too much and large puzzles become impossible, around 50% redundancy is where large 2-d puzzles are just possible, and at 33% redundancy 3-d puzzles should be possible. This has always been one of my favorite and in my mind most random corollaries in a paper.

    https://people.math.harvard.edu/~ctm/home/text/others/shanno...
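
    To make the definition concrete, here is a rough Python sketch of redundancy as one minus relative entropy. It is a simplification and not Shannon's own procedure: it looks only at single-symbol frequencies and ignores dependence between neighbouring letters, so it will understate the redundancy of real text. The function name `redundancy` and the default alphabet of 27 symbols (26 letters plus space) are assumptions chosen for illustration.

```python
import math
import string
from collections import Counter

def redundancy(text: str, alphabet_size: int = 27) -> float:
    """Estimate 1 - H / H_max from single-symbol frequencies only.

    alphabet_size=27 assumes 26 lowercase letters plus space (illustrative).
    """
    allowed = set(string.ascii_lowercase + " ")
    symbols = [c for c in text.lower() if c in allowed]
    if not symbols:
        raise ValueError("text contains no symbols from the assumed alphabet")
    total = len(symbols)
    counts = Counter(symbols)
    # Actual average uncertainty per symbol, in bits.
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    # Maximum uncertainty: every symbol of the alphabet equally likely.
    max_entropy = math.log2(alphabet_size)
    return 1 - entropy / max_entropy

print(redundancy("the quick brown fox jumps over the lazy dog"))
```

    A string that uses every symbol equally often scores 0, and a string that repeats one symbol scores 1. Getting near the figures quoted for English would need longer-range statistics than this sketch uses.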