points by janalsncm 2 years ago

Pretty cool, a js implementation of n-grams!

What is amazing to me is this: imagine that English only had 10,000 words, and that each of those words has 100 valid subsequent words. That gives 1 million valid bigrams. Trigrams take you to 100 million, and 4-grams to 10 billion. Just to store those, you’d need about 14 bits per word and gigabytes of storage.
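A rough sketch of that arithmetic, in case anyone wants to play with the numbers. The 10,000-word vocabulary and 100 successors per word are the toy assumptions from this comment; the fixed-width packing (each word stored as the smallest whole number of bits that can index the vocabulary) and the function names are my own assumptions, not how any real n-gram store works.

```python
VOCAB_SIZE = 10_000  # toy vocabulary size (assumption from the comment)
BRANCHING = 100      # assumed number of valid next words per word


def count_ngrams(n, vocab_size=VOCAB_SIZE, branching=BRANCHING):
    """Valid n-grams: vocab_size choices for the first word,
    then `branching` choices for each of the remaining n - 1 words."""
    return vocab_size * branching ** (n - 1)


def bits_per_word(vocab_size=VOCAB_SIZE):
    """Smallest fixed width (in bits) that can index every word."""
    return (vocab_size - 1).bit_length()


def table_bytes(n, vocab_size=VOCAB_SIZE, branching=BRANCHING):
    """Bytes needed to list every valid n-gram, n packed word ids each."""
    total_bits = count_ngrams(n, vocab_size, branching) * n * bits_per_word(vocab_size)
    return (total_bits + 7) // 8  # round up to whole bytes


for n in (2, 3, 4):
    print(n, count_ngrams(n), table_bytes(n))
```

With these assumptions a word id takes 14 bits, and just listing the 4-grams (no counts or probabilities attached) already comes to about 70 GB.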

LLMs typically have context windows of hundreds if not thousands of tokens. (Back in my day GPT-2 had a context window of 1024 and we called that an LLM. And we liked it.) So it’s kind of amazing that a model that fits on a flash drive can make reasonable next-token predictions across the whole internet, with a context size that can hold a whole book.
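To put the context-window comparison in numbers, here is a small sketch that extends the same toy model (10,000 words, 100 valid successors each) to an n-gram as long as a 1024-token context. Treating tokens as words from that toy vocabulary is a simplifying assumption, and `ngram_count_digits` is just an illustrative helper.

```python
def ngram_count_digits(n, vocab_size=10_000, branching=100):
    """Number of decimal digits in the count of valid n-grams
    under the toy model: vocab_size * branching ** (n - 1)."""
    count = vocab_size * branching ** (n - 1)  # exact, Python ints are arbitrary precision
    return len(str(count))


# An explicit lookup table covering a 1024-word context would need
# this many digits just to write down its number of entries.
print(ngram_count_digits(1024))
```

Under these assumptions the count is 10^2050, so no explicit table is possible and the model has to generalize instead of memorize.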

10000truths 2 years ago

That's because the vast majority of grammatically correct n-grams are not actually useful in practical language. "Colorless green ideas sleep furiously" is valid English, but it is semantic nonsense.
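A tiny illustration of how sparse real usage is compared with the space of possible word pairs. The sentence below is a made-up toy corpus, not real data, and the helper name is hypothetical.

```python
from collections import Counter


def bigrams(tokens):
    """Return the list of adjacent word pairs in a token sequence."""
    return list(zip(tokens, tokens[1:]))


# Made-up toy corpus, purely for illustration.
corpus = "the cat sat on the mat and the cat ate the fish".split()

counts = Counter(bigrams(corpus))
vocab = set(corpus)

observed = len(counts)        # distinct bigrams that actually occur
possible = len(vocab) ** 2    # every ordered pair of vocabulary words

print(f"{observed} of {possible} possible bigrams observed")
```

Even in this toy case only a small fraction of the possible pairs ever show up, and the gap widens quickly as n grows.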

cchance 2 years ago

I mean ... technically Gemini 1.5 has a 10 million token context, which just pushes the insanity further lol