The article seems quite editorialized, shifting between describing "large-scale AI models" and "neural network-based approaches".
The underlying paper itself is more precise, comparing against LUAR, a 2021 method based on BERT-style embeddings (i.e. a model with 82M parameters, which is 0.2% of the size of e.g. the recent open-source Gemma models). I don't fault the authors of the paper at all for this; their method is interesting and more interpretable! But you can check the publication history; their paper was originally uploaded in 2024: https://arxiv.org/abs/2403.08462
A good example of why some folks are bearish on journals.
"AI bad" seems to sell in some circles, and while there are many level-headed criticisms to be made of current AI fads, I don't think this qualifies.
Are you prepared to demonstrate a superior result with models newer than those available when the research was done? Can you suggest a candidate experiment design to test your hypothesis?
I don't see it. Seems even-keeled for the most part. Not a polemic.
"Researchers found that a relatively simple, linguistically grounded method can perform as well as - and in some cases better than - complex artificial intelligence systems in identifying authorship.
The study suggests that increasingly sophisticated AI is not always necessary for high-performing writing analysis, particularly when methods are designed around established principles of how language works."
I might be misinterpreting but the LUAR model (which is a transformer) seems to do decently well
https://www.nature.com/articles/s41599-025-06340-3/figures/2
Yes, the paper itself tells a different story than the bullet points in this article.
If there's one problem that LLMs have solved, it's language. While an LLM may hallucinate, it does so in grammatically correct English sentences. Additionally, even the local version of gemma-4-26B can seamlessly switch between languages in the midst of a conversation while maintaining context. That's perhaps the most exciting part for me: We have a bona fide universal translator (that's Star Trek territory) and people seem more focused on its factual accuracy.
Tbh, the accuracy of translation, while much better than prior methods, is not that great yet. For Tamil at least.
Kind of a tangent I guess, but the coolest thing about Star Trek’s universal translator to me was that it could translate new languages mid-conversation with an extremely small amount of data. Makes me wonder how close we might be able to get to that eventually.
https://en.wikipedia.org/wiki/Darmok
TL;DR, probably never.
I don't know. I think the translator functioned perfectly. It accurately provided a translation of all of the information that the alien intended to communicate. The translator can't translate things that are not intended to be communicated via spoken language.
Like...it doesn't work on cats. It doesn't translate smiles.
If "nafnfowmfowl grtakkan ssshelpik" means "Temba, his arms wide" then the translation was correct. The fact that the crew mostly failed to understand the communication is on them, not the translator.
This can also be seen in Data. Just because he can speak all sorts of languages and understand a majority of communications, he doesn't really know Riker's love of Troi because his translation matrix isn't designed for that in the same way that the ship's isn't designed to interpret intent.
You don't even need to go to Darmok to see this limitation, really. The fact that the translator provides human translations of Klingon speech is not enough to understand their cultural intricacies. The same could be said of "pon farr" being a "time of mating". That knowledge alone could not have prepared Kirk for the trial or even hinted at there being a trial.
The translator is neither mind reader nor cultural liaison. If it were, we wouldn't have needed Troi.
Language is not just about grammatically correct sentences; it’s about expression, intent, and communication that goes beyond the spoken, written, or even motioned word. Not one of these things is in the realm of possibility for current (and dare I say even future) AI.
Your Star Trek comparison is also incorrect. Following your logic, we’ve had a “bona fide universal translator” for a while now with websites like Google Translate (and so on). But none of these websites or AI tools are capable of learning languages on the fly purely from context and with minimal input data (that’s the magic of Trek’s UT, what they call linguacode).
No, AIs have not “solved” language in any way.
For the most widespread languages. There are thousands it still fails badly on.
I wonder if this approach can be used to determine whether a text was generated by a specific LLM.
Take a guess which llm provided this response to your comment:
wait ,, imagine actually clocking the LLM like that ??? lowkey big brain energy tbh . no cap that would eat frfr bro is literally cooking,,
I'd be curious to see them try to find Satoshi Nakamoto with back-to-basics and see if they do better than the guy in the NYT last week https://www.nytimes.com/2026/04/08/business/bitcoin-satoshi-...
It should be obvious that LLMs would be able to beat this with ease. Not sure why this paper deliberately skipped comparing to current LLMs.
Example of LLMs doing well in similar tasks: https://arxiv.org/abs/2602.16800
Using LLMs for everything is going to be seen as a big fad in a few years. First we try them for everything, then we find what use cases actually make sense, then we scale back. Woe betide our 401(k)s when it happens, though.
This is a concise statement of what I've tried to articulate by analogizing it to railroad infra buildout.
What applications do you think make the most sense so far?
Search, code review, some form of translation...
The paper did not compare against LLMs though.
> Woe betide our 401(k)s when it happens, though.
The stock market crashes once in a while. Shit happens. The long-term outlook is unlikely to change nearly as much, unless you think there will be systemic macroeconomic changes.
Long-term relative to the lifespan of the 401(k) holder. The outcome changes a lot for those who are ready to retire.
I just cannot wait for the “bubble bursting” moment, so to speak. It’s tiresome to be force fed this AI bullshit all the damn time, knowing full well it is not going to last.
Agreed .. your two comments cancel each other out? https://news.ycombinator.com/item?id=47787688
Ha! To think that we're finally back to asking ourselves why we are using generative models for categorization and extraction. I wonder how much money has collectively been wasted by companies whittling away at square pegs.
> why we are using generative models for categorization and extraction
Because LLMs have already amortized the man-years of cost of collecting, curating and training on text corpuses?
Yeah, LLMs are a solution to the cold start problem, and they are easy to integrate. If you know what you're doing in terms of evals, post-processing and so on, you can get excellent performance out of them. They can also do semantic classification and reasoning that you won't get out of some bespoke traditional DS/ML model.
They amortized the creation of corpuses with trainable features, not the myriad of methods that can categorize text with a success rate at the levels required by high-stakes industries.