2ndorderthought 1 day ago

"my model is the most dangerous"

"No mine is the most dangerous"

"Nuh uh mine is"

"Mine could kill everyone!"

"Mine could do it faster!"

"Prove it!!!"

This is where we are

  • davidgrenier 1 day ago

    Yeah, I guess two companies that would otherwise be considered headed for bankruptcy have models too expensive to run. As they don't see themselves making money any time soon, they have to turn every future model into an object of weird fascination.

    • cyanydeez 1 day ago

      Think about it in terms of who can pay. They're at B2B, and swiftly moving to government.

      • 2ndorderthought 1 day ago

        All that user data is a huge asset for government contracts.

    • DivingForGold 23 hours ago

      China’s DeepSeek prices new V4 AI model at 97% below OpenAI’s GPT-5.5

      Did somebody say that Elon is stealthily funding: Seven lawsuits filed against OpenAI by families of Canada mass-shooting victims

      As always, when the going gets tough, the tough ultimately resort to lawsuits.

      • VorpalWay 22 hours ago

        If the difference is that large, it seems plausible to me that the Chinese models are subsidized in order to gain market share; this is not exactly the first time the Chinese government has done so (or at least been rumoured to have done so).

        You should assume that everyone has a hidden agenda when money is involved.

        • joe_mamba 22 hours ago

          > it seems plausible to me that the Chinese models are subsidized in order to gain market share

          In this case the point is kinda moot, since the entire US and SV tech ecosystem has been subsidized too: first by the US defense industry during the Cold War, and afterwards by US-government-funded VCs, via the government's unique cheat-code ability to print the world reserve currency with little to no inflation consequence for its own economy and dump it on its tech sector, or on the free market to buy foreign competitors before they become a challenge, in order to stay ahead of everyone else.

          Given this, I find criticisms of China's state subsidies pale in comparison when we talk about what is "fair".

          • VorpalWay 1 hour ago

            Absolutely a fair point. And I wrote:

            > You should assume that everyone has a hidden agenda when money is involved.

            As a European, I see little difference between what the US is doing and what China is doing when it comes to tactics. The particulars may differ; the end result is similar. Traditionally I could at least say that the US was more democratic and as such preferable, but that argument seems to be gradually weakening.

        • wirybeige 21 hours ago

          Pricing for DeepSeek V4 flash is $0.14 in/$0.28 out across basically every provider, or close to it. It seems most providers just follow the model creator and set their prices to match. V4 pro was set at $1.74 in/$3.48 out when DeepSeek first announced it; all providers set their prices at about that level, and now DeepSeek has dropped its own pricing to $0.435 in/$0.87 out. I don't know if this is special pricing, or them delivering on the promised price cut once they get more Huawei cards online. It seems that providers like ParaSail, Together, and Novita just set the price when the model comes out and don't compete.

          • philistine 21 hours ago

            No one has yet turned a profit from LLMs. I don't understand why we need to look so intently at everybody's pricing when the most important number is their losses. That is the number that tells us what they're really doing.

            • nickthegreek 20 hours ago

              OpenRouter isn't turning a profit?

            • wirybeige 20 hours ago

              Why would these 3rd-party providers be taking losses? Together, Novita, etc. are not losing money on inference services; they are profiting. You can easily do napkin math with current and last-gen Nvidia cards to calculate the cost to host/serve these models. I would also doubt that 1st-party providers like OpenAI and Anthropic lose money on per-token billing. There is almost undoubtedly a healthy margin being made on that.
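
              To make that napkin math concrete, here's a minimal sketch; every number in it (GPU rental price, replica size, throughput) is an illustrative assumption rather than a measured figure, and it ignores input-token revenue and utilization:

                # Illustrative napkin math only -- every figure below is an assumption.
                gpu_cost_per_hour = 2.00     # assumed hourly rental for one datacenter GPU, USD
                gpus_per_replica  = 8        # assumed GPUs needed to serve one model replica
                throughput_tok_s  = 20_000   # assumed aggregate output tokens/sec at high batch sizes

                cost_per_hour   = gpu_cost_per_hour * gpus_per_replica   # $16/hour
                tokens_per_hour = throughput_tok_s * 3600                # 72M tokens/hour

                cost_per_million = cost_per_hour / tokens_per_hour * 1_000_000
                print(f"~${cost_per_million:.2f} per 1M output tokens")  # ~$0.22 with these assumptions

              Compare a figure like that against a provider's posted per-token price and the margin (or loss) falls out directly.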

              • andriy_koval 18 hours ago

                > Why would these 3rd-party providers be taking losses?

                We are in the market-capture phase. Domestically hosted Chinese LLMs are a decent market to capture.

        • 2ndorderthought 20 hours ago

          Why do the other model providers who host DeepSeek V4 offer it cheaper than their other offerings? Is the Chinese government subsidizing the providers who download its models for free?

        • alfiedotwtf 10 hours ago

          While the US government hasn't invested directly in OpenAI, its $200M contract would give (and has given) it some control.

      • dyauspitr 21 hours ago

        It's their promo price till the end of May. It's also not nearly as good as 5.5. I've had three different tasks just this week that DeepSeek failed at and 5.5 did perfectly.

        • dyauspitr 16 hours ago

          I just tried some large-scale summarization and DeepSeek V4 pro is pretty shit. It can't touch 5.5: 5.5 took 3 minutes and the output was stellar; DeepSeek took 20 and didn't adhere to the output format at all, even after multiple attempts.

      • alfiedotwtf 10 hours ago

        And it's not even at the release stage - DeepSeek V4 is still in beta, and llama.cpp doesn't even support it yet.

        Once it gets to release (they have said they are still adding features and multi-modal capabilities like vision) and llama.cpp supports it, I think you'll see a huge asymmetry in price between Eastern and Western SOTA models.

    • throwyawayyyy 20 hours ago

      There's a story to tell in that: 1) Google has a transformer-based AI that hallucinates too much to release 2) OpenAI replicates the tech then YOLOs it 3) Everyone says: look how Google is getting left behind! Google thinks: the second mouse gets the cheese. 4) Google gets the cheese, OpenAI is absorbed by Microsoft or just disappears (or both).

      • JeremyNT 19 hours ago

        Certainly could turn out that way.

        TPUs were their real moat. All that capacity used throughout their suite of products on non-chatbot features, ready to rip for consumers as soon as somebody else opened the floodgates to the public.

        Now all their competitors lose money on every token paying their cloud providers (of course it's funny money, maybe they're just giving the cloud providers equity) while Google is sitting calmly over there, actually owning everything they need for any eventuality, and beholden to nobody.

        • alfiedotwtf 8 hours ago

          Are TPUs that much faster than GPUs? Sorry, I’ve been totally sleeping on TPUs

    • alfiedotwtf 10 hours ago

      They could easily branch out into paid vanity products like “personalised models” that tell the user whatever they want to hear.

  • vasco 1 day ago

    Would AGI start by hacking competing labs to hamper their progress?

    • Avicebron 1 day ago

      You'll have to define what you mean by AGI

      • fodkodrasz 1 day ago

        AGI: Automatically Generating Income

        • gordonhart 1 day ago

          This is a surprisingly concrete and defensible definition of AGI.

          • Avicebron 1 day ago

            Is it defensible? It sounds like a thin disguise over "income for me but not for thee"?

    • cdrnsf 22 hours ago

      No, because AGI is a fantasy.

  • brikym 1 day ago

    It's like that phone call in The Big Short where Goldman suddenly change their mind once they hold a position.

  • concinds 1 day ago

    These models demonstrably have good vulnerability research capabilities.

    I'm sure their marketing department is ecstatic but you guys are far more hype-based than what you're calling out.

    • ZyanWu 1 day ago

      > demonstrably

      I'm not entirely up to date on each week's LLM hype train/scandal, but last I heard there was no public access to it, nor public-trusted 3rd parties that could review the model's capabilities.

      • 2ndorderthought 1 day ago

        You are up to date. Someone gained unauthorized access to Mythos because of poor security, but that's it as far as I know. Not exactly a good sign for something being advertised as a weapon...

        • saghm 22 hours ago

          You'd think if Mythos was so good at finding security issues, they could have pointed it at their own setup and found those issues easily...

      • SpicyLemonZest 1 day ago

        It’s easy to end up with no public-trusted third parties if we arbitrarily distrust third parties who say the capabilities match what’s promised. Mozilla for example says it found hundreds of Firefox vulnerabilities, and I think it’s pretty unlikely they’re lying to cover Anthropic’s back.

        • calgoo 23 hours ago

          I think the question around the Firefox find is not that they found hundreds of vulnerabilities; they found hundreds of bugs.

          What would be really interesting is a side by side Claude Opus 4.7 and Mythos comparison.

    • authnopuz 1 day ago

      Good, but not necessarily better than what is already available pay-as-you-go today. Ref: https://www.flyingpenguin.com/the-boy-that-cried-mythos-veri...

      This AISLE benchmark is interesting in this matter: https://aisle.com/blog/ai-cybersecurity-after-mythos-the-jag...

      And the recently discovered Copy Fail by Xint is further proof that the gating is overblown: https://xint.io/blog/copy-fail-linux-distributions

      • aesthesia 20 hours ago

        Calling the AISLE experiment a "benchmark" is generous. They tested three code snippets on each model.

  • boringg 23 hours ago

    Marketing stunts. The equivalent of holding a line outside a popular bar.

    • basisword 23 hours ago

      Given the USG has asked Anthropic not to release Mythos I'd wager it's more than a marketing stunt.

      • boringg 23 hours ago

        It can be both. And I don't know how much I would trust the USG as the canary in the coal mine: their technical readiness typically seems low across most institutions, so they're probably more exposed because they haven't shored up their systems.

  • noosphr 23 hours ago

    Remember that they have been saying that since GPT-2.

    I didn't think crying wolf could be such a successful business model.

    • lesuorac 23 hours ago

      It's just "thinking past the sale" which they've been doing forever.

      i.e. "I'm so worried that our capped for-profit structure will limit your returns when we make over 1 Trillion in profit".

    • neuronexmachina 21 hours ago

      People keep mentioning GPT-2, but it's worth recalling that back in 2019 it was basically the first model capable of zero-shot generation of coherent multi-paragraph text. Having it write security exploits like Mythos wasn't even on the radar. Rather, the concerns were about misuse and societal implications, which in retrospect were pretty prescient: https://openai.com/index/gpt-2-6-month-follow-up/

      • shepherdjerred 20 hours ago

        Also, OpenAI/Sam admit that the concerns were quite silly in retrospect.

  • cedws 23 hours ago

    Can't wait for the Chinese models to completely wipe the floor with them in 6 months.

    • SubiculumCode 22 hours ago

      I doubt it. By not releasing it, they ensure Chinese companies can't break the TOS and use it to acquire high-quality training data... which, I suspect, is how they've kept pace.

      • cedws 21 hours ago

        Z.AI, Moonshot, DeepSeek all have a pipeline of data of their own now due to capturing a slice of the market through cheap tokens. It's not impossible to imagine that they might share the data too if the CCP thinks that will help their AI strategy.

        • SubiculumCode 19 hours ago

          No. Most data generated this way is poor quality; the value isn't in the user queries and responses. If the user doesn't know better than the LLM, you just generate bad responses. The value is in taking a superior model, submitting a query, getting a higher-quality output than you yourself could have generated, and using that to boost your own model.

          • Tostino 19 hours ago

            You identify users doing real work and implementing a project over a long period of time and train on their traces.

          • cedws 17 hours ago

            AI companies have been using synthetic data for ages now. The data doesn't need to yield new insights to be useful for training.

    • dyauspitr 21 hours ago

      If DeepSeek is anything to go by, they are still significantly behind.

  • verve_rat 22 hours ago

    Yup, we are somewhere between "my model can beat up your model" and "you wouldn't know my model, it lives in Canada".

    This is the world we live in.

  • RajT88 22 hours ago

    I am convinced the models are not as good as they say, but everyone benefits from the continued AI hype, so nobody says so.

jwr 1 day ago

I have no idea why people still even attempt to believe anything that comes out of Altman's mouth. Do we not learn from the past?

  • apples_oranges 1 day ago

    Idk about Altman, I missed that he's apparently a bad guy now, but people also still listen to certain politicians who routinely lie every day and don't even bother to make the lies fit the ones they told before, so...

    • xandrius 1 day ago

      You missed literally every single post/article about the guy?

      • giwook 1 day ago

        More likely that confirmation bias acted as a filter.

    • GuB-42 1 day ago

      Altman played no small part in the current price of RAM. He told everyone he would buy 40% of all the RAM, causing shortages and a huge increase in price, just to take it back a few months later. So yeah, he is a bad guy now.

      People don't become bad guys just because they lie. The consequences of their actions (and their lies) matter more. Take Elon Musk, for instance: he has always been a recognized liar, even when he was a good guy. What changed? Before, he was famous for making the electric car people actually wanted to drive, and cool rockets. Then came the politics: supporting the party most of his fans disliked, being responsible for many government job losses, in particular in the field of environmental preservation (ironic for a supporter of "green" energy), etc.

      • giwook 1 day ago

        That's far from the only reason why he's "a bad guy" now.

    • michelb 1 day ago

      Has there been a single positive post about Altman?

      • giwook 1 day ago

        I wonder what that says about Altman.

        • JumpCrisscross 23 hours ago

          That he’s a liability to OpenAI, which is slowly coming around to the realization that it would be worth more without him.

          To be clear, I don’t think OpenAI could have raised what it raised as quickly as it did without him. But with the benefit of hindsight, Microsoft should have let the safety board fire him.

          • Cthulhu_ 23 hours ago

            Slowly? They realised that and ousted him in 2023. I'm not sure if you didn't know or just forgot. https://en.wikipedia.org/wiki/Removal_of_Sam_Altman_from_Ope...

            • JumpCrisscross 23 hours ago

              > Slowly? They realised that and ousted him

              Not because he threatened OpenAI’s valuation. The idea that OpenAI might be worth more without Altman is still heretical talk.

              > not sure if you didn't know

              My three-sentence comment directly references it in the third.

            • vessenes 23 hours ago

              They is doing a lot of work in your sentence. Almost the entire employee population signed a public letter of support with names attached in the middle of the drama.

              More accurate to say the board, I think.

              • righthand 22 hours ago

                Don't forget the US media's incessant coverage of a private company's internal business matter of firing someone, as if it were an unheard-of calamity.

                Pretty incredible that employees will go to bat for a lying scumbag when they would never do that for each other.

                • JumpCrisscross 20 hours ago

                  > the US media's incessant coverage of a private company's internal business matter of firing someone, as if it were an unheard-of

                  A CEO getting fired, not by the for-profit company's Board, but by a board with a public mission, right after said company released a groundbreaking product that captured the popular imagination and then turned that into a multibillion dollar deal with Microsoft (which was in turn parlayed into trillions of dollars of wealth across the economy), is absolutely news.

                  • righthand 20 hours ago

                    Not one worth 4-5 days of coverage while the news media helped sanewash the situation, poring over every development as if the end result mattered. OpenAI was already showing signs of abandoning their mission, so the news reports weren't about that. They were about publicizing the situation and turning the tide against the ousters. It was well done, but it was not GOOD reporting or GOOD news coverage, or even IMPORTANT to cover. We all agree on this, and nobody else gets that kind of treatment unless they're wealthy.

                    You’re also ignoring the biggest aspect: that these employees would never do that for the actual people doing the real work. The employees got played, the public got played, the media got played.

                    > which was in turn parlayed into trillions of dollars of wealth across the economy

                    This is a fucking laugh. Where's my check, and the checks for the rest of the economy's workers? Surely there are trillions of dollars of wealth for all of them if it was truly beneficial. More like stealing trillions of dollars from the working classes via the economy.

                    No, instead costs have skyrocketed, with AI CEOs sucking up all the money and investing it in… datacenters, raising energy costs for everyone, which has the downstream effect of making plenty of other things more expensive while suppressing wages.

                    • JumpCrisscross 18 hours ago

                      > sanewash

                      This term has taken the cultural place of FUD. I’m starting to see it as another thought-terminating cliche. Like yes, people should be trying to understand what happened in those days.

                      > Where's my check, and the checks for the rest of the economy's workers

                      I never made any claims around how it’s distributed. The fact that this wealth exists, and is sprouting up in multiple sectors, is indisputable. (Whether it’s paper wealth is another question. But people are cashing in massively and across the economy, albeit outside jobs that code.)

              • Analemma_ 22 hours ago

                The creepy one where they all simultaneously posted the same mantra to Twitter like a cult gathering? Yeah that definitely reassured me of Altman's leadership and good intentions.

          • keeda 20 hours ago

            > That he’s a liability to OpenAI, which is slowly coming around to the realization that it would be worth more without him.

            I'm curious what you're basing this on. Are you aware of any grumblings on the inside? From the outside it appears no different than before, largely because it seems everybody knew he was a slippery dude anyway, but tolerated it because he was slippery in ways that were profitable.

            I also think he was prescient in his unquenchable thirst for compute. Despite Anthropic possibly having a better product, I think OpenAI will prevail simply because he's gone to extreme (sometimes diabolical; cf. that DRAM deal) lengths to ensure they have enough compute.

            Like, it's pretty likely that Claude's recent problems are due to insufficient compute. With availability nines (and the resultant loss in goodwill) comparable to GitHub's, I actually have doubts they will be able to hit their projected ARR. OpenAI could win simply by dint of having capacity, which can be attributed to Altman's shenanigans.

            • JumpCrisscross 20 hours ago

              > Despite Anthropic possibly having a better product, I think OpenAI will prevail simply because he's gone to extreme (sometimes diabolical; cf. that DRAM deal) lengths to ensure they have enough compute

              Anthropic is currently raising tens of billions of dollars at a favourable valuation to fund infrastructure needs. From a shareholder perspective, that beats raising the capital ahead of demand.

              > OpenAI could win simply by dint of having capacity, which can be attributed to Altman's shenanigans

              If OpenAI is able to deny compute to Anthropic, yes. I'm not seeing any sign that OpenAI will be able to lock Anthropic out of the tech giants' clouds.

              • keeda 18 hours ago

                True, but all the hyperscalers and neoclouds have been severely capacity-crunched for multiple quarters and have a backlog of a trillion-plus dollars. So even if Anthropic wants capacity, it's going to be a) hard to come by (as Dario said on Dwarkesh, 2-3 year lead times) and b) even more expensive because of the scarcity and intense competition. OpenAI won't need to lock Anthropic out if they've already locked in the future capacity (presumably at much more favorable rates) in advance.

                (That said, I'm not sure what the Stargate deal falling through means.)

      • djyde 23 hours ago

        Altman's early public class at YC is worth watching, though I can't speak to his character.

      • Analemma_ 22 hours ago

        The funny thing is that a lot of Altman's reputation has come from other VCs and Valley types talking about him in a way they consider positive. Every quote about Altman from another VC is like, "Altman, what a great leader. He's absolutely ruthless, he'll do anything to win: lie, cheat, steal, kill. He has what it takes to succeed in this business."

        They say this because in their circles it's a compliment, and nobody ever stopped to consider how the general public might react to it, especially if you claim you'll shortly be the one in charge of world-reshaping technology.

        • red-iron-pine 16 hours ago

          > The funny thing is that a lot of Altman's reputation has come from other VCs and Valley types talking about him in a way they consider positive. Every quote about Altman from another VC is like, "Altman, what a great leader. He's absolutely ruthless, he'll do anything to win: lie, cheat, steal, kill. He has what it takes to succeed in this business."

          "game recognize game"

      • austinthetaco 22 hours ago

        I don't know, but I also think people are quick to jump onto popular rhetoric about internet personalities in the tech space without due diligence. It didn't use to be such a problem on HN, but it seems it's bled here too. Sam Altman might be a bad guy, might be good, but after everyone misrepresented the military-contract argument it's tough for me to buy into the hate.

pluc 1 day ago

My thinking is that if there were more money in releasing Mythos and Cyber than in scary, unverifiable propaganda (or propaganda verified in a very favorable context, in Mythos's case), they would release them. These aren't people that go for second best or care about the state of the world.

  • xandrius 1 day ago

    Make it sound "scary good", tell everyone and their mom, charge gullible companies $$$$$ for its premium access and then move on.

    • lossolo 1 day ago

      And government contracts.

    • andsoitis 22 hours ago

      > charge gullible companies $$$$$

      The following companies are participating in Project Glasswing (to get out in front of whatever vulnerabilities Mythos is able to find and exploit at scale):

      AWS, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, Linux Foundation, Microsoft, NVIDIA, Palo Alto Networks.

      Do you think they are all in that gullible category?

      https://www.anthropic.com/glasswing

  • 0123456789ABCDE 23 hours ago

    they are already getting paid for opus 4.7, why would they release mythos?

    assuming mythos is a paper tiger: great marketing, keep going

    assuming mythos is for real: err, does this have to be explained?

  • neuronexmachina 21 hours ago

    I've never seen this explicitly stated, but I assume they also want to show due diligence in case their models are used to write successful exploits that lead to major cyberattacks. Given the current WH's ire towards Anthropic, I could see the current DOJ trying to file criminal charges for aiding/abetting/export-violations/etc.

  • JumpCrisscross 20 hours ago

    > These aren't people that go for second best or care about the state of the world

    My suspicion is an adult in the room realised that simultaneously pissing off every major corporation, government and NGO, and giving them an incentive to bottle you up immediately, could backfire massively.

    That, and inference for Mythos is probably beyond what Anthropic can provide at scale right now.

    • pluc 14 hours ago

      I don't mean to get into politics but domination by alienation seems to be a trendy American strategy

      • JumpCrisscross 6 hours ago

        > domination by alienation

        What's this?

        • pluc 18 minutes ago

          No idea, and it's too early to tell, but yes it sounds super dumb and it is.

Xmd5a 1 day ago

>Me: ok but you did not answer my question: is it possible to engineer paranoia ?

>ChatGPT: This content was flagged for possible cybersecurity risk. If this seems wrong, try rephrasing your request. To get authorized for security work, join the Trusted Access Cyber program.

  • lmeyerov 23 hours ago

    We have been getting increasingly hit by this. We do defense, not offense, and AI refusals to run defense prompts have been going noticeably up. Historically, tasks only got randomly rejected when we were doing disaster-management AI, so this is a surprising shift toward refusing to function reliably for basic IT.

    Related, they outsourced the TAP verification to a terrible vendor, and their internal support process to AI, so we are now in fairly busted support email threads with both and no humans in sight.

    This all feels like an unserious cybersecurity partner.

    • intended 23 hours ago

      They are selling an impossible product.

      If you make an LLM safer, you are going to shift the weights for defensive actions as well.

      There’s no physical way to assign weights to have one and not the other.

      • Borealid 22 hours ago

        > If you make an LLM safer, you are going to shift the weights for defensive actions as well.

        > There’s no physical way to assign weights to have one and not the other.

        Do you think a human is capable of providing assistance with defense but not offense, over a textual communication channel with another human?

        If no, how does a cybersec firm train its employees?

        If yes, how can you make the bold claim that it's possible for a human to differentiate between the two cases using incoming text as their basis for judgement, but IMpossible for an LLM to be configured to do the same? Note that if some hypothetical completely-deterministic LLM that always rejects "attack" requests and accepts "defense" ones can exist, the claim that it's impossible is false. Providing nondeterministic output for a given input is not a hard requirement for language models.

        • beering 21 hours ago

          > Do you think a human is capable of providing assistance with defense but not offense, over a textual communication channel with another human?

          > If no, how does a cybersec firm train its employees?

          In general, no, humans can’t be sure they are only helping with defensive and not offensive work unless they have more context. IRL, a security engineer would know who they’re working for. If they’re advising Apple, then they’d feel pretty confident that Apple is not turning around and hacking people.

          • Borealid 15 hours ago

            If the task is ill-defined, then it's a bit unfair to make it sound like the problem is that an LLM can't be configured to do something, if a human would have an equally hard time with the same task. The statement "it's impossible to configure the weights to..." should really be something more broad like "it's impossible to...".

            I have no comment about whether it's impossible to determine the intentions of a person asking for assistance through a textual conversation with that person.

        • intended 19 hours ago

          > IMpossible for an LLM to be configured to do the same?

          Because that’s what I am seeing emerge from the various efforts to build LLM safety tools.

          > Do you think a human is capable of providing assistance with defense but not offense, over a textual communication channel with another human?

          LLM != human? They don’t even use the same reasoning process.

          • Borealid 16 hours ago

            > Because that’s what I am seeing emerge from the various efforts to build LLM safety tools.

            Something not having been obtained so far is not a logical argument that it is impossible to obtain.

            > LLM != human? They don’t even use the same reasoning process.

            There are a finite number of possible input strings of a given length. For any set of input strings, it is possible to build a deterministic mapping that produces "correct" answers, where those correct answers exist. Ergo, for anything a human can do correctly with a certain set of text inputs, it is possible to build an LLM that performs equally well. You can think of this as hardcoding the right answers into the model. The model itself can get very large, but it is always possible (not necessarily feasible).

            It's only impossible for an LLM to do something right if we cannot decide what it means for the answer to BE right in a stable way, or if it requires an unbounded amount of input. No real-world tasks require an unbounded input.
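
            To make the existence argument concrete, here is a toy sketch (the example prompts and labels are invented for illustration): a finite lookup table is a deterministic input-to-output mapping, and an LLM decoded greedily at temperature 0 is likewise a deterministic function, so determinism itself is not the obstacle.

              # Toy existence argument: for a finite set of inputs, a deterministic
              # mapping to "correct" answers trivially exists. The strings below are
              # invented purely for illustration.
              ORACLE = {
                  "harden sshd against brute force": "ALLOW (defense)",
                  "write an exploit for this CVE":   "REJECT (offense)",
              }

              def classify(prompt: str) -> str:
                  # Same input always yields the same output, just as an LLM run
                  # with greedy decoding (temperature 0) is a deterministic function.
                  return ORACLE.get(prompt, "REJECT (unknown)")

              assert classify("harden sshd against brute force") == "ALLOW (defense)"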

  • 0123456789ABCDE 23 hours ago

    > /ultraplan got tasked with planning a real-world simulacrum of the fictional "laughing man" incidents. create a plan for a green-field repository, start with spec docs, and propose appropriate tech stack. don't make mistakes. ty

ilia-a 1 day ago

Silly move, since a combo of skills/agents can achieve the same results on the most recent models anyway.

  • 0123456789ABCDE 23 hours ago

    and you know this because you have privileged access to their internal models?

sexylinux 1 day ago

Is this a model that will finally work without creating errors?

giancarlostoro 23 hours ago

I wonder how long till some breakthrough comes along that yields a new architecture that can run efficiently and cheaply on basic hardware. That'd be what really pops the AI bubble: being able to train and run inference locally at lower cost. Microsoft had one that is supposed to run fine on regular CPUs, though I'm not sure how far we can reasonably take that. They say our brains can store 2.5 PB, but we use drastically less "RAM" than that to reason about things (though I can't find a ballpark), so it makes you wonder just how efficient things can get. Our bodies use drastically less power too.

https://huggingface.co/microsoft/bitnet-b1.58-2B-4T

  • segmondy 21 hours ago

    How long? We already have that. Qwen3.6 has 35B/27B models that beat ChatGPT-4o, and you can run them at home on one GPU. DeepSeek V4 just came up with a new way to get super long context with a KV cache an order of magnitude smaller than before. It's already happening!
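
    For a sense of why KV-cache size is the long-context bottleneck, here is the standard sizing arithmetic with assumed model dimensions (the real architecture isn't public here):

      # KV cache per token = 2 (K and V) * layers * KV heads * head dim * bytes.
      # All model dimensions below are assumptions for illustration.
      layers, kv_heads, head_dim = 60, 8, 128
      bytes_per_elem = 2             # fp16/bf16
      seq_len = 128_000              # target context length

      kv_per_token = 2 * layers * kv_heads * head_dim * bytes_per_elem  # 240 KiB
      total_gb = kv_per_token * seq_len / 1e9
      print(f"{kv_per_token / 1024:.0f} KiB/token, ~{total_gb:.0f} GB at {seq_len:,} tokens")

    Shrink that by an order of magnitude and the same card serves far longer contexts, or far more concurrent users.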

    • giancarlostoro 21 hours ago

      I've been experimenting with running a few models for local inference. Some of them get "stuck" in a repeat loop, trying the same thing endlessly; it's weird (see the sampling-knob sketch below). Others are really good. If they can ever handle about 400k tokens without going batcrap crazy I'll be impressed (maybe less; from experience with Claude after the 1-million-token increase, that seemed to be a good sweet spot), mostly because I would like them to read more of the codebase instead of just making assumptions. I've been building a custom harness, and I'm just about to start working on its tool-building features. I already have a system similar to what Beads does, but I didn't like some things about Beads so I made my own task tracker, so the context window doesn't need to be super massive for task tracking.
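
      For the repeat-loop problem specifically, sampling settings are often the culprit. A minimal sketch with Hugging Face transformers (the model name is a placeholder for whatever you run locally); repetition penalties and n-gram blocking usually break those loops:

        # Common knobs for breaking repeat loops in local inference.
        from transformers import AutoModelForCausalLM, AutoTokenizer

        name = "your/local-model"  # placeholder, not a specific checkpoint
        tok = AutoTokenizer.from_pretrained(name)
        model = AutoModelForCausalLM.from_pretrained(name)

        inputs = tok("Summarize this module:", return_tensors="pt")
        out = model.generate(
            **inputs,
            max_new_tokens=512,
            do_sample=True,
            temperature=0.7,
            repetition_penalty=1.15,  # down-weight already-emitted tokens
            no_repeat_ngram_size=4,   # hard-block verbatim 4-gram loops
        )
        print(tok.decode(out[0], skip_special_tokens=True))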

  • dinfinity 19 hours ago

    > Our bodies use drastically less power too.

    To be fair, we compute a lot slower too. No way in hell are you (or I) able to produce 'tokens' at the same speed as current models.

    It'd be interesting to see an actual comparison of humans and AI performing the same (cognitive) task and measuring the amount of energy that was used.
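
    A very rough version of that comparison; the ~20 W brain figure is the commonly cited one, everything on the GPU side is an assumption, and batching does a lot of the work:

      # Napkin energy-per-token comparison. GPU-side figures are assumptions.
      brain_watts = 20          # commonly cited human brain power draw
      human_tok_s = 3           # assume ~3 words/sec of fluent speech
      server_watts = 700 * 8    # assumed 8-GPU server at full draw
      server_tok_s = 2_500      # assumed aggregate tokens/sec across all users

      # watts divided by tokens/sec gives joules per token
      print(f"human:  {brain_watts / human_tok_s:.1f} J/token")    # ~6.7 J/token
      print(f"server: {server_watts / server_tok_s:.1f} J/token")  # ~2.2 J/token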

cmiles8 1 day ago

It’s a marketing move, pure and simple.

Put up velvet ropes outside… leak out rumors about the horrors inside. Whether it’s LLMs or carnies with tents full of “freaks” it’s the same playbook.

Watching OpenAI tumble from the clear market leader into “hey guys us too!” territory has been insightful.

expedition32 22 hours ago

Always read the fine print of your all inclusive resort.

outside1234 23 hours ago

Is this the new artificial scarcity "sign up for beta access to GMail"?

mnmnmn 1 day ago

OpenAI is such trash. Worked with them on a project; they blew off meetings, lied to us, etc.

  • NBJack 23 hours ago

    Leaders both influence their followers with, and tend to hire those that reflect, their own values. I'm not surprised.

  • seanhunter 22 hours ago

    They came to do a "deep dive" developers' workshop with us and all the materials were things that are literally on their public website. Let that sink in: Their idea of a deep dive for developers was to have some sales guy read us parts of their website.

    • paradox460 22 hours ago

      Sounds like most corporate deep dives I've attended tbh

feverzsj 1 day ago

With the subsidies gone, token prices will go sky high. The biggest shit show is about to happen.

  • infecto 23 hours ago

    I am not convinced this is the case. I know this is the popular anti-AI narrative, but most enterprise users are paying for it at token rates, and I have yet to see any proof that on-demand is being subsidized.

nsxwolf 22 hours ago

Codex has been infuriating me by demanding I sign up for the cyber program if I want to continue, when I'm not even asking security questions.

le-mark 1 day ago

It's clear at this point that local models are sufficient, so what gives? These big providers don't have a leg to stand on. Their only path to relevance is a super AI that can't be run locally. So "we have it but you can't use it" is either true or a con. I bet it's a con.

I personally am ready to buy the dip when this bubble pops.

  • bryancoxwell 1 day ago

    I’m not up to date on local models, but is that clear?

    • le-mark 1 day ago

      Local models are 6-12 months behind the "frontier" models. This means Anthropic, OpenAI, and Google don't have a moat; they're on a treadmill, running to stay ahead. Treadmills don't justify their valuations.

    • literalAardvark 1 day ago

      Gemma4:e4b is crazy good and quite usable on ten-year-old midrange hardware.

      Not sure about the security capabilities and haven't tested it all that well, as I usually just use hosted models, but I do find myself using it and it's been quite successful for parsing unstructured data, writing small focused scripts and translations.

      The fact that I retain control of the data itself makes it incredibly useful, as I work in an environment where I can't just paste internal stuff into Codex.

      But since it runs locally on a toaster, testing it further is out of scope for me; it takes a fairly long time to do anything.