I really dislike these AI middleman plans. The value-add that Microsoft brings to GitHub Copilot is near zero compared to buying directly from Anthropic or OpenAI, which is where 99% of the value comes from. I don't understand why anyone would want to deal with Microsoft as a vendor if they don't have to. The short period of discounted usage was always the obvious rug pull.
I was accounting for that in the 1% of value. I don't see a ton of value in this for development: you end up always using the smartest model, maybe tuning subagents down to a slightly dumber but much faster model. You really only need one subscription to the provider of the smartest model, with maybe 30 minutes of setup time to switch over if SOTA ever swings back to OpenAI.
Bingo. GitHub Copilot is mostly for organizations that have an existing Azure bill and would rather see that go up than get a new vendor bill. Professional middlemen.
If you've ever had to be part of the frankly batshit insane procurement process that some organizations force you to run the gauntlet of, this becomes a very obvious and appealing option.
I would also add that the models they supply through Azure Foundry are covered under my employer's existing customer agreement, by which MS is not allowed to train models on our data (which might include IP of the company or its clients). For organizations worried about that, it's nice & cozy.
Ah, the AWS Marketplace procurement model, where products mostly exist so that you can line item things through Amazon rather than going through a lengthy procurement process
Not surprised to see this is common. At my company, basically everyone and their mother is using Claude Code via Bedrock, despite us having company-wide Windsurf, Copilot, and ChatGPT Enterprise accounts.
That sounds different. The parent is saying they use that because no new billing has to be negotiated or set up, but in your case everything is already set up and people have access; they just chose to use something else?
Indeed. The use case is like this: I'm a Devops/Platform/SRE/Infra/WhateverYouCallAWSAdminInYourOrg at BigCorp and end users are asking me to use software XYZ. It's on the AWS Marketplace. I have two choices. I could either
1. Go through a 1-2 month procurement process where I have to deal with not only the vendor's sales team but also probably multiple teams in my BigCorp. The vendor's sales team wants to feel relevant, so I'm sitting in at least one meeting where I'm telling them I just want to buy your shit, make it as fast as possible. But the people in my BigCorp likely not only don't understand why the software is necessary, but need to feel relevant too, and as such will make me fight through bureaucratic hurdles. I have to get compliance involved. Finance involved. If there's a procurement team, I have to get them involved. Probably there's a security questionnaire that my BigCorp's security team uses. I have to send that to the vendor's sales people. They have to send it to their security folks. Security folks on their end have to complete it and send it back. I have to send approvals up the chain on my end, after I've successfully convinced some clueless nontechnical user why software XYZ is important and no, the shit half-baked thing we already have doesn't work.
OR alternatively:
2. I can go to the AWS Marketplace, click a button, and now my AWS bill goes up X thousands of dollars per month, and none of the bullshit from 1 is required, because AWS is already an approved vendor. Everyone is happy, except perhaps someone monitoring the AWS bill for large increases (well, maybe the security team cares too, but hopefully they aren't tattling on you to the procurement people, who have nothing to do and want to stick their fingers in the process), and I just need to tell that person that we are doing it.
It's not always the exact narrative I just laid out, but the gist of it is pretty much procurement at every bigcorp.
Because if you're a VS Code user, up until a couple of days ago you could hammer Opus 4.6 all day, every day, and pay nowhere close to the Claude Max plan price. Many people exploited this, and the subsidy is closing.
Yes, I loved my $10 a month personal subscription for light coding tasks; it worked great. I'd use Claude Code Max for heavy lifting, but the $10 a month Copilot plan kept me off Cursor for the IDE-centric things.
Well, they charge per prompt, but with usage limits it's a mix of tokens and prompts: if the prompt multiplier is higher, the tokens are also multiplied, so the limit is reached sooner.
It's basically token-based pricing, but you also get a limit on prompts (you can't just randomly ask the models questions; you have to optimize to make them do the most work for, say, an hour or more without you replying, or ask them to use the question tool).
The Anthropic Pro plan cost double and gave you, I don't know, a tenth the usage, depending on how efficiently you used Copilot requests, and no access to a large set of models, including GPT, Gemini, and the free ones.
Opus 4.6 is no longer available, and Opus 4.7 chews through monthly limits with reckless abandon. The value-add of GH Copilot is basically gone (at least for individuals on the Pro or Pro+ plans).
I've been having a ton of fun with Copilot CLI directed at local Qwen 3.6. If you're willing to be more specific in your prompts, then delegating from GPT-5.4 or Opus to local Qwen has been great so far.
A suggestion: Don't invest in any new hardware to run an LLM locally until you've tried the model for a while through OpenRouter.
The Qwen models are cool, but if you're coming from Opus you will be somewhere between mildly and very disappointed, depending on the complexity of your work.
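To be concrete, a trial through OpenRouter is only a few lines, since they expose an OpenAI-compatible endpoint; the model id below is just an example, so check their catalog for the exact current name:

    # Minimal sketch: trial a model through OpenRouter's OpenAI-compatible
    # API before committing to local hardware. The model id is an example;
    # substitute whatever you're actually considering.
    from openai import OpenAI

    client = OpenAI(
        base_url="https://openrouter.ai/api/v1",  # OpenRouter's OpenAI-compatible endpoint
        api_key="sk-or-...",                      # your OpenRouter API key
    )

    resp = client.chat.completions.create(
        model="qwen/qwen3-coder",  # example id -- pick the model you'd buy hardware for
        messages=[{"role": "user", "content": "Refactor this function to be iterative: ..."}],
    )
    print(resp.choices[0].message.content)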
I have thought about making a product out of something I'm building, pricing it as a percentage on top of whatever I could resell Anthropic or OpenAI (or whoever) tokens for. I get that this may be unpopular; maybe I should just stick with BYO-key.
Except Copilot doesn't bill you per token like all those companies do; they bill you per prompt, at least in Visual Studio 2026, which is insane to me. Are they just hosting all those models themselves and able to reduce the cost of doing so?
No, they are taking a massive L. That's why they paused new signups.
Just for context on the insanity: they allow recursive subagents, I believe up to 5 levels deep.
You can write a prompt telling Copilot to dig through a codebase with one subagent per file and one recursive subagent per function to do some complex codebase-wide audit. If you use Opus 4.7 to do this, it consumes a grand total of 0.5% of a Pro+ plan.
That's why this paragraph is here:
> it’s now common for a handful of requests to incur costs that exceed the plan price
I disagree. I like the standard interface, being able to easily switch models as things invariably change from week to week, and having a relationship with one company. That's why I'm a big fan of openrouter and Cursor. Not too much experience with Copilot, but I think there's a huge value add in AI middlemen.
I found the Copilot harness generally more buggy/dysfunctional. After seeing a "long" agent response get dropped (still counts against usage, of course) too many times, I gave up on the product.
It doesn't matter how competent the actual model is, or how long it's able to operate independently, if the harness can't handle it and drops responses. It made me wonder: are they even using their own harness?
At least Anthropic is obviously dogfooding on Claude Code which keeps it mostly functional.
It was so much cheaper! I subscribed to the monthly plan instead of the yearly one, thinking that the deal wouldn't last. It lasted a bit longer than expected.
Copilot was there first in AI-based development, with tab completions.
Now, it may be the right call to immediately give up and shut down after Opus 4.5, but models and subscriptions are in flux right now, so the right call is not at all obvious to me.
Agentic AI models could become commoditized; one model may excel in one area of SWE while others are good at another; local models may be at least good enough for 80% of tasks, with cloud usage falling to 20%; etc.
Staying in the market and providing multi-model and harness options (Claude and Codex usable in Copilot) is good for the market, even if you don't use it.
I don't know what they have done to Claude, but when used through Copilot it's truly awful compared to using it straight from the API.
I have always just used the API, but I decided to give Copilot a go over the weekend because of the cheap price. And I am seeing weird behavior like I have never seen before: it will somehow fail to use the file-editing tool and then spend an absolutely huge amount of time/tokens building a Python script to apply the edit in a subprocess. And it will spin its wheels on stuff the API routinely gets right in one shot.
This might have been bad timing. The Copilot API broke things last weekend, which caused a lot of tool calls in various agent harnesses to start failing, like the edit tool.
1. They heavily subsidized their plans vs. paying for API.
2. They allowed me to use the subscription in every tool I wanted.
3. It covered both Anthropic and OpenAI.
I exclusively use prepaid OAI tokens when doing copilot work in visual studio. It's really easy to set up a "custom" model. The consistency is hard to beat and I can use the latest model on day one. I also get to see how the magic happens in my provider logs. Every token accounted for.
> The value-add that Microsoft brings to Github Copilot is near zero compared to directly buying from Anthropic or OpenAI
Over here in the EU, we need to store sensitive data on EU servers. Anthropic only offers US-hosted versions of their models, while Google Cloud and Azure have EU-based servers.
> The value-add that Microsoft brings to Github Copilot is near zero
You are not their target audience.
The value add is the GitHub integration. By far the best.
GH has cloud agents that can be kicked off from VS Code; they're deeply integrated with GH and very easy to set up. You can apply enterprise policies on model access, MCP whitelists, model behavior, etc. at the GitHub Enterprise level, layered down to org and repo (multiple layers of controls for enterprises and teams). It aggregates and collects metrics across the org.
It also has tight integration with Codespaces which is pretty damn amazing. `gh codespace code` and it's an entire standalone full-stack that runs our entire app on a unique URL and GH credentials flow through into the Codespace so everything "just works". Basically full preview environments for the full application at a unique URL conveniently integrated into GH. But also a better alternative to git worktrees. This is a pretty killer runtime environment for agents because you can fully preview and work on multiple streams at once in totally isolated environments.
If you are a solo engineer, none of this is relevant and probably doesn't make sense (except Codespaces, which is pretty sweet in any case), but for orgs using the GH stack is a huge, huge value add because Microsoft is going to have a better understanding of enterprise controls.
If you want to understand the value add of Copilot, I think you need to spend a bit of time digging into the enterprise account featureset in GH, try Codespaces, try Copilot cloud agents. Then it clicks.
Over the past month, I started a GHCP ~$12 Pro sub and found I hit my quota about halfway through March or so (but I also wasn't being very... frugal). So I signed up for Claude (~$20 Pro) for a month, and I liked it at first, but the 5-hour window was very annoying, and I hit it quite a bit. The first week or so of April was nice, though: I could use Claude to the limit, then switch to GHCP. I symlinked my instructions, so it was more or less easy to switch back and forth when I hit a limit.
However, Claude changed their limits so I got to 100% very easily, and when I did hit 100%, I wasn't given a window to snapshot my work into something another agent (either a future Claude session or a GHCP agent) could easily pick up mid-task.
I found the lack of visibility into what costs what very annoying. For $20/month, you get an arbitrary amount of usage that they kept changing without notice, alerts, or visuals. I didn't renew CC after it expired and just kept on with GHCP.
Even with this announcement from GHCP, I haven't run into a limit. I'm considering upgrading to Pro+ if I don't see a limit.
But I stick with Sonnet more or less in both environments. I only used Opus for a couple of planning sessions at the very beginning; JIT planning is handled well enough by the mid-tier models.
The value-add that Microsoft brings is checking the boxes that you want checked.
If you need some random Egyptian government compliance certification for your vendors or whatever, Microsoft probably has that, Anthropic probably doesn't. Microsoft's (as well as Oracle's) entire deal these days is figuring out what customers care about compliance-wise, and structuring their offerings to deliver exactly that. Whether they're selling their own products, or re-selling somebody else who doesn't have that kind of global footprint and clout, is secondary at best.
Even in solo development, the value add of developer experience benefits of Github Copilot's integration into Github itself (kick off agent work from your phone on the GitHub site), VS Code, and other tools is quite high.
For a direct example: Anthropic's Claude Code, despite being primarily written in Python, didn't even properly support Windows until far too recently (suggesting people use WSL instead), and even now it is not a great Windows experience (requiring Git Bash, and illuminating that, among other things, Claude's models themselves haven't trained enough on PowerShell; I still try to avoid Claude's models when working on PowerShell scripts, personally).
Meanwhile VS Code works everywhere I want it to, out of the box, and VS Code's GitHub Copilot integration does the same.
Also, your "near zero" value add includes engineers at Microsoft/Github following the "which is the smartest/most practical model" meta-game for you and just silently updating defaults in Copilot for you without needing to make conscious choices. Sure, you can follow that meta yourself by watching HN every day and sampling hundreds or thousands of opinions across dozens to hundreds of stories each day, then play a "Netflix subscription game" of switching subscriptions every X months when the meta-game shifts or you can pay Microsoft to do all that research for you (which true also includes their professional business relationships/contracts with OpenAI and Anthropic, which is as much of a feature as a bug in my opinion because it's also a signal in the opinion war noise of choosing the "smartest model [for you, right now]" and does show up as its own HN stories for meta-debate). At least to me that's much more than a 0% or 1% value add, but maybe that's also because I don't trust either Anthropic or OpenAI directly, I sort of don't trust HN's comments as a strong guide to playing the meta-game, it's not a meta-game I want to play, and I'm happy to pay someone else to play it for me.
I have a GitHub Pro subscription, renewed for the 2nd year, and I just found out I can no longer use Opus with it. Opus was one of the reasons I had a subscription in the first place.
Opus 4.6 had a 3x multiplier in Pro. Now the new Opus 4.7 model has 7.5x in Pro+, which offers 5x more requests, but costs 4x more than Pro. So now Opus is essentially 2x the price it used to be.
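Working that out explicitly, normalizing the Pro price and request quota to 1 and using only the multipliers above:

    # Effective cost per Opus prompt, from the stated ratios alone.
    pro_price, pro_requests = 1.0, 1.0    # normalize Pro to 1 unit each
    plus_price, plus_requests = 4.0, 5.0  # Pro+ costs 4x and includes 5x the requests

    old = pro_price / (pro_requests / 3.0)    # Opus 4.6 at 3x on Pro   -> 3.0
    new = plus_price / (plus_requests / 7.5)  # Opus 4.7 at 7.5x on Pro+ -> 6.0

    print(new / old)  # 2.0: each Opus prompt now costs twice what it did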
Reading the comments here drives home an industry wide problem with these tools: people are just using the latest and most expensive models because they can, and because they’re cargo-culting. This is perhaps the first time that software has had this kind of problem, and coders are not exactly demonstrating great discretionary decision making.
I’ve been using Anthropic models exclusively for the last month on a large, realistic codebase, and I can count the number of times I needed to use Opus on one hand. Most of the time, Haiku is fine. About 10% of the time I splurge for Sonnet, and honestly, even some of those are unnecessary.
Folks are complaining because they lost unlimited access to a Ferrari, when a bicycle is fine for 95% of trips.
Haiku is most definitely not fine for the code bases that I work on. Sonnet is probably fine for most daily tasks, but Opus is still needed to find that pesky bug you've been chasing, or to thoroughly review your PR.
> Haiku is most definitely not fine for the code bases that I work on. Sonnet is probably fine for most daily tasks, but Opus is still needed to find that pesky bug you've been chasing, or to thoroughly review your PR.
Yeah, I hear that a lot, but it never comes with proof. Everyone is special.
I’m sure you’d find that Haiku is pretty functional if there were a constraint on your use.
> I don't think it's really helpful to tell people they're holding it wrong
I’m not saying that. If anything, it really doesn’t matter much what model you use, and it’s only a case of “you’re holding it wrong” in the sense that you have to use your brain to write code, and that if you outsource your thinking to a machine, that’s the fundamental mistake.
In other words, it’s a tool, not a magic wand. So yeah, you do have to understand how to use it, but in a fairly deterministic way, not in a mysterious woo-woo way.
It’s not snarky. It’s literally the argument people are making: I am special, my use case is exceptional, therefore I need to use the special tool, even if you don’t need to.
I use models from Opus through Haiku and down into Qwen locally hosted models.
I don't know how anyone could believe that Haiku is useful for most engineering tasks. I often try to have it take on small tasks in the codebase with well defined boundaries to try to conserve my plan limits, but half the time I end up disappointed and feeling like I wasted more time than I should have.
The differences between the models are vast. I'm not even sure how you could conclude that Haiku is usable for most work, unless you have a very different type of workload than what I work on.
More information required. What are you working on? What languages? How do you define “small tasks”? What are “well-defined boundaries”? What is your workflow?
Most importantly, define your acceptance criteria. What do you mean by “disappointed” - this word is doing most of the heavy lifting in your anecdote. (i.e. I know plenty of coders who are “disappointed” by any code that they didn’t personally write, and become reflexively snobby about LLM code quality. Not saying that’s you, but I can’t rule it out, either.)
The models are not the same, but Haiku is definitely not useless, and without a lot more detail, I just ignore anecdotal statements with this sort of hyperbole. Just to illustrate the larger point, I find something wrong with nearly everything Haiku writes, but then again, I don’t expect perfection. I’d probably get a “better” end result for most individual runs with the more expensive models, but at vastly higher cost that doesn’t justify the difference.
>> Yeah, I hear that a lot, but it never comes with proof. Everyone is special.
You were the one who made the claim that Haiku is fine most of the time. To any reasonable person, the burden of proof is on you. Maybe you should share some high level details about your codebase, like its stack, size, problem domain, and so on? Maybe they are so generic that Haiku indeed does fine for you.
Most of the people using these models aren't skilled enough to make that determination. It seems rough to sell yourself as the thing that means users don't need to understand what they're doing, while also insisting they understand what they're doing well enough to select an appropriate model.
> I’ve been using Anthropic models exclusively for the last month on a large, realistic codebase, and I can count the number of times I needed to use Opus on one hand. Most of the time, Haiku is fine. About 10% of the time I splurge for Sonnet, and honestly, even some of those are unnecessary.
I mean at some point some people learn...
I was doing Opus for nasty stuff or otherwise at most planning and then using Sonnet to execute.
Buuuuut I'm dealing with a lot of nonstandard use cases and/or sloppy codebases.
Also, at work, Haiku isn't an enabled model.
But also, if I or my employer are paying for premium requests, then they should be served appropriately.
As it stands this announcement smells of "We know our pricing was predatory and here is the rug pull."
My other, lesser worry isn't that Opus 4.7 has a 7.5x multiplier; it's that the multiplier is quoted as an 'introductory' rate.
Haiku is complete crap compared to Sonnet in GHCP. A basic task takes 3 prompts in Haiku, with a lot of correction, versus 1 prompt in Sonnet. It isn't worth a third of the price if I have to fix it twice.
I think it heavily depends on how you're using it. If you understand your codebase and you're using it like "build a function that does x in y file" then smaller/cheaper models are great. But if you're saying "hey build this relatively complex feature following the 30,000 foot view spec in this markdown doc" then Haiku doesn't work (unless your "complex feature" is just an api endpoint and some UI that consumes it).
I largely agree. But that goes back to my point (albeit with mixed metaphors): there are lots of people who are just hitting things with a jackhammer in lieu of understanding how to properly use a hammer.
I basically never just yolo large code changes, and use my taste and experience to guide the tools along. For this, Haiku is perfectly fine in nearly all circumstances.
AI should decide the level of model needed, and fall back if it fails.
It mostly is a UX problem. Why do I need to specify the level of model beforehand?
Many problems don't allow decision pre-implementation.
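To sketch what I mean (hypothetical code; every name here is invented, and it assumes an external check like a test suite rather than the model grading itself):

    # Hypothetical sketch of "cheap model first, escalate on failure".
    MODEL_LADDER = ["haiku", "sonnet", "opus"]  # cheapest first

    def run_agent(model: str, task: str) -> str:
        """Stub: send the task to the given model, return a candidate patch."""
        raise NotImplementedError

    def tests_pass(patch: str) -> bool:
        """Stub: apply the patch in a sandbox and run the test suite."""
        raise NotImplementedError

    def solve(task: str) -> str:
        for model in MODEL_LADDER:
            patch = run_agent(model, task)
            if tests_pass(patch):
                return patch  # cheapest model that produced a passing patch
        raise RuntimeError("even the top model failed; hand it to a human")

The hard part, of course, is that most work doesn't come with a check that cheap.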
That’s certainly an opinion. Not one I agree with, but sure, if you entirely outsource all of your thinking to the magic box, then you probably want the box to have the strongest possible magic.
This is the approach of Auto in Cursor, and I've not been impressed with it at all. I think I'm always getting Composer, and while it's fast, it wastes my time. GLM 5.1 in OpenCode is far better and less expensive; it can do planning and implementation both very effectively. Opus is still the best, but GPT 5.4 (in Codex) is good enough too, and way more affordable.
This would require LLMs being good at knowing when they are doing a bad job, which they are still terrible at. With a good testing and verification harness set up, sure, then it could just go to a more powerful model if it can't make tests pass. But not a lot of usage is like this.
Model selection for day to day tasks based on vibes is not very scientific. Micromanaging the model doesn't seem like a great idea when doing real professional work with professional goals/deadlines/pressures.
It’s deeply ironic that the folks who want to outsource as much thought to the model as possible are saying that my stance - use your brain to decide the right tool for the job - is tantamount to “vibes”.
You are being deeply reductive and that's against the spirit of hacker news. The issue is that models are difficult to objectively benchmark. The benchmarks don't always align with real world performance. It's not easy and clear cut to determine which model will work best in a given situation. It boils down to loose experiences/anecdotes. Do you have an objective criteria for model selection that you have tested to be effective with reproducible tests?
> Micromanaging the model doesn't seem like a great idea when doing real professional work with professional goals/deadlines/pressures.
Remember that it's not only the cost per token, but also speed. Some tasks are done faster with simpler/less-thinking models, so it might actually make sense to micromanage the model when you have deadlines.
> people are just using the latest and most expensive models because they can, and because they’re cargo-culting. This is perhaps the first time that software has had this kind of problem, and coders are not exactly demonstrating great discretionary decision making.
> I’ve been using Anthropic models exclusively for the last month on a large, realistic codebase, and I can count the number of times I needed to use Opus on one hand. Most of the time, Haiku is fine. About 10% of the time I splurge for Sonnet, and honestly, even some of those are unnecessary.
You and I couldn't have more different experiences. Opus 4.7 on the max setting still gets lost and chokes on a lot of my tasks.
I switch to Sonnet for simpler tasks like refactoring where I can lay out all of the expectations in detail, but even with Opus 4.7 I can often go through my entire 5-hour credit limit just trying to get it to converge on a reasonable plan. This is in a medium size codebase.
For people putting together simple web apps, Sonnet with a mix of Haiku might be fine, but we have a long way to go with LLMs before even the SOTA models are trustworthy for complex tasks.
I don’t use Haiku for planning of big tasks, so we basically agree on that. But even just Sonnet 4.6, on a fairly large codebase, only truly goes into the weeds maybe 10% of the time for me. I also write pretty specific initial prompts, and have a good idea of how I want the code to work before I start prompting. For example, sometimes I will spend several hours writing a spec before even picking up the power tools.
I have never had the situation you describe, where Opus won’t come up with “a reasonable plan”, but your definition of “reasonable” might be very different than mine, and of course, running through your credit limit is an entirely tangential problem.
- If you pay for unlimited trips, will you choose the Ferrari or the old VW? Both are waiting outside your door, ready to go.
- Providers that let you choose models don't really price much difference between lower-class models. On my grandfathered Cursor plan, I pay 1x request to use Composer 2 or 2x to use Opus 4.6. Until the pricing is differentiated enough that people can say "OK, yes, Opus is smarter, but paying 10x more when Haiku would do the same isn't worth it", it won't happen.
Agreed on both points. We’re dealing with a cost/benefit analysis, and to this point, coders have been subsidized, coerced…maybe even mandated into using the most expensive option as if it was a limitless resource. Clearly not true, and so of course we’re going to see nerfing of the tools over time.
Obviously we’re a long way away from being able to rationally evaluate whether the value of X tokens in model Y is better than model Z, let alone better in terms of developer cost, but that’s kind of where we need to get to, otherwise the model providers are selling magic beans rated in ineffable units of magicalness. The only rational behavior in such a world is to gorge yourself.
85% of my coding tasks can be handled by either GLM or Sonnet. The truth of the matter is that most software isn't that complicated. Even more hilarious is that people were running Opus in their OpenClaw setups. I'm glad Anthropic kicked them to the curb.
Claude Code doesn't have an option to use Opus 4.6 any more for me. It was great, but I guess now I have to use it half as much or upgrade my subscription again.
>people are just using the latest and most expensive models because they can,
While I agree with the sentiment, I think that might have been initially driven by older models being nerfed and/or newer ones being better in tokens per dollar. And there is this notion that the labs don't constrain a model in the first days after its release.
Of course you don't NEED the better models, but figuring out what model you need can waste a lot of time and effort.
Even when a cheap model is capable of a task it needs a lot more guidance than a more expensive one.
They are also less reliable. You can waste a lot of time cleaning up after them.
Judging whether something is good enough is hard work and rerolling with a more expensive model is painful.
Judging the difficulty of a task ahead of time is very hard. Judging how good a model is for a given task even harder, especially when models and harnesses keep changing all the time.
The real productivity boost LLMs provide is already modest and when you start tinkering with models it can easily evaporate.
> This whole thing is a massive asshole move, probably illegal in all countries with a minimum set of consumer protections
Why would it be illegal in any country? Did you pay for a year upfront? Even if so, they're offering a pro-rated refund according to the linked blog post:
> If you hit unexpected limits or these changes just don’t work for you, you can cancel your Pro or Pro+ subscription and receive a refund for the time remaining on your current subscription by visiting your Billing settings before May 20
Not sure where the expectation came from that a business should continue serving you at a given price till the end of time, no matter what.
I'm in the same boat as you. Wish I had known this before my subscription renewed. There's no longer any value in paying them for this service when I can cut them out of the equation and pay the model providers directly.
This thread is pretty quiet for what strikes me as a substantial set of changes with, presumably, more substantial changes still to come for anyone not grandfathered into a Pro plan.
I get the impression that the intersection of HN posters and Copilot users is quite small in practice; that Claude Code and Codex suck up all the oxygen in this room. But it seems plausible we’ll see similar “true costs greatly exceed our current subscription pricing” from Anthropic and OpenAI someday soon…
> If you hit unexpected limits or these changes just don’t work for you, you can cancel your Pro or Pro+ subscription and you will not be charged for April usage. Please reach out to GitHub support between April 20 and May 20 for a refund.
Using Copilot Pro with Pi is way better and smarter than using Claude Code. I haven't gotten a single e-mail; I just wanted to use Opus (I use Sonnet 95% of the time, with Opus for issues where Sonnet is struggling) and got an error message. No prior warning, nothing. I'm pissed. They just rugpulled all paying customers, man. I liked Copilot because I could plan my usage over a whole month and wasn't "forced" to use it for a week before hitting limits, unlike Claude and Codex.
Do you have a citation on this? I have a Claude Pro subscription and looked at the comparison page and it says this under Pro:
Everything in Free and:
- Claude Code directly in your codebase
- Power through tasks with Cowork
- Higher usage limits
- Deep research and analysis
- Memory that carries across conversations
Speaking as someone whose only 'real' option at work is the Copilot plugin, but who also uses the Copilot plugin at home...
This is a shitty shitty shitty move.
As a personal user, I can now only use Opus 4.7 at a 7.5x 'introductory' multiplier if I upgrade to Pro+, but at work I can still apparently use Opus 4.6 at a 3x multiplier on my 'enterprise' account.
Honestly, it strikes me as though someone at GitHub Copilot took Palantir's manifesto to heart: screw the individual, consolidate power to companies at every level.
> But it seems plausible we’ll see similar "true costs greatly exceed our current subscription pricing" from Anthropic and OpenAI someday soon
Enterprise might stick around, but individually, I reckon developers will flock to OpenCode + open weights (Qwen/GLM/Codestral). The problem then is, if the open-weight models impress these new adopters, they will shout about it from the rooftops (conferences, social media, blogs) in unison, which might result in an exodus. That's especially troublesome considering developers are a major market for both frontier labs (Anthropic & OpenAI) and their IPO ambitions.
Yesterday, Opus 4.6 cost three credits. You can no longer use 4.6 or 4.5.
Opus 4.7 is available today for 7.5 credits per prompt.
They have also suspended new signups.
After testing all of the major IDEs/tools that integrate with LLMs over the last four weeks, I was happy to settle on Copilot. I, and others, seem to be a lot less confident in that decision now. Especially since there seems to be no refund path for people who prepaid for a year.
In my 30+ years online, I've never seen an industry change so much in terms of pricing, service levels, etc, as I have the last two months.
I'm really curious where all of this lands, and if AI coding tools will be something that only a small percentage can genuinely afford at a competitive level.
> In my 30+ years online, I've never seen an industry change so much in terms of pricing, service levels, etc, as I have the last two months.
Warning: baseless speculation/theorizing ahead.
This is the consequence of LLM inference being really expensive to run, and LLM inference companies being really attractive to VCs. The VC silly money means their costs are totally decoupled from revenue for a while, but I guess eventually people look at incomings vs outgoings and start asking questions.
Previous big trends like SaaS apps, NFTs, and blockchain were similarly attractive to VCs (at least for a period of time for the last two; the first one is still pretty attractive to VCs), but nowhere near as expensive to run, so the behaviour of the companies running them wasn't quite the same.
AI is still in the "VCs subsidizing everything" phase.
So:
- DO use AIs to build tools for yourself faster. If the AI goes away, the dashboard and scripts you made will still work.
- DO NOT build your business on top of 3rd-party AI services with no way of swapping the backend easily. The question isn't whether there's going to be a "rug-pull", but when it happens. It might be sudden like this one, or gradual, where they just pump up the price like slowly boiling a frog.
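As a sketch of what "easy to swap" can look like (the names here are illustrative, not a real library): keep your business logic on a tiny interface of your own, and let any OpenAI-compatible endpoint satisfy it, so a vendor change is one constructor argument.

    from typing import Protocol

    class CompletionBackend(Protocol):
        def complete(self, prompt: str) -> str: ...

    class OpenAICompatibleBackend:
        """Works for any OpenAI-compatible endpoint (a frontier lab,
        OpenRouter, or a local server); swap vendors by changing base_url."""
        def __init__(self, base_url: str, api_key: str, model: str):
            from openai import OpenAI
            self._client = OpenAI(base_url=base_url, api_key=api_key)
            self._model = model

        def complete(self, prompt: str) -> str:
            resp = self._client.chat.completions.create(
                model=self._model,
                messages=[{"role": "user", "content": prompt}],
            )
            return resp.choices[0].message.content

    def summarize(backend: CompletionBackend, text: str) -> str:
        # Business logic only ever sees the interface, never the vendor SDK.
        return backend.complete("Summarize:\n" + text)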
>it’s now common for a handful of requests to incur costs that exceed the plan price!
I think this is really telling. The cost of AI has really been masked HUGELY to drive adoption. The true cost is likely to be unsustainable for the big complex tasks (agents running for hours+) that companies have been pushing.
I was skeptical, then quietly bullish on AI, but I'm now seeing signs the market is cracking and that availability is going to recede/costs balloon.
Claude Code is definitely token-based; it's been discussed extensively on Hacker News and the related GitHub threads. A large context-cache miss can easily take half your usage in just one request. "Max" just means more reasoning tokens. I've also run out of usage during a single request in Cowork. It's definitely token-based.
Opus was the reason I paid for GitHub copilot, but they had the pricing model completely wrong. I could assign copilot to a substantial issue using Opus and have it handle 30 minutes of work with many subagents with iterative testing. With 300 "premium requests" a month I could have copilot do substantial work for 3 premium requests per issue. It was very clear that this was unsustainable for Microsoft to pay for, so I expected change to come.
However, I never expected Opus 4.6 to be removed for the cheaper plans. I expected the pricing model to change, but not to lose access to the model. Moving to being token-based makes sense. It makes the cost more closely aligned with user pricing.
It was nice while it lasted. I got Opus 4.5 to do a lot of work from the beach by assigning it to detailed issues. With this news I've cancelled my Pro subscription. That will help a bit with their capacity issues.
> Opus was the reason I paid for GitHub copilot, but they had the pricing model completely wrong. I could assign copilot to a substantial issue using Opus and have it handle 30 minutes of work with many subagents with iterative testing. With 300 "premium requests" a month I could have copilot do substantial work for 3 premium requests per issue. It was very clear that this was unsustainable for Microsoft to pay for, so I expected change to come.
Yes, I was hearing that a lot over the past few weeks. It's pretty clear what happened:
Anthropic demand soared from OpenClaw, but they were already oversold. Cowork shipped, Hegseth flexed, and a lot of people and entire orgs moved from ChatGPT to Claude in the space of a month. They couldn't handle the demand: they quantized their models, dropped effective usage limits, and made all kinds of tweaks in Claude Code to reduce token burn.
Lots of customers fled to Codex, and they got crunched as well. Some people noticed Copilot was still selling dollars for nickels and mentioned it to other people.
The only question I care about: Is z.ai next? GLM 5.1 is simply where its at right now. It is not the best model, but it is much better than Sonnet at 1/5 the cost.
Note that the 7.5x multiplier is only for the promotional period (until end of April), then it'll get even worse. If I had to guess it'll be priced at 10x.
Good thing I had just finished migrating all of my workflows to OpenCode for the time being!
It's a shame because the VsCode copilot experience is quite good out of the box compared to all of the other harnesses I've used. But with typical lack of transparency, and sudden, harsh changes... What are they thinking?
After the restrictive rate limiting they've already instituted, I'm simply cancelling and continuing by using providers directly.
Without access to Opus, it seems to me that the limited context size you get isn't worth the subsidy. Especially once you blow through the included requests, the cost seems kind of hard to predict, because it's 'request'-based, not token-based.
I just opted for ChatGPT Pro; I'm trying to take advantage of the increased usage limits they are offering, and from what I heard, Claude Pro was also having bad rate limiting for many people.
Great. I, a small consultancy, have just spent the last month working out a workload that uses Opus 4.6 via VS Code to prep horrible, inconsistent survey data for upload to a proprietary platform. It worked a treat with some light babysitting.
It's the sort of messy job that agents excel at. Decisions need to be made on free text data, translations done into multiple languages, ambiguity handled.
I now need to recheck it still works with another model, which involves a lot of manual verification; and potentially move to Claude Code and pay more money I can ill afford right now.
I'm not even clear from the post when this comes in, I'm guessing effective immediately.
This really hammers home for me the point that we should not be renting our tools.
My own dumb fault for trusting them, I will make sure to learn from this.
So this is pretty devastating to my general workflows [1] right now, and poorly timed to boot, with no wind-down at all.
It was clear (see the linked post from 70 days ago) that the current offering was unsustainable, but I'm a bit taken aback at how sharp the clawback is.
Yes, GitHub's per-request pricing was insane; anyone suggesting using CC instead, or asking if any other provider is as cheap, just doesn't understand the insanity. They were clearly losing a lot of money on the people making good use of it.
I was actually hoping they would change it to something that more closely tracks their actual costs so that they wouldn't have to rug-pull this badly. What was really bad in particular was that sending prompts to agents while they were working (to give them corrections) cost extra, so I stopped doing that (initially OpenCode didn't trigger billing for that, until they became official).
I guess it makes more sense for me to just get Claude Pro instead. I was using my Copilot license only because of Opus 4.6 access, as all the other models seemed crippled in comparison in Copilot. It doesn't even make sense to upgrade to Pro+, which goes from $10/mo to $40/mo and only gives you access to a model with 7.5x the rate; 5x the limit at 7.5x the rate for 4x the price does not seem appealing at all.
A test to see if they could get away with it. I think we're really in the thick of token rationing right now and the fallout is going to be funny to watch.
I wouldn't mind this change that much if opus-4.7 worked properly in copilot cli. It keeps stopping mid-thought or task and forces me to waste more prompts for no observable reason.
Looks like I'm ending my subscription; good (likely too good: there's no way my account was even remotely within profitable range) access to opus-4.6 was the only reason I used this at all.
Are you using it through regular Copilot (the 'local' agent type), or through the separate Claude agent type (which I believe you have to activate in your repository settings on GitHub)?
I had the exact same issues with the latter - randomly stops working, wipes chat history, just generally seems to be totally broken. But the former works totally fine and still lets you select sonnet/opus. My experience was before this recent 4.6 -> 4.7 change though.
Regular local agent. Seems like as soon as the context fills up (and it only has about 160k of context so that doesn't take much) it starts to fall to pieces. I even tried using opencode as a harness instead and it causes opus 4.7 to lose all memory every time I hit a compaction step.
Welp. I already added a $20 Claude Pro subscription to complement my $10 Github Copilot Pro subscription and $10 DuckDuckGo Plus. That was partly to show support for Anthropic after the OpenAI/DOD episode, but also because I've been using Opus 4.5 exclusively with Copilot and I figured I should try Claude Code eventually.
Now it's going to cost me an upgrade to $39 Github Pro+ to keep using Opus, and even then it's with much higher multipliers. I don't fully understand the extent to which this reflects actual costs for Opus versus Microsoft leveraging network effects to discourage the usage of a competitor.
I didn't really want to wander outside of VSCode just yet because I was happy with VSCode/Copilot/Opus-4.5 and I don't want to spend all my time experimenting when stuff is changing so fast. But I guess my hand has been forced.
You can also use Claude Code in a VS Code terminal window, which I much prefer for reasons I can’t quite put my finger on. Granted, I’ve moved to Zed in the past few months. I’m doing the same there.
I cannot describe how disappointing it is to be switching to this insane time-window-based pricing. I absolutely abhor that I'll be subjected to 5-hour chunks of time where I'll hit a limit at some point in the window and be told I have to wait. And then there is a weekly limit.
That's not how my creative energy works. I have time that I want to solve problems, and I want to solve them. I don't want a cooldown timer applied to solving a problem. Not to mention the anxiety of realizing that, while I sleep, I could have been burning tokens.
I was incredibly disappointed when I sat down to my hobbyist programming time and realized Copilot had suddenly and dramatically changed in a way that is incredibly disheartening.
Meter my token usage, DON'T tell me when I can use them! ARGH.
> I was incredibly disappointed when I sat down to my hobbyist programming time and realized Copilot had suddenly and dramatically changed in a way that is incredibly disheartening.
Guess it’s time to rediscover the lost art of programming without an LLM.
I cannot understand people still using Anthropic models on Copilot when GPT 5.4 is better and 3 to 7 times cheaper. Anthropic quite obviously raised their licensing fees to the max. You can probably still get a taste of it for a few minutes before being limited on their own subscription.
Simple, for what I'm doing Opus 4.6 (and before that, Opus 4.5) are just much better at following my instructions and achieve consistently better results.
From what I've been gathering, this split in success seems to depend a lot on the types of tasks, the domains / programming languages / frameworks used, and style of prompting.
I couldn't get 5.2 to follow instructions for the life of me, even when repeating multiple times to do / not do something. 5.3-codex was an improvement, and 5.4, while _usually_ decent, still regularly forgets, goes on unnecessary tangents, or otherwise repeatedly stops just to ask for continuation.
Sure, I'm paying 3x more per request, but I'm also doing 5x fewer requests.
Or well, used to. Still bummed about them dropping 4.6.
My experience is similar. Opus, especially Opus 4.5, understands my intentions better even when poorly phrased, and more consistently follows my instructions to do only what's necessary and no more.
As far as I can tell, the distinctive feature of my workflow is that I'm giving it small, contained single-commit-sized tasks and limited context. For instance: "For all controller `output()` functions under `Controller/Edit/` and `Controller/Report/`, ensure that they check `Auth::userCanManage`." Others seem to be taking bigger swings.
Anecdotally, I experimented with GPT-5.4 xhigh, and something about the code it wrote just didn't vibe with me.
It felt like I constantly had to go back and either fix things or I just didn't like the results; the forward momentum/progress on my projects overall wasn't there over time. Even though it's cheaper, it just doesn't feel worth it, to the point where I start to feel negative emotions.
I'm actually a bit worried that I've somehow come to feel more negative emotions with agentic coding. I'm quicker to feel frustrated somehow when things aren't working.
Same for me. I would still be happy with my Copilot Pro subscription if I could use 5.4 with 1x coefficient (and 5.4 mini with 0.33x).
But seeing that they have stopped taking new subscriptions, and the rumours/evidence that they plan to increase the coefficients of the remaining models, it seems they want us to see "the writing on the wall".
GPT's output is awful and it gets even more awful when you try to work out a solution "together" because it shits out 10 paragraphs with 20 options instead of focusing and getting things done.
I'm not surprised at all. This was one of the most generous plans out there, offering frankly ridiculous pricing based on a single prompt regardless of turns taken or tokens used. I was subscribed for a month around Christmas and got a shitload of tokens out of Opus 4.5 for a measly $10.
context: using student pack's "pro" plan for a long time, with exposure to enterprise "pro" plan also.
given the recent changes that kneecapped the plan for students [1], i feel less bad after seeing this. i always had the monthly limit on premium requests shown in the extension (which i would watch creep up in dread); the daily/weekly "usage limits" part seems ambiguous at best.
using agentic workloads as the basis for this change does not sit quite right with me. if you look at the newly added debug mode, you may notice the token consumption as well as the subagent/tool calls made behind the scenes. my takeaways:
- it consumes way too many tokens for simple tasks (had one use case where the agent burnt 16+ million tokens just to make a 50 line change in a monorepo using the plan -> agent approach)
- even when you select a model in the dropdown, the subagents/tools can be called with an entirely different model, often the haiku-4.5. gpt-4o is widely used for creating summaries or titles to display for the plan.
- the new reasoning modes have exacerbated the token burning, as the agent tends to loop a whole lot. the prompt vs plan token ratio is quite minuscule, and when combined with your own instruction files and skills, it just goes out the window.
i think they offered a generous model in the past, but by kneecapping the lower tier, it no longer justifies its existence. if they want to raise prices, they can raise the floor. or rather, put some work into improving their own orchestration system before putting the blame on the users vibing it out.
It's quite cheap: $10 for 1000 premium requests (1 request is like a plan mode + implementation + tests + commit & push). The only problem is I have already used them all, but I was billed on the 3rd day of the month and have to wait till next month to use it again.
Damn, it was good while it lasted, but it was obvious the previous per-request pricing scheme was misaligned with their actual costs. MS's product people must be seriously detached from their technical and financial people for it to have even lasted this long (or they're willing to burn a lot of money for the typical "make customers happy and then rug pull" cycle, but hey, Hanlon's razor).
Given that they've already silently had session + weekly rate limits for at least the past couple of weeks (I've hit them), I wonder if this change is just making them visible to the user, or if it's actually tightening them too.
If it's the former then I can say they're still significantly more generous than claude pro (on the pro+ plan), so this might be okay. If it's the latter, and the new limits are similar to claude pro then copilot is going to be significantly less useful to me.
Oh nuts, I forgot I was on Copilot. I used to use it for auto-complete and so on. I haven't used it in over a year and I'm still paying for it. If you're like me you'll find it here: https://github.com/settings/billing/licensing
And you can then cancel it. I have no idea what a premium request is and it's all just too complicated to use.
Microslop, 'xcuse me, Microsoft is working hard to make GitHub less and less appealing. It's a bit weird how an initially fairly good idea becomes, over time... worse.
I wonder what this will mean for all of my students. I have been weaving copilot into my software engineering and code centric courses, but that depends on copilot pro as provided by the GitHub education package. If those signups are paused too, that means I can’t bring in any new ones, at least not to this stack. For me, I’ll probably have to send them to Claude code where they will have to pay for access. (Though it is a better product IMHO.)
This points toward a deeper issue though. We’ll probably see more individual offerings dry up over time. That means you’ll have individuals stuck with hand coding while the hyper productive AI assisted coders will all be at large organizations. If that happens, we’ll enter a phase where computing will once more be available exclusively to the elite few.
> it’s now common for a handful of requests to incur costs that exceed the plan price
Pricing per turn/request was/is an idiotic model, and I'm glad they are paying for it. It just forces you into a workflow designed to work around the business model. Heck, the best laugh would be to create a plan outside VS Code with interactive CC/Codex, then copy-paste it into GH Copilot to do a single-session burn of a few million tokens.
So far they did not change it, and none of this applies to business and enterprise accounts. My guess is that it can still be viable, as most businesses will have plenty of minimally used licenses with just a few power users abusing the request model.
Demand is increasing exponentially but supply is increasing linearly. NIMBYs inventing lies about datacenters sucking up water or making noise are going to drive prices through the roof.
Noise in residential areas is already a huge problem, and data centers do in fact make it worse. They may be able to carve out exceptions in laws or push non-enforcement, but none of that changes the impact on human health.
This is some shit, coming with 0 notice at the start of a work week. My exposure to Claude is only via Copilot which has worked very well for my purposes. I didn't have to learn a ton for it to just start working. I guess I'll look into other options now as I really want to continue using Opus, but don't have a need to 4x my spend on Copilot quite yet.
It's quite telling: they've paused new signups because Microsoft doesn't have enough compute, and they moved Opus to a higher tier because Anthropic doesn't have enough compute either.
They're all operating at a loss, enshittification is coming for us all.
I'm a paying customer and I did not receive ANY communication about this. Was using Opus this afternoon and then it disappeared.
Microsoft really can't stop being Microsoft. I don't dispute the need to charge more for those models, but there is a basic decency to how you do things, and as usual the Big Tech fuckery and complete lack of morals makes them do this in a way that generates total mistrust where there could have been mere annoyance.
I'll see how Sonnet handles the most difficult problems, but I foresee a subscription cancellation soon.
It was great while it lasted - so great in fact, I was able to ignore the fact I was paying Microslop for it. Not bummed about it, now I can be completely M$ free again.
> I don't understand why anyone would want to deal with Microsoft as a vendor if they don't have to.
One subscription for access to most of the models.
> I don't understand why anyone would want to deal with Microsoft as a vendor if they don't have to.
It can bill to our Azure sub and I don't have to go through the internal bureaucracy of purchasing a new product/service from a new vendor.
This is pretty straightforward compared to the giant universe of companies that resell Microsoft services.
The number of intermediaries that some customers, especially governmental agencies, go through to get just an Azure bill can be wild...
I've recently been unwillingly exposed to this side of things. It's truly insane; there must be a better way?
It's BASIC. LET them GOTO jail (BSD also comes to mind)
(see https://en.wikipedia.org/wiki/Microsoft_licensing_corruption... ) The upside: the EU finally got a prosecutor. And last but not least everybody forgot why the Baby Bells were born.
> MS is not allowed to train models on our data
They just altered this deal for everyone else. I wonder how long they will wait before opting you all into training by default, too?
PSA: You only have about 36 hours left to opt-out!
It’s got pretty good integration into vscode, and you can bypass the key anyway
It's understandable but sad that this will often be the reason.
Microsoft's USP in one sentence.
Because if you’re a vscode user, up until a couple days ago you could hammer Opus 4.6 all day, every day, and pay nowhere close to the Claude Max plan. Many people exploited this, and the subsidy is closing.
Good, I hope Microsoft lost a lot of money in the deal.
From a friend in GitHub: they've been burning so much money because of Opus.
Just use claude code directly with a pro plan instead of copilot for roughly the same cost.
Oh wait, nevermind.
https://news.ycombinator.com/item?id=47855565
> Just use claude code directly with a pro plan
Usage limits are/were higher in Copilot. They also charge per prompt, not per token.
Yes, I loved my $10 a month personal subscription for light coding tasks; it worked great. I'd use Claude Code Max for heavy lifting, but the $10 a month Copilot plan kept me off Cursor for the IDE-centric things.
Me too. Claude isn't the best option when all you do is ask "what's this error message", every 10 minutes or so.
Well, they charge per prompt, but with usage limits it's a mix of tokens and prompts: if a prompt's multiplier is higher, its tokens are also multiplied, so the limit is reached sooner.
It is basically token-based pricing, but you also get a cap on prompts (you can't just randomly ask the models questions; you have to optimize to make them do the most work for, e.g., hour(s) without you replying, or ask them to use the question tool).
The Anthropic Pro plan cost double and gave you, I don't know, a tenth the usage, depending on how efficiently you used Copilot requests, and no access to a large set of models including GPT and Gemini and free ones.
Opus 4.6 is no longer available and Opus 4.7 chews through monthly limits with reckless abandon. The value-add of GH Copilot is basically gone (at least for individuals on the Pro or Pro+ plans.)
Yeah this was me. I just got a message that I hit my limit and now I am looking into what it takes to run Qwen on local hardware.
Been having a ton of fun with copilot cli directed to local qwen 3.6. If you’re willing to increase the amount of specificity in your prompts then delegating from a GPT-5.4 or Opus to local qwen has been great so far.
A suggestion: Don't invest in any new hardware to run an LLM locally until you've tried the model for a while through OpenRouter.
The Qwen models are cool, but if you're coming from Opus you will be somewhere between mildly to very disappointed depending on the complexity of your work.
OpenRouter-served models are often more heavily quantized than what you can run locally, or try for yourself on generic cloud-based infrastructure.
I have to say, this was how I used GitHub Copilot in vscode. I used Opus 4.6 for most tasks. I am not sure I want to keep my Copilot plan now.
Exactly, it was just simply much cheaper and perfect for my usecase.
It makes enterprise deployments much easier because most orgs already have github enterprise.
I have thought about making a product out of something I'm building and pricing it as a percentage on top of whatever I could resell Anthropic or OpenAI (or whoever) tokens for. I get that this may be unpopular; maybe I should just stick with BYO-key.
Except Copilot doesn't bill you per token like all those companies do; they bill you per prompt, at least in Copilot for Visual Studio 2026, which is insane to me. Are they just hosting all those models themselves and somehow able to reduce the cost of doing so?
No, like every other provider they're just losing money and hoping this will some day magically become profitable
No, they are taking the massive L. That's why they paused new signups.
Just for context on the insanity: they allow recursive subagents, I believe to 5 levels deep.
You can write one prompt telling Copilot to dig through a codebase, with one subagent per file and one recursive subagent per function, to do some complex codebase-wide audit. If you use Opus 4.7 to do this, it consumes a grand total of 0.5% of a Pro+ plan.
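To make the blow-up concrete, a back-of-the-envelope sketch (every number below is an illustrative guess, not Copilot's internals):

    # One audit prompt fans out: one subagent per file, one per function.
    files = 200
    functions_per_file = 15
    tokens_per_subagent = 30_000  # context + output per subagent run (a guess)

    subagents = files + files * functions_per_file  # 3,200 agent runs
    total_tokens = subagents * tokens_per_subagent  # 96,000,000 tokens
    print(f"{subagents:,} runs, ~{total_tokens / 1e6:.0f}M tokens, billed as one premium request")

Tens of millions of tokens billed as a single request is exactly the mismatch the announcement is about.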
That's why this paragraph is here:
> it’s now common for a handful of requests to incur costs that exceed the plan price
I wonder how many of those requests are "necessary", or end up being more correct/efficient than a single agent going through the tasks linearly.
I disagree. I like the standard interface, being able to easily switch models as things invariably change from week to week, and having a relationship with one company. That's why I'm a big fan of openrouter and Cursor. Not too much experience with Copilot, but I think there's a huge value add in AI middlemen.
I also just saw:
> Claude Code to be removed from Pro Tier?
> https://news.ycombinator.com/item?id=47855565
I found the Copilot harness generally more buggy/dysfunctional. After seeing a "long" agent response get dropped (still counting against usage, of course) too many times, I gave up on the product.
It doesn't matter how competent the actual model is, or how long it's able to operate independently, if the harness can't handle it and drops responses. It made me wonder: are they even using their own harness?
At least Anthropic is obviously dogfooding on Claude Code which keeps it mostly functional.
I only ever used Copilot through OpenCode and for a while it was a crazy good deal. Quite possibly two orders of magnitude cheaper than API credits.
It was great while it lasted.
Some Opus models were free on Copilot, and in my country you cannot attach a repo to Gemini; that is limited to their premium offerings.
Which Opus models were free on Copilot?
Haiku, right?
It was so much cheaper! I subscribed to the monthly plan instead of the yearly one, thinking the deal wouldn't last. It has lasted a bit longer than expected.
Copilot was there in AI based development first with tab completions.
Now, it may be the right call to immediately give up and shut down after Opus 4.5, but models and subscriptions are in flux right now, so the right call is not at all obvious to me.
The agentic AI models could be commoditized, some model may excel in one area of SWE, while others are good for another area, local models may be at least good enough for 80%, and cloud usage could fall to 20%, etc. etc.
Staying in the market and providing multi-model and harness options (Claude and Codex usable in Copilot) is good for the market, even if you don't use it.
I don't know what they have done to Claude, but when used through Copilot it's truly awful compared to using it straight from the API.
I have always just used the API, but I decided to give Copilot a go on the weekend because of the cheap price. And I am seeing weird behavior like I have never seen before... It will somehow fail to use the file-editing tool and then spend an absolutely huge amount of time/tokens building a Python script to apply the edit in a subprocess... And it will spin its wheels on stuff the API routinely gets right in one shot.
This might have been bad timing. The Copilot API broke things last weekend, which caused a lot of tool calls in various agent harnesses to start failing, like the edit tool.
Example zed issue https://github.com/zed-industries/zed/issues/54219?issue=zed...
The value add for me is that I can use the web UI to start chatting about and drafting stuff on my phone while I'm commuting to work.
1. They heavily subsidized their plans vs. paying for API.
2. They allowed me to use the subscription in every tool I wanted.
3. It covered both Anthropic and OpenAI.
> if they don't have to.
That's the only reason.
In many enterprises you'd need to be very lucky to get an approval for any service that doesn't come from MS.
I exclusively use prepaid OAI tokens when doing copilot work in visual studio. It's really easy to set up a "custom" model. The consistency is hard to beat and I can use the latest model on day one. I also get to see how the magic happens in my provider logs. Every token accounted for.
Because I can swap between multiple models at the same time and ask them to rubber-duck against each other? If anything, I'd like more models in GitHub.
> The value-add that Microsoft brings to Github Copilot is near zero compared to directly buying from Anthropic or OpenAI
Over here in the EU, we need to store sensitive data on an EU server. Anthropic only offers a US-hosted version of their models, while G-cloud and Azure have EU-based servers.
You are not their target audience.
The value add is the GitHub integration. By far the best.
GH has cloud agents that can be kicked off from VS Code; deeply integrated with GH and very easy to set up. You can apply enterprise policies on model access, MCP white lists, model behavior, etc. from GitHub enterprise and layered down to org and repo (multiple layers of controls for enterprises and teams). It aggregates and collects metrics across the org.
It also has tight integration with Codespaces which is pretty damn amazing. `gh codespace code` and it's an entire standalone full-stack that runs our entire app on a unique URL and GH credentials flow through into the Codespace so everything "just works". Basically full preview environments for the full application at a unique URL conveniently integrated into GH. But also a better alternative to git worktrees. This is a pretty killer runtime environment for agents because you can fully preview and work on multiple streams at once in totally isolated environments.
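For the curious, the flow is roughly this (repo and branch names are made up; flags per the gh CLI docs):

    # create an isolated full-stack environment for a branch, then open it in VS Code
    gh codespace create --repo myorg/myapp --branch agent-experiment
    gh codespace code

Each codespace gets its own URL and credentials, so parallel agent runs don't step on each other.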
If you are a solo engineer, none of this is relevant and probably doesn't make sense (except Codespaces, which is pretty sweet in any case), but for orgs using the GH stack it is a huge, huge value add, because Microsoft is going to have a better understanding of enterprise controls.
If you want to understand the value add of Copilot, I think you need to spend a bit of time digging into the enterprise account featureset in GH, try Codespaces, try Copilot cloud agents. Then it clicks.
> I don't understand why anyone would want to deal with Microsoft as a vendor if they don't have to.
This is about personal plans. Github Copilot is half the price of any competition I found.
It's just a decent deal for light users.
Over the past month, I started a GHCP ~$12 Pro sub, and found I hit my quota about half way through March or so (but I also wasn't being very...frugal). So I signed up for Claude (~$20 Pro) for a month, and I liked it at first, but the 5 hour window was very annoying, and I hit it quite a bit. The first ~week of April was nice though, and I could use Claude to the limit, and then switch to GHCP. I've sym-linked my instructions so it was more or less easy switching back and forth when I hit a limit.
However, Claude changed their limits so I got to 100% very easily, and when I did hit 100%, I wasn't given a window to snapshot my work into something another agent (either a future Claude window or a GHCP agent) could easily pick up mid-work.
I found the lack of visibility into what costs what was very annoying. For $20/month, you get an arbitrary amount of usage that they were changing without notice or alerts or visuals. I didn't renew CC after it expired and just kept with GHCP.
Even with this announcement from GHCP, I haven't run into a limit. I'm considering upgrading to Pro+ if I don't see a limit.
But I stick with Sonnet more or less in both environments. I only used Opus for a couple of planning sessions at the very beginning, but JIT planning is handled well enough by the more mid-tier models.
I spoke too soon. I just got told I need to wait 100 hours without warning. Crazy. Same problem I had with claude code.
Time to consider buying hardware, I guess
The value-add that Microsoft brings is checking the boxes that you want checked.
If you need some random Egyptian government compliance certification for your vendors or whatever, Microsoft probably has that, Anthropic probably doesn't. Microsoft's (as well as Oracle's) entire deal these days is figuring out what customers care about compliance-wise, and structuring their offerings to deliver exactly that. Whether they're selling their own products, or re-selling somebody else who doesn't have that kind of global footprint and clout, is secondary at best.
I'm fine with it seeing as I can use my student email and get free usage
access to all of the latest and greatest models for half the price of a single company's basic plan is (or rather, was) a very compelling option
Even in solo development, the value add of developer experience benefits of Github Copilot's integration into Github itself (kick off agent work from your phone on the GitHub site), VS Code, and other tools is quite high.
For instance: Anthropic's Claude Code, despite being primarily written in Python, didn't even properly support Windows until far too recently (suggesting people use WSL instead), and even now it's not a great Windows experience (requiring Git Bash, and revealing, among other things, that Claude's models themselves haven't trained enough on PowerShell; I still try to avoid Claude's models when working on PowerShell scripts, personally).
Meanwhile VS Code works everywhere I want it to, out of the box, and VS Code's GitHub Copilot integration does the same.
Also, your "near zero" value add includes engineers at Microsoft/GitHub following the "which is the smartest/most practical model" meta-game for you and silently updating defaults in Copilot, without you needing to make conscious choices. Sure, you can follow that meta yourself by watching HN every day, sampling hundreds or thousands of opinions across dozens to hundreds of stories, and then playing a "Netflix subscription game" of switching subscriptions every X months when the meta shifts. Or you can pay Microsoft to do all that research for you. (True, that also includes their professional business relationships/contracts with OpenAI and Anthropic, which is as much a feature as a bug in my opinion, because it's also a signal amid the opinion-war noise of choosing the "smartest model [for you, right now]", and it does show up as its own HN stories for meta-debate.) At least to me that's much more than a 0% or 1% value add. But maybe that's also because I don't trust either Anthropic or OpenAI directly, I don't entirely trust HN's comments as a strong guide to the meta-game, it's not a meta-game I want to play, and I'm happy to pay someone else to play it for me.
I have a GitHub Pro subscription, renewed for the 2nd year, and I just found out I can no longer use Opus with it. Opus was one of the reasons I had a subscription in the first place.
Opus 4.6 had a 3x multiplier in Pro. Now the new Opus 4.7 model has 7.5x in Pro+, which offers 5x more requests, but costs 4x more than Pro. So now Opus is essentially 2x the price it used to be.
It’s likely that Sonnet 4.7 will be the new 3x model in Pro — https://github.blog/news-insights/company-news/changes-to-gi...
This whole thing is a massive asshole move, and probably illegal in all countries with a minimum set of consumer protections.
Reading the comments here drives home an industry wide problem with these tools: people are just using the latest and most expensive models because they can, and because they’re cargo-culting. This is perhaps the first time that software has had this kind of problem, and coders are not exactly demonstrating great discretionary decision making.
I’ve been using Anthropic models exclusively for the last month on a large, realistic codebase, and I can count the number of times I needed to use Opus on one hand. Most of the time, Haiku is fine. About 10% of the time I splurge for Sonnet, and honestly, even some of those are unnecessary.
Folks are complaining because they lost unlimited access to a Ferrari, when a bicycle is fine for 95% of trips.
> Most of the time, Haiku is fine.
Haiku is most definitely not fine for the code bases that I work on. Sonnet is probably fine for most daily tasks, but Opus is still needed to find that pesky bug you've been chasing, or to thoroughly review your PR.
> Haiku is most definitely not fine for the code bases that I work on. Sonnet is probably fine for most daily tasks, but Opus is still needed to find that pesky bug you've been chasing, or to thoroughly review your PR.
Yeah, I hear that a lot, but it never comes with proof. Everyone is special.
I’m sure you’d find that Haiku is pretty functional if there were a constraint on your use.
I don't think it's really helpful to tell people they're holding it wrong, especially when you hear the problem a lot.
Maybe, just maybe, the tool isn't suitable for all problem spaces.
> I don't think it's really helpful to tell people they're holding it wrong
I’m not saying that. If anything, it really doesn’t matter much what model you use, and it’s only a case of “you’re holding it wrong” in the sense that you have to use your brain to write code, and that if you outsource your thinking to a machine, that’s the fundamental mistake.
In other words, it’s a tool, not a magic wand. So yeah, you do have to understand how to use it, but in a fairly deterministic way, not in a mysterious woo-woo way.
“Everyone is special” is a snarky, derogatory comment we don’t need here.
It’s not snarky. It’s literally the argument people are making: I am special, my use case is exceptional, therefore I need to use the special tool, even if you don’t need to.
I use models from Opus through Haiku and down into Qwen locally hosted models.
I don't know how anyone could believe that Haiku is useful for most engineering tasks. I often try to have it take on small tasks in the codebase with well defined boundaries to try to conserve my plan limits, but half the time I end up disappointed and feeling like I wasted more time than I should have.
The differences between the models are vast. I'm not even sure how you could conclude that Haiku is usable for most work, unless you have a very different type of workload than what I work on.
More information required. What are you working on? What languages? How do you define “small tasks”? What are “well-defined boundaries”? What is your workflow?
Most importantly, define your acceptance criteria. What do you mean by “disappointed” - this word is doing most of the heavy lifting in your anecdote. (i.e. I know plenty of coders who are “disappointed” by any code that they didn’t personally write, and become reflexively snobby about LLM code quality. Not saying that’s you, but I can’t rule it out, either.)
The models are not the same, but Haiku is definitely not useless, and without a lot more detail, I just ignore anecdotal statements with this sort of hyperbole. Just to illustrate the larger point, I find something wrong with nearly everything Haiku writes, but then again, I don’t expect perfection. I’d probably get a “better” end result for most individual runs with the more expensive models, but at vastly higher cost that doesn’t justify the difference.
I use Haiku frequently, and for my codebase it is working fine.
But I'm not vibecoding, I don't let models do large work or refactorings, this is just for some small boring tasks I don't want to do.
>> Yeah, I hear that a lot, but it never comes with proof. Everyone is special.
You were the one who made the claim that Haiku is fine most of the time. To any reasonable person, the burden of proof is on you. Maybe you should share some high level details about your codebase, like its stack, size, problem domain, and so on? Maybe they are so generic that Haiku indeed does fine for you.
I think Haiku is fine (e.g.) for any task that you could almost, but not quite, complete with (regex?) find and replace.
You give it 3 examples of the change you want, then ask it to do the other 87. You'll end up saving time and “money”.
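A (hypothetical) concrete shape for that kind of prompt, wrapped in Python for clarity; the logging transformation and file name are made up:

    # Few-shot mechanical edit: show three worked examples, let a cheap model do the rest.
    EXAMPLES = """
    log.warn("save failed: " + err)   ->  logger.warning("save failed: %s", err)
    log.warn("load failed: " + path)  ->  logger.warning("load failed: %s", path)
    log.error("bad cfg: " + repr(c))  ->  logger.error("bad cfg: %r", c)
    """
    source = open("app.py").read()  # the file with the other ~87 call sites
    prompt = (
        "Apply the same transformation as in these examples to every remaining "
        "matching call site, and output the full updated file.\n"
        f"Examples:\n{EXAMPLES}\nFile:\n{source}"
    )
    # send `prompt` to whichever cheap model you're metering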
Most of the people using these models aren't skilled enough to make that determination. It seems rough to sell these tools as the thing that means you don't need to understand what you're doing, while also insisting that you understand what you're doing well enough to select an appropriate model.
> I’ve been using Anthropic models exclusively for the last month on a large, realistic codebase, and I can count the number of times I needed to use Opus on one hand. Most of the time, Haiku is fine. About 10% of the time I splurge for Sonnet, and honestly, even some of those are unnecessary.
I mean at some point some people learn...
I was doing Opus for nasty stuff or otherwise at most planning and then using Sonnet to execute.
Buuuuut I'm dealing with a lot of nonstandard use cases and/or sloppy codebases.
Also, at work, Haiku isn't an enabled model.
But also, if I or my employer are paying for premium requests, then they should be served appropriately.
As it stands this announcement smells of "We know our pricing was predatory and here is the rug pull."
My other, lesser worry isn't that Opus 4.7 has a 7.5x multiplier; it's that the multiplier is quoted as an 'introductory' rate.
Haiku is complete crap compared to Sonnet in GHCP. A basic task takes 3 prompts in Haiku, with a lot of correction, versus 1 prompt in Sonnet. It isn't worth a third of the price if I have to fix it twice.
I think it heavily depends on how you're using it. If you understand your codebase and you're using it like "build a function that does x in y file" then smaller/cheaper models are great. But if you're saying "hey build this relatively complex feature following the 30,000 foot view spec in this markdown doc" then Haiku doesn't work (unless your "complex feature" is just an api endpoint and some UI that consumes it).
I largely agree. But that goes back to my point (albeit with mixed metaphors): there are lots of people who are just hitting things with a jackhammer in lieu of understanding how to properly use a hammer.
I basically never just yolo large code changes, and use my taste and experience to guide the tools along. For this, Haiku is perfectly fine in nearly all circumstances.
The AI should decide the level of model needed, and fall back if it fails. It is mostly a UX problem: why do I need to specify the level of model beforehand? Many problems can't be judged before implementation.
That’s certainly an opinion. Not one I agree with, but sure, if you entirely outsource all of your thinking to the magic box, then you probably want the box to have the strongest possible magic.
This is the approach of Auto in Cursor, and I've not been impressed with it at all. I think I'm always getting Composer, and while it's fast, it wastes my time. GLM 5.1 in OpenCode is far better and less expensive; it can do planning and implementation both very effectively. Opus is still the best, but GPT 5.4 (in Codex) is good enough too, and way more affordable.
This would require LLMs being good at knowing when they are doing a bad job, which they are still terrible at. With a good testing and verification harness set up, sure, then it could just go to a more powerful model if it can't make tests pass. But not a lot of usage is like this.
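Something like this is the minimal version of that loop; the model names and the run_agent() wrapper are made up, with a test suite standing in as the verifier:

    import subprocess

    LADDER = ["cheap-small", "mid-tier", "frontier"]  # hypothetical model names, cheapest first

    def tests_pass() -> bool:
        # Only trust an attempt if the suite is green.
        return subprocess.run(["pytest", "-q"]).returncode == 0

    def solve(task: str, run_agent) -> str:
        for model in LADDER:
            run_agent(task, model=model)  # delegate the edit to your agent harness
            if tests_pass():
                return model  # cheapest model that satisfied the verifier
        raise RuntimeError("every rung of the ladder failed the tests")

Without that kind of external check, the model grading its own work is the weak link.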
At the current cost, I just use the best model all the time. Why wouldn't I?
Because judging failure is itself a complex task requiring a potentially expensive model.
Model selection for day to day tasks based on vibes is not very scientific. Micromanaging the model doesn't seem like a great idea when doing real professional work with professional goals/deadlines/pressures.
It’s deeply ironic that the folks who want to outsource as much thought to the model as possible are saying that my stance - use your brain to decide the right tool for the job - is tantamount to “vibes”.
You are being deeply reductive and that's against the spirit of hacker news. The issue is that models are difficult to objectively benchmark. The benchmarks don't always align with real world performance. It's not easy and clear cut to determine which model will work best in a given situation. It boils down to loose experiences/anecdotes. Do you have an objective criteria for model selection that you have tested to be effective with reproducible tests?
> Micromanaging the model doesn't seem like a great idea when doing real professional work with professional goals/deadlines/pressures.
Remember that it's not only the cost per token, but also speed. Some tasks are done faster with simpler/less-thinking models, so it might actually make sense to micromanage the model when you have deadlines.
If you're using the models to generate 99%-100% of the code, then it doesn't make sense to plug yourself into the loop as a bottleneck.
It is not that simple; companies retire old models. I wanted to use 5.1 Codex Max to save money and I could not on my subscription.
> people are just using the latest and most expensive models because they can, and because they’re cargo-culting. This is perhaps the first time that software has had this kind of problem, and coders are not exactly demonstrating great discretionary decision making.
> I’ve been using Anthropic models exclusively for the last month on a large, realistic codebase, and I can count the number of times I needed to use Opus on one hand. Most of the time, Haiku is fine. About 10% of the time I splurge for Sonnet, and honestly, even some of those are unnecessary.
You and I couldn't have more different experiences. Opus 4.7 on the max setting still gets lost and chokes on a lot of my tasks.
I switch to Sonnet for simpler tasks like refactoring where I can lay out all of the expectations in detail, but even with Opus 4.7 I can often go through my entire 5-hour credit limit just trying to get it to converge on a reasonable plan. This is in a medium size codebase.
For the people putting together simple web apps using Sonnet with a mix of Haiku might be fine, but we have a long way to go with LLMs before even the SOTA models are trustworthy for complex tasks.
I don’t use Haiku for planning of big tasks, so we basically agree on that. But even just Sonnet 4.6, on a fairly large codebase, only truly goes into the weeds maybe 10% of the time for me. I also write pretty specific initial prompts, and have a good idea of how I want the code to work before I start prompting. For example, sometimes I will spend several hours writing a spec before even picking up the power tools.
I have never had the situation you describe, where Opus won’t come up with “a reasonable plan”, but your definition of “reasonable” might be very different than mine, and of course, running through your credit limit is an entirely tangential problem.
I think the reason is twofold:
- If you pay for unlimited trips will you choose the Ferrari or the old VW? Both are waiting outside your door, ready to go.
- Providers that let you choose models don't really price much difference between lower class models. On my grandfathered Cursor plan I pay 1x request to use Composer 2 or 2x request to use Opus 4.6. Until the price is more differentiated so people can say "ok yes Opus is smarter, but paying 10x more when Haiku would do the same isn't worth it" it won't happen.
Agreed on both points. We’re dealing with a cost/benefit analysis, and to this point, coders have been subsidized, coerced…maybe even mandated into using the most expensive option as if it was a limitless resource. Clearly not true, and so of course we’re going to see nerfing of the tools over time.
Obviously we’re a long way away from being able to rationally evaluate whether the value of X tokens in model Y is better than model Z, let alone better in terms of developer cost, but that’s kind of where we need to get to, otherwise the model providers are selling magic beans rated in ineffable units of magicalness. The only rational behavior in such a world is to gorge yourself.
85% of my coding tasks can be handled by either GLM or Sonnet. The truth of the matter is that most software isn't that complicated. Even more hilarious is that people were running Opus on their OpenClaw setups. I'm glad Anthropic kicked them to the curb.
Claude Code doesn't have an option to use Opus 4.6 any more for me. It was great, but I guess now I have to use it half as much or upgrade my subscription again.
>people are just using the latest and most expensive models because they can,
While I agree with the sentiment, I think that might have initially been driven by older models being nerfed and/or newer ones being better at tokens/$. And there is this notion that those labs don't constrain a model in the first days after its release.
> coders are not exactly demonstrating great discretionary decision making.
From a business perspective, why would I start thinking about which model to use, when I could cheaply always use the best model?
Of course you don't NEED the better models, but figuring out what model you need can waste a lot of time and effort. Even when a cheap model is capable of a task it needs a lot more guidance than a more expensive one. They are also less reliable. You can waste a lot of time cleaning up after them. Judging whether something is good enough is hard work and rerolling with a more expensive model is painful. Judging the difficulty of a task ahead of time is very hard. Judging how good a model is for a given task even harder, especially when models and harnesses keep changing all the time. The real productivity boost LLMs provide is already modest and when you start tinkering with models it can easily evaporate.
> This whole thing is a massive asshole move, probably illegal in all countries with a minimum set of consumer protections
Why would it be illegal in any country? Did you pay for a year upfront? Even if so, they're offering a pro-rated refund according to the linked blog post:
> If you hit unexpected limits or these changes just don’t work for you, you can cancel your Pro or Pro+ subscription and receive a refund for the time remaining on your current subscription by visiting your Billing settings before May 20
Not sure where the expectation came from that a business should continue serving you at a given price till the end of time, no matter what.
I'm in the same boat as you. Wish I had known this before my subscription renewed. There's no longer any value in paying them for this service when I can cut them out of the equation and pay the model providers directly.
They are for large corps with bureaucracy; big spend on Anthropic is difficult to get approved, but Microsoft services get greenlit instantly.
If you cancel you get a prorated refund
Think someone got the bill and worked out their burn rate and pushed the big stop button.
Remember when you are renting other peoples computers they can and will change the terms for their benefit. They own it. You dont. You rent it.
It's even worse:
- The multiplier got 2.5x-ed (from 3 to 7.5).
- The minimum plan with Opus access is 4x costlier.
That's a 10x total price increase for having access to Opus at all.
But yes, if you account for the 5x more requests, then it's 2x – not relevant though if you're like me and wouldn't usually max out the quota.
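Worked through with the plan numbers floating around this thread ($10 Pro with 300 premium requests, $40 Pro+ with 1500; treat all of them as illustrative):

    pro_price, pro_requests = 10, 300      # old Pro plan (as reported in-thread)
    plus_price, plus_requests = 40, 1500   # Pro+: 4x the price, 5x the requests

    old_per_opus = pro_price / (pro_requests / 3)      # $ per Opus prompt at a 3x multiplier
    new_per_opus = plus_price / (plus_requests / 7.5)  # $ per Opus prompt at a 7.5x multiplier
    print(old_per_opus, new_per_opus, new_per_opus / old_per_opus)  # 0.1 0.2 2.0

So per Opus prompt it's 2x, but only if you actually max out the bigger plan.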
This thread is pretty quiet for what strikes me as a substantial set of changes with, presumably, more substantial changes still to come for anyone not grandfathered into a Pro plan.
I get the impression that the intersection of HN posters and Copilot users is quite small in practice; that Claude Code and Codex suck up all the oxygen in this room. But it seems plausible we’ll see similar “true costs greatly exceed our current subscription pricing” from Anthropic and OpenAI someday soon…
The UX of Copilot driving Claude beats Claude Code handily.
I never understood the low visibility.
Expensive ram is annoying. I don't look forward to expensive ai.
> more substantial changes still to come for anyone not grandfathered into a Pro plan
The change applies to existing subscriptions, some paid a year in advance.
You can get a refund, from the article:
> If you hit unexpected limits or these changes just don’t work for you, you can cancel your Pro or Pro+ subscription and you will not be charged for April usage. Please reach out to GitHub support between April 20 and May 20 for a refund.
>and you will not be charged for April usage
They removed this now without notice but Wayback Machine still has it: https://web.archive.org/web/20260420190656/https://github.bl...
They now tacked on an Editor's note to the blog post.
Indeed!
I just found out via other news sources, and was surprised I hadn't seen it on HN already.
Using Copilot Pro with Pi is way better and smarter than using Claude Code. I haven't gotten a single e-mail; I just wanted to use Opus (I use Sonnet 95% of the time, with Opus for issues where Sonnet is struggling) and got an error message. No prior warning, nothing; I'm pissed. They just rugpulled all paying customers, man. I liked Copilot because I can plan my usage over a whole month and I'm not "forced" to burn through it in a week before hitting limits, unlike Claude and Codex.
Anthropic literally just removed Claude Code from their Pro plan today, so you're even more right than you know.
Do you have a citation on this? I have a Claude Pro subscription and looked at the comparison page and it says this under Pro: Everything in Free and: Claude Code directly in your codebase Power through tasks with Cowork Higher usage limits Deep research and analysis Memory that carries across conversations
Go to the pricing page: https://claude.com/pricing
As of right now, it says Pro includes Code and Cowork. (At least, for me. There could always be A/B testing going on.)
There is A/B testing going on and for a while several pages on Anthropic's site did remove Code from pro (https://old.reddit.com/r/ClaudeAI/comments/1srzhd7/psa_claud...) if you want a lot more details.
It was recently discussed in https://news.ycombinator.com/item?id=47855565
Speaking as someone where the only 'real' option we have at work is the Copilot plugin, but I also use the Copilot plugin at home...
This is a shitty shitty shitty move.
As a personal user, I can now only use Opus 4.7 at a 7.5x 'introductory' multiplier if I upgrade to Pro+, but at work I can still apparently use Opus 4.6 at a 3x multiplier on my work 'enterprise' account.
Honestly it strikes me as though someone at Github Copilot took Palantir's manifesto to heart; Screw the individual, consolidate power to companies on every level.
> But it seems plausible we’ll see similar "true costs greatly exceed our current subscription pricing" from Anthropic and OpenAI someday soon
Enterprise might stick around, but individually, I reckon developers will flock to OpenCode + open weights (Qwen/GLM/Codestral). The problem then is, if the open-weight models impress these new adopters, they will shout about it from the rooftops (conferences, social media, blogs) in unison, which might result in an exodus. Especially troublesome considering developers are a major market for both frontier labs (Anthropic & OpenAI) and their IPO ambitions.
Yesterday, Opus 4.6 cost three credits. You can no longer use 4.6 or 4.5.
Opus 4.7 is available today for 7.5 credits per prompt.
They have also suspended new signups.
After testing all of the major IDEs/tools that integrate with LLMs over the last four weeks, I was happy to settle on Copilot. I, and others, seem to be a lot less confident in that decision now. Especially since there seems to be no refund path for people who prepaid for a year.
In my 30+ years online, I've never seen an industry change so much in terms of pricing, service levels, etc, as I have the last two months.
I'm really curious where all of this lands, and if AI coding tools will be something that only a small percentage can genuinely afford at a competitive level.
> In my 30+ years online, I've never seen an industry change so much in terms of pricing, service levels, etc, as I have the last two months.
Warning: baseless speculation/theorizing ahead.
This is the consequence of LLM inference being really expensive to run, and LLM inference companies being really attractive to VCs. The VC silly money means their costs are totally decoupled from revenue for a while, but I guess eventually people look at incomings vs outgoings and start asking questions.
Previous big trends like SaaS apps, NFTs, blockchain etc were similarly attractive to VCs (for a period of time at least for the last two, the first one is still pretty attractive to VCs), but nowhere near as expensive to run so the behaviour of the companies running them wasn't quite the same.
AI is still in the "VCs subsidizing everything" -phase.
So:
- DO use AIs to build tools for yourself faster. If the AI goes away, the dashboard and scripts you made will still work.
- DO NOT build your business on top of 3rd-party AI services with no way of swapping the backend easily. The question isn't whether there's going to be a "rug-pull", but when it happens. It might be sudden like this one, or gradual, where they just pump up the price like boiling a frog. (A minimal shape for keeping the backend swappable is sketched below.)
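One cheap way to keep that escape hatch open: most hosted providers and local servers speak the OpenAI-compatible wire protocol, so if the base URL and model name live in config, swapping vendors is a config change rather than a rewrite. A minimal sketch (the environment variable names are made up):

    import os
    from openai import OpenAI

    # Point at any OpenAI-compatible backend: a hosted provider, a proxy, or a local server.
    client = OpenAI(
        base_url=os.environ.get("LLM_BASE_URL", "https://api.openai.com/v1"),
        api_key=os.environ["LLM_API_KEY"],
    )

    resp = client.chat.completions.create(
        model=os.environ.get("LLM_MODEL", "gpt-4o-mini"),
        messages=[{"role": "user", "content": "ping"}],
    )
    print(resp.choices[0].message.content)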
>it’s now common for a handful of requests to incur costs that exceed the plan price!
I think this is really telling. The cost of AI has really been masked HUGELY to drive adoption. The true cost is likely to be unsustainable for the big complex tasks (agents running for hours+) that companies have been pushing.
I was skeptical, then quietly bullish on AI, but I'm now seeing signs the market is cracking and that availability is going to recede and costs balloon.
Copilot is pretty unique in that they were only measuring requests, and that model is just broken for agentic systems.
CC wasn't per-token either? Nor is Codex?
From my simple checks - and from Microsoft's own blog - per token pricing isn't going to be realistic for agentic coding either.
Claude Code is definitely token-based; it's been discussed extensively on Hacker News and in the related GitHub threads. A large context-cache miss can easily take half your usage in just one request... "max" just means more reasoning tokens. I've also run out of usage during a single request in Cowork. It's definitely token-based.
They don't show your usage in tokens for Claude Code and Codex subscriptions, but that is how they are doing the accounting.
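Illustrative arithmetic on why a single request can spike (the prices are made up, but shaped like published prompt-caching rates, where cache reads cost roughly a tenth of fresh input):

    # Hypothetical: 150k-token context, $15/Mtok fresh input vs $1.50/Mtok cached reads.
    ctx = 150_000
    fresh, cached = 15.00, 1.50  # $ per million input tokens (illustrative)

    hit = ctx / 1e6 * cached   # context served from cache
    miss = ctx / 1e6 * fresh   # cache expired: the whole context is re-billed
    print(f"hit ${hit:.2f} vs miss ${miss:.2f} ({miss / hit:.0f}x)")  # hit $0.23 vs miss $2.25 (10x)

A few of those misses in a long session and the subscription's token budget is gone.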
Opus was the reason I paid for GitHub copilot, but they had the pricing model completely wrong. I could assign copilot to a substantial issue using Opus and have it handle 30 minutes of work with many subagents with iterative testing. With 300 "premium requests" a month I could have copilot do substantial work for 3 premium requests per issue. It was very clear that this was unsustainable for Microsoft to pay for, so I expected change to come.
However, I never expected Opus 4.6 to be removed from the cheaper plans. I expected the pricing model to change, but not to lose access to the model. Moving to token-based pricing makes sense; it aligns what users pay more closely with the actual cost.
It was nice while it lasted. I got Opus 4.5 to do a lot of work from the beach by assigning it to detailed issues. With this news I've cancelled my Pro subscription. That will help a bit with their capacity issues.
I have been needing to cut back on my subscription services, too. I also canceled.
> Opus was the reason I paid for GitHub copilot, but they had the pricing model completely wrong. I could assign copilot to a substantial issue using Opus and have it handle 30 minutes of work with many subagents with iterative testing. With 300 "premium requests" a month I could have copilot do substantial work for 3 premium requests per issue. It was very clear that this was unsustainable for Microsoft to pay for, so I expected change to come.
Yes, I was hearing that a lot over the past few weeks. It's pretty clear what happened:
Anthropic demand soared from OpenClaw, but they were already over-sold. Cowork shipped, Hegseth flexed and a lot of people and entire orgs moved from ChatGPT to Claude in the space of a month. They couldn't handle the demand - they quantized their models, dropped effective usage limits and made all kinds of tweaks in Claude Code to reduce token burn.
Lots of customers fled to Codex, and they got crunched as well. Some people noticed Copilot was still selling dollars for nickels and mentioned it to other people.
The only question I care about: is z.ai next? GLM 5.1 is simply where it's at right now. It is not the best model, but it is much better than Sonnet at 1/5 the cost.
This is quite the rug pull.
I've been using the Pro+ with Opus 4.6 very successfully and being charged 3x rate was mostly acceptable.
But removing Opus 4.6 and replacing it with Opus 4.7 at a 7.5x rate is just insane!
Note that the 7.5x multiplier is only for the promotional period (until end of April), then it'll get even worse. If I had to guess it'll be priced at 10x.
I got in there quick yesterday with a refund once they pulled 4.6 and got my last month of about 1200 premium prompts free, nice.
If they don’t give refunds I can just do a charge back. It’s not what I paid for.
They will also change the business and enterprise plans to token based: https://www.wheresyoured.at/news-microsoft-to-shift-github-c...
That was the best thing about Copilot. It was too good to last.
Good thing I had just finished migrating all of my workflows to OpenCode for the time being!
It's a shame because the VsCode copilot experience is quite good out of the box compared to all of the other harnesses I've used. But with typical lack of transparency, and sudden, harsh changes... What are they thinking?
After the restrictive rate limiting they've already instituted, I'm simply cancelling and continuing by using providers directly.
Which providers do you use? I find Copilot's prices pretty hard to beat, but if there's something better I'll cancel too.
I've been happy with GLM 5.1
Which service provides this one for a decent cost?
Someone else in another thread mentioned OpenCode, which looks neat at £10/mo: https://opencode.ai/go
Ah good catch. I like their CLI too. Though sometimes it feels like they take security a bit too loosely
They don't restrict you to their OpenCode agent; you can use go in any other agent.
Without access to Opus, it seems to me that the limited context size you get isn't worth the subsidy. Especially once you blow through the included requests, the cost seems hard to predict, because it's 'request'-based, not token-based.
I just opted for ChatGPT Pro; I'm trying to take advantage of the increased usage limits they are offering, and from what I heard, Claude Pro was also having bad rate limiting for many people.
Great. I, a small consultancy, have just spent the last month working out a workload that uses Opus 4.6 via VS Code to prep horrible, inconsistent, survey data for upload to a proprietary platform. Worked a treat with some light babysitting.
It's the sort of messy job that agents excel at. Decisions need to be made on free text data, translations done into multiple languages, ambiguity handled.
I now need to recheck it still works with another model, which involves a lot of manual verification; and potentially move to Claude Code and pay more money I can ill afford right now.
I'm not even clear from the post when this comes in; I'm guessing it's effective immediately.
This really hammers home for me the point that we should not be renting our tools.
My own dumb fault for trusting them, I will make sure to learn from this.
So this is pretty devastating to my general workflows [1] right now, and poorly timed to boot, with no wind-down at all.
It was clear (see the linked post from 70 days ago) that the current offering was unsustainable, but I'm a bit taken aback at how sharp the clawback is.
[1] https://news.ycombinator.com/item?id=46938246
Yes, Github's per-request pricing was insane; anyone suggesting using CC instead or asking if any other provider is as cheap just doesn't understand the insanity. Clearly losing a lot of money on the people making good use of it.
I was actually hoping they would change it to something that more closely tracks their actual costs, so that they wouldn't have to rug-pull this badly. What was particularly bad about it was that sending prompts to agents while they were working (to give them corrections) cost extra, so I stopped doing that (initially OpenCode didn't trigger billing for that, until they became official).
I guess it makes more sense for me to just get Claude Pro instead. I was using my Copilot license only because of Opus 4.6 access, as all other models seemed crippled in comparison in Copilot. It does not even make sense to upgrade to Pro+, which goes from $10/mo to $40/mo and only gives you access to a model at 7.5x the rate; 5x the limit at 7.5x the rate for 4x the price does not seem appealing at all.
Claude Pro no longer includes Claude Code
where have you read this? According to https://claude.com/pricing/pro it is included.
https://bsky.app/profile/edzitron.com/post/3mjzxwfx3qs2a
According to https://claude.com/pricing it no longer is. They’ve explicitly updated that page to exclude it (it was on there before)
It's showing Claude Code as included in Pro now on that page.
Apparently if you believe them it was a “test”
A test to see if they could get away with it. I think we're really in the thick of token rationing right now and the fallout is going to be funny to watch.
I wouldn't mind this change that much if opus-4.7 worked properly in copilot cli. It keeps stopping mid-thought or task and forces me to waste more prompts for no observable reason.
Looks like I'm ending my subscription, good (likely too good, no way my account was even remotely within profitable range) access to opus-4.6 was the only reason I used this at all.
Are you using it through regular Copilot (the 'local' agent type), or through the separate Claude agent type (which I believe you have to activate in your repository settings on GitHub)?
I had the exact same issues with the latter - randomly stops working, wipes chat history, just generally seems to be totally broken. But the former works totally fine and still lets you select sonnet/opus. My experience was before this recent 4.6 -> 4.7 change though.
Regular local agent. Seems like as soon as the context fills up (and it only has about 160k of context so that doesn't take much) it starts to fall to pieces. I even tried using opencode as a harness instead and it causes opus 4.7 to lose all memory every time I hit a compaction step.
Welp. I already added a $20 Claude Pro subscription to complement my $10 Github Copilot Pro subscription and $10 DuckDuckGo Plus. That was partly to show support for Anthropic after the OpenAI/DOD episode, but also because I've been using Opus 4.5 exclusively with Copilot and I figured I should try Claude Code eventually.
Now it's going to cost me an upgrade to $39 Github Pro+ to keep using Opus, and even then it's with much higher multipliers. I don't fully understand the extent to which this reflects actual costs for Opus versus Microsoft leveraging network effects to discourage the usage of a competitor.
I didn't really want to wander outside of VSCode just yet because I was happy with VSCode/Copilot/Opus-4.5 and I don't want to spend all my time experimenting when stuff is changing so fast. But I guess my hand has been forced.
> I didn't really want to wander outside of VSCode just yet because I was happy with VSCode/Copilot/Opus-4.5
This was my first thought too but apparently you can just use Claude Code within VSC: https://code.claude.com/docs/en/vs-code
I've started messing with this and the experience seems pretty similar.
You can also use Claude Code in a VS Code terminal window, which I much prefer for reasons I can’t quite put my finger on. Granted, I’ve moved to Zed in the past few months. I’m doing the same there.
>$39 Github Pro+ to keep using Opus,
For what it's worth, I have been paying for Pro+ and I still got locked out of Opus. I only have access to Opus 4.7 at 7.5x.
All of the evidence shows that Anthropic has been dealing with a capacity crisis since February.
There are no good solutions for them.
If OpenAI is indeed overbuilt they will completely eliminate Claude.
I saw some Reddit rumours going around and locked myself into the yearly Pro+
I guess overall it probably was a good decision.
But 7.5x as well as quota limits is pretty hard to swallow.
The annoying thing about the quota limits is they make it really awkward to actually fully utilize the 1500 premium requests you are paying for.
Like, if you don’t plan your work around the daily and weekly quotas, you may not actually be able to use your full request allocation.
Claude has the same issue. Single session blows through the quota.
Yeah I'm a bit confused by the double quota/rate limit situation
I cannot describe how disappointing it is to switch to this insane time-window-based pricing. I absolutely abhor that I'll be subjected to 5-hour chunks of time where I'll be limited at some point in that window and be told I'll have to wait. And then there is a weekly limit.
That's not how my creative energy works. I have time that I want to solve problems, and I want to solve them. I don't want a cooldown timer applied to solving a problem. Not to mention the anxiety of realizing that while I sleep I could have burned tokens in that time.
I'm incredibly disappointed when I sat down to my hobbyist programming time and realized copilot was suddenly and dramatically changed in a way that is incredibly disheartening.
Meter my token usage, DON'T tell me when I can use them! ARGH.
Agreed. WTF is the point of offering a certain # of messages on each plan tier if you're then rate limited and can't even make full use of them?
> I'm incredibly disappointed when I sat down to my hobbyist programming time and realized copilot was suddenly and dramatically changed in a way that is incredibly disheartening.
Guess it’s time to rediscover the lost art of programming without an LLM.
> Meter my token usage, DON'T tell me when I can use them! ARGH.
I think GitHub will do that eventually, just like everyone else; TFA's ending points the same way.
I cannot understand people still using Anthropic models on Copilot when GPT 5.4 is better and 3 to 7 times cheaper. Anthropic quite obviously raised their licensing to the max. You can probably still get a taste of it for a few minutes before being limited on their own subscription.
Simple, for what I'm doing Opus 4.6 (and before that, Opus 4.5) are just much better at following my instructions and achieve consistently better results.
From what I've been gathering, this split in success seems to depend a lot on the types of tasks, the domains / programming languages / frameworks used, and style of prompting.
I couldn't get 5.2 to follow instructions for the life of me, even when repeating multiple times to do / not do something. 5.3-codex was an improvement and 5.4 while _usually_ decent still regularly forgets, goes on unnecessary tangents, or otherwise repeatedly stops just to ask for continuation.
Sure, I'm paying 3x more per request, but I'm also doing 5x fewer requests.
Or well, used to. Still bummed about them dropping 4.6.
My experience is similar. Opus, especially Opus 4.5, understands my intentions better even when poorly phrased, and more consistently follows my instructions to do only what's necessary and no more.
As far as I can tell, the distinctive feature of my workflow is that I'm giving it small, contained single-commit-sized tasks and limited context. For instance: "For all controller `output()` functions under `Controller/Edit/` and `Controller/Report/`, ensure that they check `Auth::userCanManage`." Others seem to be taking bigger swings.
Anecdotally, I experimented with GPT-5.4 xhigh, and something about the code it wrote just didn't vibe with me.
It felt like I constantly had to go back and either fix things or I just didn't like the results; the forward momentum/progress on my projects overall wasn't there over time. Even though it's cheaper, it just doesn't feel worth it, to the point where I start to feel negative emotions.
I'm actually a bit worried that I've somehow come to feel more negative emotions with agentic coding. I'm quicker to feel frustrated somehow when things aren't working.
Same for me. I would still be happy with my Copilot Pro subscription if I could use 5.4 with 1x coefficient (and 5.4 mini with 0.33x).
But seeing that they have stopped taking new subscriptions, and the rumours/evidence that they plan to increase the coefficients of the remaining models, it seems they want us to see "the writing on the wall".
GPT's output is awful and it gets even more awful when you try to work out a solution "together" because it shits out 10 paragraphs with 20 options instead of focusing and getting things done.
Worst part is them doing this mid-billing cycle and not at the start of the next in 11 days. I cancelled and requested a refund.
I'm not surprised at all. This was one of the most generous plans out there, offering frankly ridiculous pricing based on a single prompt regardless of turns taken or tokens used. I was subscribed for a month around Christmas and got a shitload of tokens out of Opus 4.5 for a measly $10.
context: using student pack's "pro" plan for a long time, with exposure to enterprise "pro" plan also.
given the recent changes that kneecapped the plan for students [1], i feel less bad after seeing this. i always had the monthly limit on premium requests shown in the extension (which i would watch creep up in dread); the daily/weekly "usage limits" part seems ambiguous at best.
using agentic workloads as the basis for this change does not sit quite right with me. if you look at the newly added debug mode, you may notice the token consumption as well as the subagent/tool calls made behind the scenes. my takeaways:
- it consumes way too many tokens for simple tasks (i had one use case where the agent burnt 16+ million tokens just to make a 50-line change in a monorepo using a plan -> agent approach)
- even when you select a model in the dropdown, the subagents/tools can be called with an entirely different model, often the haiku-4.5. gpt-4o is widely used for creating summaries or titles to display for the plan.
- the new reasoning modes have exacerbated the token burning as the agent tends to loop a whole lot. the prompt vs plan token ratio is quite minuscule, and when combined with your own instruction files and skills, it just goes out of the window.
i think they offered a generous model in the past, but by kneecapping the lower tier, it no longer justifies its existence. if they want to raise prices, they can raise the floor. or rather, put some work into improving their own orchestration system before putting the blame on users vibing it out.
[1] https://news.ycombinator.com/item?id=47500445
Just as I ran out of my quota for today, and was planning to upgrade to pro...
Also, I've never run into the quota limit before (I only use inline suggestions). The limits have definitely shrunk over time.
So can I just have my credit card do a charge back? This fundamentally changes the thing I paid for.
I always have the feeling GitHub is overrated, at least in combination with Claude; we need something smoother. Does it exist?
It's quite cheap at $10 for 1000 premium requests (1 request covers something like plan mode + implementation + tests + commit & push). The only problem is I have already used them all, but I was billed on the 3rd day of the month and have to wait till next month to use it again.
Damn, it was good while it lasted, but it was obvious the previous per-request pricing scheme was misaligned with their actual costs. MS's product people must be seriously detached from their technical and financial people for it to have even lasted this long (or they're willing to burn a lot of money for the typical "make customers happy and then rug pull" cycle, but hey, Hanlon's razor).
Given that they've already silently had session + weekly rate limits for the past couple weeks already at least (I've hit them), I wonder if this change is just making them visible to the user, or if it's actually tightening them too.
If it's the former then I can say they're still significantly more generous than claude pro (on the pro+ plan), so this might be okay. If it's the latter, and the new limits are similar to claude pro then copilot is going to be significantly less useful to me.
I have Copilot Pro+ and discovered today that I cannot use Opus anymore! Are we reaching the end of VC-funded productivity?
If you’re a paying customer, it’s paying customer funded, not VC funded.
That is not necessarily true.
Removing access to Opus is pretty funny. At least they recognize it’s unacceptable and tell you to go get a refund.
The per-request model was pretty insane.
So much for using my secondary Copilot plan with VSCode to hammer Opus 4.6 on a per-request basis.
The joke is on them, though (maybe) because this also means that there's literally no reason to keep that account active.
Oh nuts, I forgot I was on Copilot. I used to use it for auto-complete and so on. I haven't used it in over a year and I'm still paying for it. If you're like me you'll find it here: https://github.com/settings/billing/licensing
And you can then cancel it. I have no idea what a premium request is and it's all just too complicated to use.
> I have no idea what a premium request is and it's all just too complicated to use.
Copilot (before today) had one of the simplest & cheapest pricing models on the market.
Ladies and gentlemen, last round, free lunch is over.
Microslop, 'xcuse me, Microsoft is working hard to make GitHub less and less appealing. It's a bit weird how an initially fairly good idea becomes, over time ... worse.
Time to learn how to use the ghost text API.
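For anyone curious, a minimal sketch of what that looks like in a VS Code extension. The inline completion ("ghost text") provider API below is real; fetchSuggestion is a hypothetical stand-in for whatever model backend you wire up yourself:

```typescript
// extension.ts - a minimal "ghost text" (inline completion) provider sketch.
import * as vscode from 'vscode';

// Hypothetical helper: ask your own model/endpoint for a completion (stubbed here).
async function fetchSuggestion(prefix: string): Promise<string> {
  return ' // TODO: completion goes here';
}

export function activate(context: vscode.ExtensionContext) {
  const provider: vscode.InlineCompletionItemProvider = {
    async provideInlineCompletionItems(document, position, _ctx, token) {
      // Use the line text up to the cursor as the prompt prefix.
      const prefix = document
        .lineAt(position.line)
        .text.slice(0, position.character);
      const suggestion = await fetchSuggestion(prefix);
      if (token.isCancellationRequested || !suggestion) return [];
      // Whatever insertText you return is rendered as grey ghost text.
      return [
        new vscode.InlineCompletionItem(
          suggestion,
          new vscode.Range(position, position)
        ),
      ];
    },
  };
  context.subscriptions.push(
    vscode.languages.registerInlineCompletionItemProvider(
      { pattern: '**' },
      provider
    )
  );
}
```

The extension side really is that small; the hard part is the model behind fetchSuggestion.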
I'm considering migrating to Claude; what a shame they don't have a GitHub Copilot Pro+-like price tier.
I wonder what this will mean for all of my students. I have been weaving Copilot into my software engineering and code-centric courses, but that depends on Copilot Pro as provided by the GitHub education package. If those signups are paused too, that means I can’t bring in any new ones, at least not to this stack. For me, I’ll probably have to send them to Claude Code, where they will have to pay for access. (Though it is a better product IMHO.)
This points toward a deeper issue though. We’ll probably see more individual offerings dry up over time. That means you’ll have individuals stuck with hand coding while the hyper productive AI assisted coders will all be at large organizations. If that happens, we’ll enter a phase where computing will once more be available exclusively to the elite few.
> it’s now common for a handful of requests to incur costs that exceed the plan price
Pricing per turn/request was/is an idiotic model and I'm glad they are paying for it. It forces you into a contorted workflow just to work around the business model. Heck, the best laugh would be to create a plan outside VS Code with interactive CC/Codex, then copy-paste it into GH Copilot for a single-session burn of a few million tokens.
Again, a ridiculous model.
So far they have not changed it, and none of this applies to business and enterprise accounts. My guess is that it can still remain viable, since most businesses will have plenty of minimally used licenses with just a few power users abusing the request model.
It really seems like the era of cheap inference is coming to a head very quickly.
Demand is increasing exponentially but supply is increasing linearly. NIMBYs inventing lies about datacenters sucking up water or making noise are going to drive prices through the roof.
Noise in residential areas is already a huge problem, and data centers do in fact make it worse. They may be able to carve out exceptions in laws or push for non-enforcement, but none of that changes the impact on human health.
I have a GitHub Copilot subscription and this really sucks.
I subscribed two months ago, frustrated with Claude Code and their tight session limits.
The Copilot offer was unbeatable: $100 for a 12-month plan, if I remember correctly.
It was pretty clear they were losing money, but hey, it's Microsoft and they need customers, so a competitive push on pricing is expected.
Let's see what these limits look like and I'll decide whether to cancel my subscription or not.
Still a terrible move from them.
Wow, and I already prepaid for the year. Figures.
This is some shit, coming with 0 notice at the start of a work week. My exposure to Claude is only via Copilot which has worked very well for my purposes. I didn't have to learn a ton for it to just start working. I guess I'll look into other options now as I really want to continue using Opus, but don't have a need to 4x my spend on Copilot quite yet.
The squeeze begins.
It's quite telling: they've paused new signups because Microsoft doesn't have enough compute, and they moved Opus to being accessible only on a higher tier because Anthropic doesn't have enough compute either.
They're all operating at a loss, enshittification is coming for us all.
This is such a rug pull.
I'm a paying customer and I did not receive ANY communication about this. Was using Opus this afternoon and then it disappeared.
Microsoft really can't stop being Microsoft. I don't dispute the need to charge more for those models, but there is a basic decency in how you do things, and as usual the Big Tech fuckery and complete lack of morals make them do this in a way that generates total mistrust where there could have been mere annoyance.
I'll see how Sonnet handles the most difficult problems, but I foresee a subscription cancellation soon.
It was great while it lasted - so great, in fact, that I was able to ignore that I was paying Microslop for it. Not bummed about it; now I can be completely M$-free again.
Props to them for being transparent about it.
How is it possible to have 126 points and still only 26 karma? Is it a voting ring or bots? @dang
“Changes”…