bastawhiz 5 days ago

This isn't a good analysis, and it's because it keeps rounding everything up. He rounds up the cost of electricity by 10%. He has a range of power use, takes the high end (which is 2x the low end) and multiplies it by the inflated electricity cost.

But then they talk about using a newly purchased Mac to do the inference, running at full capacity, 24/7. Why would you do that? Apple silicon is fast but the author points out: you're only getting 10-40 tokens per second. It's not bad, but it's not meant for this!

It's comparing apples to oranges. Yeah, data centers don't pay residential electricity rates. Data centers use chips that are power efficient. Data centers use chips that aren't designed to be a Mac.

Apple silicon works out pretty good if you're not burning tokens 24/7/365 and you're not buying hardware specifically to do it. I use my Mac Studio a few times a week for things that I need it for, but I can run ollama on it over the tailnet "for free". The economics work when I'm not trying to make my Mac Studio behave like a H100 cluster with liquid cooling. Which should come as no surprise to anyone: more tokens per watt on hardware that's multi tenant with cheap electricity will pretty much always win.

datadrivenangel 5 days ago

Rounding everything down in the most optimistic setting got me to $0.40 per million tokens, and openrouter has the same model at $.38/mtok.
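
For anyone who wants to redo this kind of estimate, here is a minimal sketch of the electricity-only arithmetic. All inputs are hypothetical placeholders, not the article's figures: the wattages and prices are made up, and only the 10-40 tokens per second range is taken from the thread.

```python
def electricity_cost_per_mtok(watts: float, usd_per_kwh: float, tokens_per_second: float) -> float:
    """Electricity cost in USD to generate one million tokens (hardware cost excluded)."""
    seconds = 1_000_000 / tokens_per_second   # time needed for one million tokens
    kwh = (watts / 1000) * (seconds / 3600)   # energy drawn over that time
    return kwh * usd_per_kwh

# Hypothetical inputs: low power draw, unrounded price, fast generation
optimistic = electricity_cost_per_mtok(watts=50, usd_per_kwh=0.15, tokens_per_second=40)
# Hypothetical inputs: 2x the power, price rounded up 10%, slow generation
pessimistic = electricity_cost_per_mtok(watts=100, usd_per_kwh=0.165, tokens_per_second=10)

print(f"optimistic:  ${optimistic:.3f} per million tokens")
print(f"pessimistic: ${pessimistic:.3f} per million tokens")
print(f"spread: {pessimistic / optimistic:.1f}x")
```

The spread between the two cases is just the product of the individual factors (2 x 1.1 x 4), which is why the rounding direction matters so much.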

  • 650REDHAIR 5 days ago

    I’ll keep my data local over a $.02/mtok difference.

    • quietsegfault 5 days ago

      It’s more than just data locality. OpenRouter is faster, no? I have an M4 pro, and anything but the smallest dumbest models are unusably slow for interactive use. I personally haven’t yet found a good use case for offline/non-interactive LLM work locally.

      • datadrivenangel 5 days ago

        Yeah. The speed is the biggest issue. The intelligence of open models is good enough for serious work (though still worse than the frontier models), but the cloud models are often 3-7 times faster, and you can get more parallelization and so get speeds on the order of hundreds of tokens per second, which makes things fast!

        • freeopinion 5 days ago

          Even extremely slow LLMs can generate Part B faster than I can audit Part A. So the LLM can generate Part A while I look over my email. Then it can worry over Part B while I look over Part A.

          It can worry over Part C while I have my 10:30 group meet. And it can worry over Part D while I do whatever other silly, time-wasting thing all humans do in almost all organizations. Then I still haven't reviewed Part B, yet, so the extremely slow AI is waiting on me.

          Maybe someday I'll be good enough to need faster AI so I can rewrite something like Bun in a few days. Right now, slow and local fits my use case very well.

          • quietsegfault 5 days ago

            I don’t think it matters if you’re “good enough” or not. Much of AI development is iterative. If you context switch between A from project 1 to B from project 2 back to check A, then maybe C while B finishes up, you will lose the flow state that AI assistance can enable with speed for those who are not fluent coders.

            Sure, I can wait hours for my local model to finish, or I can spend basically as much and get the answer right away

            There’s a lot of exciting stuff with local LLMs despite the speed, but for me I don’t have the discipline and working memory to jump from project to project.

          • bcjdjsndon 3 days ago

            > Even extremely slow LLMs can generate Part B faster than I can audit Part A.

            Depends what part a and b are...because workloads are different

      • threatofrain 5 days ago

        And continuing the argument of "more than just...", if you stopped inferencing on your Mac you still have a generally nice computer. The difference between rent vs buy.

      • novok 5 days ago

        I played with classifying and summarizing my entire email history (per email) with small models, but that only took about 12h of GPU time at most. Using a coding agent cli wrapper in that case is far slower because of all the spin up cost and the system prompt they inject even if you want to turn it all off.

        If I used an actual direct API it probably would've been much faster, but I'm doing it for hobby / fun reasons. You also get to fiddle with a lot more params.

      • PAndreew 5 days ago

        I’m running a local Whisper + Gemma 4 pipeline with a cheap USB mic to extract health related data and potential todos from ambient speech. It doesn’t have to be fast and doesn’t have to be 100% correct, because if it captures at least a few bits of interesting information that would otherwise go unnoticed it’s still a win.

        • 650REDHAIR 4 days ago

          I run whisper through openwebui to gemma4 moe and use kokoro TTS back to me.

          I use a 5060ti 16gb and a minipc.

          I tunnel in via Tailscale and access it with my phone or laptop from anywhere. It’s pretty good and will only get better as I optimize.

  • formerly_proven 5 days ago

    What is it with AI SaaS naming themselves "openxyz" when there is 0% open about them?

    • em500 5 days ago

      They learnt from OpenAI that naming yourself open-xyz doesn't actually require opening anything.

    • debugnik 5 days ago

      It's the next co-opted buzzword after "democratize".

      • throwaway2037 4 days ago

        Yeah, anytime that I see "People's" or "Democratic" in the name of a nation, I grow suspicious. It is rarely a well-functioning democracy.

    • brikym 4 days ago

      It's how marketing works. If something is a problem they have to loudly claim to have fixed it. Look around the economy and you'll see lots of it. "Healthy" (high sugar) muesli bars, clean-diesel, surveillance wrapped up as keeping us safe. The modus operandi of marketing is to change minds about self evident things otherwise what is the point?

  • nativeit 5 days ago

    But once all that is done you still own a Mac in one case, and you don’t in the other, correct?

    • odo1242 5 days ago

      Yea this; it’s the same reason why mortgaging is cheaper than renting

      • ericpauley 5 days ago

        This is far from a universal truth: https://www.nytimes.com/interactive/2024/upshot/buy-rent-cal...

        Real estate is only a clearly good investment if you ignore opportunity cost.

        • seanmcdirmid 5 days ago

          You also need to pay close attention to rent vs purchase ratios. A lot of cities are cheap to rent but expensive to buy (eg beijing 10 years ago).

          • mantas 5 days ago

            Key word being „ago“.

            • deaux 5 days ago

              Such cities still exist and have been in such a state for decades. They can change but that's meaningless as they can also change the other way around.

            • seanmcdirmid 4 days ago

              I’m covering my bases because Chinese real estate has been volatile recently and I’m not sure where the market is at now. It could be that renting is still way cheaper than buying, I just don’t have any direct experience to back that up. If I bought while I was living in Beijing I would probably be underwater with my investment right now, renting for 9 years was the right call and my rent was pretty affordable anyways.

        • sgt 5 days ago

          Articles like that still miss a bit of the nuance. Imagine having your house paid for, and you grow old and you have no rent to pay. Yes, you could have invested but likely you would have spent some of that money on something else, or your investments might have not worked out so well, or any other reason. Human reasons, to be specific. Owning property is like a lock.

          • orangecat 5 days ago

            > Imagine having your house paid for, and you grow old and you have no rent to pay.

            My home is "paid for". Except for the HOA and property taxes that are not that far off from what I was previously paying in rent, the ongoing maintenance costs with random large spikes, and the opportunity cost of having a large chunk of money in the house and not in the market. It was still probably the right decision, but it's not at all a free lunch.

            • sgt 5 days ago

              Surely though, the HOA and all that would likely be baked into a renter's price.

              And you didn't need to go live in a HOA. I don't, and it's much cheaper.

              • orangecat 4 days ago

                > Surely though, the HOA and all that would likely be baked into a renter's price.

                Sure, the same way that the benefits of a fixed mortgage payment are baked into sale prices. The efficient market hypothesis would say that neither renting nor buying should be obviously superior in the long term, because if either was then people would bid up rents/prices until it wasn't.

                > And you didn't need to go live in a HOA

                I pretty much did, unless I wanted to significantly compromise on other factors.

                • PunchyHamster 4 days ago

                  > The efficient market hypothesis would say that neither renting nor buying should be obviously superior in the long term, because if either was then people would bid up rents/prices until it wasn't.

                  Buying has a much higher entry point: you need a bunch of cash at the start, then a ton of paperwork.

                  It is absolutely possible that a local buying market is inflated precisely because the area is so desirable that buying to rent is (or was) a good investment, but that's rarely true for a bigger market

            • ffsm8 5 days ago

              And it's gonna be interesting to see where this narrative shifts over the next 5 yrs

              I keep hearing that properties are in the biggest bubble yet in the USA - with the affordable housing shortage being a red herring, because real estate managers and boomers are unwilling/unable to reduce their prices - despite not getting renters/buyers because it would kick off a death spiral as their interests would consequently go up (because of lower security). Along with the ai layoffs etc

              I'm not American so I only hear the occasional interview so don't have any idea if it's really as pressing as these industry professionals keep saying but I'm definitely at the edge of my seat watching...

        • hadlock 5 days ago

          It never fails, there's always someone who trots this thing out. We had bought our house, and then had to move and decided to rent. I was APPALLED that they wanted me to fill out an APPLICATION form, where they would decide my worth, and let me know if we would be allowed to live there. When buying a house, my cash was as good as anyone else's. And then the management company would come inside my house to inspect that I wasn't running a meth lab or something. Thankfully that only lasted two years. I will never rent again. Majority owner-occupied neighborhoods have different characteristics as well.

          • loeg 5 days ago

            > I was APPALLED that they wanted me to fill out an APPLICATION form, where they would decide my worth, and let me know if we would be allowed to live there. When buying a house, my cash was as good as anyone else's.

            House sellers receive offers from buyers, sometimes including letters, and can choose to sell to any of them (or none of them), whether or not those offers are higher than the listed price. It's not so different.

            > And then the management company would come inside my house to inspect that I wasn't running a meth lab or something.

            Yeah that part is different. I also prefer owning.

            • throwaway2037 4 days ago

              Why would a house seller accept any offer that is not the highest total price?

              • loeg 3 days ago

                A seller might prefer a cash offer to an offer contingent on the buyer securing credit (credit might fall through). Or, like landlords, a seller might prefer a buyer with a higher credit score (same reason -- buyer is more likely to be able to secure credit and close the deal). A seller might prefer a slightly lower total offer with a serious amount of "earnest money" over a slightly higher total offer without significant earnest money (buyer might try to back out). A seller might prefer to sell to a family with a nice story in their buyer's letter than someone buying a 2nd or 3rd house. Or the seller may think all offers are too low and they can hold out and get a better offer later.

        • PunchyHamster 4 days ago

          It is very close to a universal truth, aside from some small areas with a very warped market.

          Even if you move out after 5 years, you still own the place and can rent it out and then it pays for itself, to skip the cost of selling it back to market

        • shaewest 4 days ago

          Real estate is generally a "good" investment as it's considered a relatively safe way to get significant leverage. 5x leverage in the case of a 20% deposit, or even up to 20x leverage with countries that allow for 5% deposits (New Zealand).

          In addition, the interest payments almost always end up being near the rent the owner would have paid, so mortgage payments are higher, but that increase is generally (and quickly becomes) principal while being able to counteract inflation of rent.

          • throwaway2037 4 days ago

            > relatively safe way to get significant leverage

            This only works if housing prices keep rising. This post could have been written in 2007.

            • nonameiguess 4 days ago

              We can estimate this. US median home price right before the crash in 2007 was $240,000. Today, it is about $400,000. Median rent in 2007 was $810. Today, it is $1,698. There are some simplifying assumptions we have no choice but to make. Let's say renter's insurance is negligible enough to ignore. Meanwhile, we'll just let an online mortgage calculator assume a median $50,000 home insurance coverage payment and bake it in. We'll assume 1.1% of assessed value for property tax, which is currently the US national average (it varies a lot state to state in reality). We'll assume an FHA loan with 4% down.

              This gives us a $1,995 a month payment when we purchased and a $2,142 a month payment today, due to higher assessed value for the tax.

              We can see upsides and downsides in both cases. Rent would have been quite a bit cheaper in 2007, but it has very nearly caught up by now. Meanwhile, you're probably talking about renting maybe a 2 bed/1 bath apartment, whereas the median single-family house is more like 4 beds/2 baths, with a yard. Whether or not that extra space and privacy matters to you likely depends a lot on whether you're single or have or ever plan to have a family. You could have invested into something like the S&P 500, which has historically returned about 10.5% since 1957 annually in nominal returns. Let's just kind of naively split the difference here and assume you can invest $1,000 saved on rent versus mortgage a month for the first 10 years and $200 a month for the next 9. That would have gotten you somewhere around $240,000 by now. Meanwhile, you're looking at about $248,000 in home equity by now for the purchase case.

              Choose different parameters if you please, but I'm not really seeing the case for renting here over the long term, and that's in spite of choosing the single worst time in the last century you could have made the purchase.

            • shaewest 3 days ago

              Oh I don't disagree, I hate real estate as an investment, it's a terrible asset only made "viable" by tax benefits, rent-replacement and excessive amounts of risk via leverage.

              • throwaway2037 3 days ago

                What is "rent-replacement"? I never heard that term before. I cannot find anything obvious using Google search.

                • shaewest 2 days ago

                  It's my shortcut for describing the idea that you'd spend $X a month on either interest lost to a bank or rent lost to a landlord, and therefore you can mostly consider that a constant expense when considering rent versus mortgage.

          • bcjdjsndon 3 days ago

            > Real estate is generally a "good" investment as it's considered a relatively safe way to get significant leverage

            Leverage? People don't normally invest in property (normally involves taking out a loan) for the purpose of taking out another loan. That so called "leverage" is being used to buy the house...ie you don't have any leverage

            • shaewest 1 day ago

              The leverage is the loan taken for the mortgage. If you have a $1M property, $900k loan. If the property's value increases by 5%, that's $1.05M, so you've made 50% returns on your $100k capital invested. That's leverage, the leveraging of $100k to get the returns of $1M asset.
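
The arithmetic in this comment, and the 5x/20x figures upthread, can be checked with a minimal sketch. The function names are made up for illustration, and the model deliberately ignores interest, taxes, fees and maintenance, so it shows only the leverage mechanism, not an actual investment return.

```python
def leverage_ratio(deposit_fraction: float) -> float:
    """Asset value controlled per unit of your own capital."""
    return 1 / deposit_fraction

def return_on_equity(property_value: float, loan: float, price_change: float) -> float:
    """Gain (or loss) on your own capital for a given fractional price change."""
    equity = property_value - loan          # cash you put in
    gain = property_value * price_change    # change in the asset's value
    return gain / equity

print(leverage_ratio(0.20))   # 20% deposit
print(leverage_ratio(0.05))   # 5% deposit
print(return_on_equity(1_000_000, 900_000, 0.05))    # price rises 5%
print(return_on_equity(1_000_000, 900_000, -0.05))   # price falls 5%
```

The last line is the 2007 objection in numeric form: the same multiplier applies to a price drop.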

              • bcjdjsndon 7 hours ago

                > That's leverage, the leveraging of $100k to get the returns of $1M asset.

                Obviously. But that's not leveraging real estate, that's just leveraging cash.

                Leveraging real estate would be using the property as collateral for a loan larger than the property itself

      • BoorishBears 5 days ago

        Except one day the hype will catch up to the reality that was always true, people will realize their $20,000 Mac has less utility as a "way to learn AI" than some kid's 3090 Fortnite machine, and it'll be back to below MSRP.

        • F7F7F7 4 days ago

          Where is a Mac above MSRP?

    • teekert 5 days ago

      Plus your privacy.

      • brikym 4 days ago

        Minus opportunity cost and depreciation

        • teekert 4 days ago

          It can get quite complex, for me the feeling of "I built this", and then watch as it makes stuff from the power generated on my solar panels is also worth a lot to me. The sheer coolness of it.

    • stusmall 4 days ago

      Not always. The calculations take its useful life expectancy as an input. If they estimate it correctly you have a high likelihood of it breaking, burning out or being woefully out of date by the end. At the 10 year window you are looking at losing support for security updates.

      So if you are lucky you might end up with something that still runs but most folks won't find it particularly useful

    • jmalicki 4 days ago

      Even at just the electricity cost openrouter will be both

      1) Roughly break-even to a little bit cheaper per token cost
      2) Much, much, faster

      So the cost of the mac barely even matters, it's just an extra cost beyond.

      Sure, data center providers can pay lower rates.

      The point of this article is that LLMs at home really don't make a ton of sense, unless you are willing to pay through the nose for privacy. There is absolutely no cost saving to be had.

      If you're looking at your own datacenter as a larger corporate client, that could change.

      There are also some providers that will contractually keep your data private, like AWS Bedrock or parts of Google/Azure (I don't know their stack names).

      AWS even has AWS Secret Region and AWS Top Secret Region if you want to use LLMs on classified data.

      You have to value privacy at a roughly absurd level to not want to use LLMs run efficiently at scale by someone else. For the home user, just the extra efficiency produced by batching requests from a large number of users in a datacenter is a real win.

      Some of these companies are even selling tokens below cost to get marketshare. If someone will sell you a service for a dollar bill or three quarters, why wouldn't you take the three quarters?

      • computably 4 days ago

        > If someone will sell you a service for a dollar bill or three quarters, why wouldn't you take the three quarters?

        Because one day they'll send you an email informing you the new rate is $1.50, and if you missed the email, that's not their problem.

  • novok 5 days ago

    Also many have power even cheaper or even free unused surplus power with solar.

    I don't do local inference other than hobby & learning reasons because electricity is so expensive where I am at.

faitswulff 5 days ago

The article makes no sense. I can't use OpenRouter as a general purpose computing device. Why are we comparing a whole computer to a single purpose SaaS?

  • tuwtuwtuwtuw 5 days ago

    I think it's because there are a lot of people writing articles about the benefits of running local models. I think it's fair to say that there are daily threads on HN singing the praises of local inference. I also see people buying new hardware where the main trigger is ability to run local models.

    • FuckButtons 5 days ago

      But the people who want to do local inference are putting some amount of value on privacy that’s not captured by the raw monetary value so just comparing the price is somewhat beside the point, it’s also true that, if you have eg a Mac and you use that as your main computing device then you would have spent money on it anyway, so you can’t even really compare its value to spend on something that’s not general purpose.

      • datadrivenangel 5 days ago

        My overall opinion is that the smart thing is not to upgrade to the maximum memory for AI purposes. It's worth quantifying how much extra we pay for privacy.

      • tuwtuwtuwtuw 5 days ago

        I replied to a comment asking why the article exists.

        As for privacy, I'm sure there are many people that are not so interested in that aspect.

      • apf6 5 days ago

        That's a lot of assumptions. I think there are also people buying new hardware specifically for this purpose, and their motivation to do it is thinking it will be cheaper in the long run. Privacy is not necessarily the motivation.

  • mpyne 5 days ago

    They're responding to the people doing things like buying the most expensive Mac they can find specifically to do local inference for their AI agents.

    Some do it to have control over their ability to use AI. Some do it because they think it will be cheaper to not have to pay a SaaS to generate tokens for them.

    But for those interested in the latter case, it seems like it's not actually cheaper after all, at least at current prices. But then I don't expect prices to drastically jump because of how much competition there is in model development.

    • datadrivenangel 5 days ago

      It's worth paying a premium for the privacy (assuming that llama.cpp and ollama aren't sending my sessions back to the cloud regardless...), and for the concerns about not getting a surprise bill.

      • nomel 4 days ago

        > not getting a surprise bill.

        Correct me if I'm wrong, but I believe this is a feature that only Google has figured out how to implement. All of the other pay-as-you-go token services have a cap you can set, some by monthly spending, some with API key resolution, others by how much you put into the account. I use many, and if configured with auto-purchase disabled, it's not possible to have a "surprise" bill (except for Google!)

    • dcrazy 5 days ago

      You also have control over your costs. It is reasonable to assume that tokens will cost significantly more in the near to medium future as the market consolidates and subsidies decline.

      • Danox 4 days ago

        Google, Microsoft, Meta, Anthropic, OpenAI, Oracle and others are going to be looking to recoup all the money that they’ve spent to date. Why would the price go down in the future?

        • FeloniousHam 4 days ago

          The AI numbers are huge, but I remember similar arguments about residential high-speed internet. According to Gemini, the "price for internet" is down 12% in real terms (ugh, capitalism!), while speeds are staggeringly faster.

          The providers have spent a fortune on wireless, pulled a lot of fiber/cable, and it's cheaper than it was when it started.

        • mpyne 3 days ago

          > Why would the price go down in the future?

          Because price is driven mainly by competition, not by a desire to recoup prior spending.

          Investors aren't doing things out of the sheer goodness of their hearts, so if they could just bump the price up they'd have already forced it up. The very existence of workable local models puts a cap on how high the price can realistically go, but the high level of competition still extant makes the price floor ever closer to the actual cost to generate tokens.

  • sheepscreek 5 days ago

    No, that’s not the point. I think this is to help people who are thinking about getting a beefier Mac so they can run their LLMs on it too. Some in particular want a dedicated Mac Mini or Studio for this purpose. The breakdown, even if slightly flawed, offers a good insight into the economics of it.

    For most people, they might be better off with OpenRouter models and providers supporting Zero Data Retention. On the cloud, that’s as good as it gets for privacy - your data is never retained beyond the life of the request.

    • 47282847 4 days ago

      > your data is never retained beyond the life of the request.

      Like with OpenAI for a year?

      “In June 2025, the court ordered OpenAI to retain its consumer and API customer chat logs indefinitely, including any that had been deleted, so they could be investigated […]”

      https://www.techspot.com/news/109839-openai-no-longer-requir...

dist-epoch 5 days ago

using it 24/7 brings the average cost down, not up.

the less you use local LLM, the less sense it makes since you paid a lot for hardware you don't use

  • groundzeros2015 5 days ago

    The hardware has multiple uses for the same cost. The pay-per-use server does not.

    • bastawhiz 4 days ago

      The author isn't pricing in the multiple uses. You either compare it apples to apples or you don't. If you're using the machine for general purpose computing on top of inference then the amortized hardware costs are pointless to measure. This is exactly what I said.

  • bastawhiz 5 days ago

    That's the point: why would you buy a device that's specifically not optimized to be used for 24/7 inference? It's expensive hardware that's not designed to be used in that situation! The power use for inference isn't especially good and you're not getting even a fraction of the benefit from the hardware that you're paying for.

    • dist-epoch 5 days ago

      > why would you buy a device that's specifically not optimized to be used for 24/7 inference

      because it costs $1k-$2k instead of $10k-30k+ for optimized devices

      • bastawhiz 4 days ago

        Nobody is suggesting you buy a pair of A100s, which is what 15k gets you these days. Get a used 5090. And the author specifically priced the hardware at over 4k, which is double the 1-2k you're noting

    • apf6 5 days ago

      Good question but people are doing it anyway. It's a fact that right now tons of people are buying Mac Minis specifically for this use case, to treat them as their personal data center for agents. The concept of "power use for inference" is foreign. Those people are the ones that motivated this blog post I think.

avidphantasm 4 days ago

Not sure where 40 tokens per second is coming from. I’ve seen 95-100 tokens per second on M5 Max 128GB running Gemma 4 31B. I’ve done experiments where it is faster than Claude Opus 4.5 for the same prompts.

  • dhiraj_bhakta_ 4 days ago

    can you provide your configurations pls ?

    • avidphantasm 4 days ago

      It's actually a bit faster than that now it seems, about 112 tok/sec.

      Configuration:

      Gemma 4 31B Instruct Q6K
      Context size 40960
      LM Studio 0.4.13+1
      Metal llama.cpp v2.14.0
      LM Studio MLX (Apple M5) v1.6.0

      Here are my results:

      prompt eval time = 32545.36 ms / 5625 tokens ( 5.79 ms per token, 172.84 tokens per second)
      eval time = 20227.99 ms / 310 tokens ( 65.25 ms per token, 15.33 tokens per second)
      total time = 52773.35 ms / 5935 tokens
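
A quick sketch for re-deriving the rates from the timing lines above. The helper name is made up. The point is to keep prompt processing and generation apart: the "about 112 tok/sec" figure appears to match total tokens over total time, while the generation rate on its own is the 15.33 shown in the eval line.

```python
def tokens_per_second(elapsed_ms: float, tokens: int) -> float:
    """Throughput from an elapsed time in milliseconds and a token count."""
    return tokens / (elapsed_ms / 1000)

prompt_tps = tokens_per_second(32545.36, 5625)    # prompt processing (prefill)
gen_tps = tokens_per_second(20227.99, 310)        # token generation (decode)
overall_tps = tokens_per_second(52773.35, 5935)   # everything lumped together

print(f"prompt:  {prompt_tps:.2f} tok/s")
print(f"gen:     {gen_tps:.2f} tok/s")
print(f"overall: {overall_tps:.2f} tok/s")
```

For a prompt-heavy request like this one (5625 tokens in, 310 out) the lumped number is dominated by prompt processing, so it is not directly comparable to a generation-speed figure.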

      This was for interacting with a local MCP service, running a tool that returns a ~20KB text file to the agent to add to the chat context.

      I'm seeing about the same number of tokens/second on an M2 Ultra that I have access to (also with 128GB of memory).

      This is surely apples-to-oranges to the OP results (and I don't spend a great deal of time benchmarking these things, so my methodology might be lacking), but it's interesting seeing okay performance for a top open model. For most use, however, I find Gemma 4 26B A4B (Q6K) to be good enough (esp. for MCP calling) and much much faster (~1,200 tokens/second).

  • throwaway2037 4 days ago

    > M5 Max 128GB

    Wild. That must be like a 5,000 USD laptop.

ikidd 4 days ago

Actually, figuring it on generating tokens 24/7 is the best case scenario. If you figure it at 8 hours a day of actual use, you still have the fixed cost of the hardware being the highest portion of the budget, but now you generate 1/3 the tokens so you triple that cost per token.
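
A minimal sketch of that amortization, with hypothetical inputs: the $4,000 price and 20 tokens per second are loosely based on numbers mentioned in this thread, and the 3-year lifetime is purely an assumption. Electricity is left out because it scales with use and does not change per token.

```python
def hardware_cost_per_mtok(hardware_usd: float, lifetime_years: float,
                           hours_per_day: float, tokens_per_second: float) -> float:
    """Fixed hardware cost spread over every token generated in the machine's lifetime."""
    lifetime_tokens = tokens_per_second * 3600 * hours_per_day * 365 * lifetime_years
    return hardware_usd / lifetime_tokens * 1_000_000

# Hypothetical machine: $4,000, assumed 3-year useful life, 20 tok/s
around_the_clock = hardware_cost_per_mtok(4000, 3, hours_per_day=24, tokens_per_second=20)
working_hours = hardware_cost_per_mtok(4000, 3, hours_per_day=8, tokens_per_second=20)

print(f"24 h/day: ${around_the_clock:.2f} per million tokens")
print(f" 8 h/day: ${working_hours:.2f} per million tokens")
print(f"ratio: {working_hours / around_the_clock:.1f}x")
```

The ratio depends only on the hours of use, not on the price or speed you plug in.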

statestreet123 5 days ago

Rounded up, yes, and oddly inefficient for someone obsessed with inefficiency. One could buy a brand new 64gb M5 macbook for well over 4k. Another could buy a scratched up but functioning M1 Max 64gb off of ebay for a little over 1k—and somehow get the same 10-20 t/s with 31b that the author does with an M5. Or better yet, have a frontier model do the planning and judging, and have a local MOE model execute at 50 t/s. All of this achievable by a former English major with too much free time.

  • novok 5 days ago

    I have an M1 Pro, and a M4 & M5 max to play with at work and the speed difference is very significant between all 3 machines, the M1 Pro is far slower, and the M5 is significantly faster than the M4. And a windows 3090 beats all of them but eats twice the amount of power per token. This is all running the same 24GB memory friendly model with LM studio.

make3 4 days ago

The real reason this comparison makes no sense is that only a vanishingly small fraction of people seriously using ai to code would seriously use a model so far from the top models (including open source ones).

He should compare his MacBook to Open Router on Kimi 2.6 1.1T or GLM 5.1 (754B), at bfloat16 precision, which he can't ofc.

But it furthers his point that things like open router are a better idea, which is not surprising.

PunchyHamster 4 days ago

> Yeah, data centers don't pay residential electricity rates.

There are 2 caveats here:

Some places have higher prices for industrial than residential power, as the residential one might be subsidized by the govt.

And DCs also pay for cooling, which residential users will only effectively pay if they have AC and it is hot outside. So the effective power cost is some multiple of the industrial rate.

  • bastawhiz 4 days ago

    Generally you don't build a data center in a place that doesn't sell you electricity for cheap

    • PunchyHamster 3 days ago

      Still, for the purpose of the calculation you need to account for the price of both the power and removing the heat created by that power.

econ 4 days ago

Boss, I make 16.50 per hour, say 15, I work 36 hours, say 35, say 500 per week, say 4 weeks per month, that's only about 2000! Don't you agree I need a raise?

outside1234 4 days ago

We also have no idea what it actually costs Anthropic. This could be wildly subsidized and actually Apple Silicon is more cost effective.

llm_nerd 5 days ago

Your post makes sense if you bought the hardware for other reasons, and maybe run models occasionally as a novelty.

That isn't the case for many, though, and there is a whole social media space where people are hyping up the latest homebrew options for running models, believing it frees them from the yoke of big AI.

Millions of people are buying big $ maxed-out hardware like the Mac Studios or DGX specifically to run LLMs. Someone rationally running the numbers is a good thing.

  • atq2119 5 days ago

    Let's not get ahead of ourselves. Millions, really? I can believe there are a lot of enthusiasts doing this, but "millions" needs a citation.

    • filleduchaos 4 days ago

      This is HN; it has probably never occurred to half the people here that the average person even in first world countries doesn't even have the financial capacity to make an impulse five-figure USD purchase, even if on credit.

      • llm_nerd 4 days ago

        No one said anything about this being an "impulse" purchase -- it would usually be perceived as a "career investment" purchase, given that many feel they need to race to be with "it" or be left behind -- nor does "five figure" have anything to do with this -- DGX options are available at $3000 -- there's a certain irony that you are posting a comment that is basically "people in my country/circle can't, therefore no one can", while dismissing my comment that many people can.

        Millions of people are paying thousands of dollars a year to buy a slightly upgraded entertainment package in their car. There are 60 million or so millionaires alone, including 6 million+ in China.

        There are a lot of people with a lot of wealth on the planet. A lot. Millions...it isn't that unfounded, friend.

        So doing this "this is HN" snide jerk act, and then basically projecting your lot on the planet is...I don't know if you intended it, but it's rather amazing.

  • curt15 4 days ago

    > Millions of people are buying big $ maxed-out hardware like the Mac Studios or DGX specifically to run LLMs.

    What's your source for this?

    • llm_nerd 4 days ago

      Not just one, but two replies completely hung up on that. Like, why even reply when you already saw the other guy doing the same tired pedantry? Just wanted to feel like you contributed?

      Ignoring that it was just tosser hyperbole (that absolutely zero reasonable people need to question), yes, enormous numbers of people are buying GPUs or hardware with the explicit goal of running local LLMs, and social media is full of people hyping various setups and models. Mac Minis are almost impossible to find, and that alone is selling at a clip of about 300,000 every four months. Large memory GPUs are basically a myth at this point. All so people can pay more to get a worse result than commercial options, which is precisely the point of the submission.

      These local setups only ever make sense if you have something that confidential, or you're doing something that ToS of the majors would ban you for.

      Now given this pedantry horseshit, you'll probably demand that I specifically show a citation on DGX or Studio sales, which...rofl.

giancarlostoro 4 days ago

Honestly, I don't see my MacBook Pro costing me anywhere near as much as any of these AI services, though maybe the increase in my power bill just isn't big enough for me to notice. I'm a power user who uses Claude Max pretty much all the time to prototype ideas and build things I actually use, and it has given me a lot of value. I work full time and have a family to raise and care for, so my free coding time is mostly limited to ideas. Now I can draft a detailed plan, review the code, run it, test it, and use software custom tailored to my needs.

cyanydeez 5 days ago

nothing about the current data center craze looks efficient.

  • bastawhiz 5 days ago

    Whether or not you think building data centers is a good idea, it's inarguable that the per-token efficiency (power, hardware, etc.) is FAR higher in a data center. That's literally what it's designed for.

    • cyanydeez 5 days ago

      I'm talking per value. Look at the efficiency of Chinese open source models, then look at SOTA sucking gigawatts, then the proposals.

      America is basically proposing AI using the equivalent bloatware of Windows 11.

      • bastawhiz 4 days ago

        I run two 49B parameter models on a pair of used A100s full time and it sucks down 250 watts at peak utilization. That's not gigawatts, and it's completely within the realm of comparison to what the author is describing.

  • trollbridge 5 days ago

    Probably because lots of data centres are being built (or half-built) which are sitting idle.

    • mpyne 5 days ago

      If there are datacenters sitting idle right now then you could probably make a lot of money selling that capacity to Anthropic at this point...

    • bastawhiz 4 days ago

      If you have racks of idle H100s, you are doing a terrible job of running a business.