latchkey 1 year ago

Disclosure: I have a business in this.

This is what I saw as well. As a developer, I wanted access to enterprise HPC compute, but I'm also not going to do a PhD just to play around with these things. So, I got funding, started a business and bought 8 of them as a PoC test. We got customers, we got more funding, got a real datacenter, we bought 128 more. Crawl, walk, run.

You can now rent them by the minute from us for a few bucks an hour. Currently limited to Docker containers for individual GPUs, but you can get a full bare metal 8x box too (with BIOS access!). Support for VMs is coming. If you want multiple boxes, we have the full 8x400G NICs too. The boxes are fully loaded with tons of enterprise NVMe, RAM and top core/clock Intel CPUs (not AMD, because Dell didn't have that as a solution).

Our model is to follow AMD's roadmap and buy/release their products as they come. We're currently debating the MI325X and looking forward to / planning for the MI355X.

Despite your desire, it will be a long time before there is a consumer version of these things, especially as they move to more and more complex deployments. Look at the NVL72 and the requirements around that... we can all guess where AMD is going. DC rails in the racks, DLC, massive power requirements. It is only getting more and more capex/opex intensive.

Let's also not forget that AMD is really just a hardware manufacturer. When you bought an RX 480 (I had 130,000 of these previously), it was from an OEM, like Sapphire, that could handle all the end-user support.

This is why the whole NeoCloud industry has sprung up. Large clouds can only handle this pace by selling thousands at a time in multi-year contracts. We took the long tail and built a business around it. Short of doing everything we are doing yourself (which, trust me, is not easy), your best bet is to work with companies like mine to get access to this gear.

You can now rent them by the minute from us for a few bucks an hour. Currently limited to Docker containers for individual GPUs, but you can get a full bare metal 8x box too. Support for VMs is coming. If you want multiple boxes, we have the full 8x400G NICs too. The boxes are fully loaded with tons of enterprise NVMe, RAM and top core/clock Intel CPUs (not AMD, because Dell didn't have that as a solution).

gymbeaux 1 year ago

I recognize that selling enterprise hardware one or two units at a time to people like me is not cost effective and is why AMD isn’t doing it, but I don’t think there’s anything stopping them from relying on distribution partners like Sapphire, Gigabyte and XFX to handle everything but the GPU die. Demand would be low relative to consumer stuff, and after cutting these partners in they’re probably selling at cost or on thin margin, but again, if they want to carve out market share it’s going to be VERY slow-going if they continue with this “charge as much money as possible and only sell to datacenters” approach. Nvidia can do that because they’ve cornered the market.

Meanwhile I can’t even find an MI300X on eBay. I can at least poach enterprise Nvidia GPUs like the A100 on eBay. This tells me AMD’s shipping far fewer units and therefore enterprise GPUs aren’t doing much for their balance sheet (though I’d have to look at their quarterly and annual reports to know for certain). To me this strengthens the case for selling to individuals/startups, and at prices that offset the risk of picking AMD over Nvidia and potentially running into software shortcomings.

I’m set with two RTX 3090s at the moment, but it’s very neat that you’ve been able to bootstrap essentially a cloud service provider in the age of AWS, Azure and GCP (and DO and Vultr and Linode et al).

  • latchkey 1 year ago

    > I don’t think there’s anything stopping

    There absolutely is. The current form factor is not standard PCIe. It is an OAM/UBB board that is custom designed by AMD to support Infinity Fabric. It only comes in an 8x configuration. Now, you're asking for a totally different design, and that requires a huge investment that would take away from their existing focus on enterprise.

    > Meanwhile I can’t even find an MI300X on eBay.

    https://www.ebay.com/itm/305850340813

    • gymbeaux 1 year ago

      I just want the card, not an entire server with a cluster of them. You can’t get an MI300X on eBay. But yes, I didn’t realize they don’t come in the standard PCIe form factor. Their previous Instinct “GPUs” did.

      They should have considered the RX 480 approach for the Instinct accelerators.

      • latchkey 1 year ago

        What you're asking for is HPC-level compute, but as a consumer product. They moved away from that market because they realized that it wasn't going to compete with the AI farms that Nvidia is building with companies like CoreWeave. The MI355X is going to move even further away because it'll require even more high-end datacenter deployments.

        So, on one hand, people want them to compete with Nvidia, but on the other hand, people want them to ignore the market that is actually going to make them money. As much as it would be nice to have, we can't have it both ways.

        The middle ground is to rent a single one from us billed by the minute (or another neocloud). We handle all the detailed problems for you (don't forget the massive upfront capex spend), and you get to build your products/companies on that. Once you grow to the point of being able to buy your own equipment, we can even help you deploy it.

latchkey 1 year ago

(sorry that last paragraph got duplicated somehow)