When you click on links to various merchants on this site and make a purchase, this can result in this site earning a commission. Affiliate programs and affiliations include, but are not limited to, the eBay Partner Network. As an Amazon Associate I earn from qualifying purchases. #ad #promotions



The newest generation of Nvidia GeForce RTX GPUs looks very good for its MSRP and specs. First, let me ask you to set aside, for now, the marketing speak we heard at CES yesterday. Many of those numbers are vague and are literally just the largest number for large number's sake. That is not to diminish the impact, however, as the performance gains, and especially the VRAM size gains, are substantial.
Quick 50 series Specs
5070 – $549 (leave this for the gamers)
5070ti – $749 (Interesting and likely realistic to get)
5080 – $999 (leave this for the gamers)
5090 – $1999 (VERY interesting, likely impossible to get)
DIGITS Household Supercomputer – $3000 (launching in May, still lacks the details we need on RAM bandwidth)
https://www.nvidia.com/en-us/geforce/news/rtx-50-series-graphics-cards-gpu-laptop-announcements/
https://www.nvidia.com/en-us/project-digits/
MY NOTES ON THE TRAD GPUs
workstation or the latest EPYC 9005 series PCIe 5 chips.
MODIFIED 4090 Nvidia Driver to enable P2P mode
https://github.com/tinygrad/open-gpu-kernel-modules
NOTES ON PROJECT DIGITS
This one is more complex. There is a LOT of devil in the details here, and it depends on the LPDDR5 that comes in that system, specifically the bandwidth it will provide. It is likely at the performance level of system RAM and not that of VRAM. The form factor is VERY compelling, but it also speaks to the physics of processing inference workloads. Hopefully it's some balanced slice of high performance, close to that of unified memory. My first-blush take, however, is that it would likely be GDDR7 if that were the case.
The lauded stat of running a 200B parameter model sounds fantastic and is absolutely achievable in 128GB of RAM at 4-bit quantization (Q4), which is the number they have been presenting consistently. In reality, though, it can be pretty slow. Something like 7 tokens per second vs 1 token per second is very imaginable as the gulf that may exist here, given the Q4 performance we see reported. Nvidia no doubt has A LOT of tricks and optimizations they could use on the software side, but current physics still apply. I think this device really challenges Apple at a hardware level. Is this why rumors of Apple discontinuing unified memory are around? I initially thought that was likely a push to get users to offload data to their cloud and prop up MRR, but maybe they had an idea this was in the pipeline.
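To put rough numbers on that, here is a back-of-the-napkin sketch, assuming decoding is purely memory-bandwidth bound (every generated token reads every weight once) and ignoring KV cache and runtime overhead. The bandwidth values in the loop are placeholders I picked for illustration, not published DIGITS specs, and the function names are my own.

```python
# Rough, bandwidth-bound estimate of decode speed for a dense model.
# The bandwidth figures below are hypothetical, NOT published DIGITS specs.

def model_size_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight footprint in GB (ignores KV cache and overhead)."""
    return params_billion * bits_per_weight / 8

def tokens_per_second(bandwidth_gb_s: float, size_gb: float) -> float:
    """Upper bound: each generated token reads every weight once."""
    return bandwidth_gb_s / size_gb

size = model_size_gb(200, 4)  # 200B parameters at 4 bits per weight
print(f"weights: ~{size:.0f} GB out of 128 GB")

for bw in (100, 400, 700):  # hypothetical memory bandwidths in GB/s
    print(f"{bw} GB/s -> ~{tokens_per_second(bw, size):.1f} tokens/s at best")
```

Under those assumptions the weights land around 100GB, so 1 token per second corresponds to roughly 100 GB/s of memory bandwidth and 7 tokens per second to roughly 700 GB/s. Real-world numbers would come in lower than this ceiling, which is why the bandwidth detail matters so much.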
I doubt you will get the ability to connect more than 2, given the 80gbit interconnect capacity. Dashing those hopes now is, I think, a decent idea. RDMA is a wonderful addition, which shows they are crafting the stack down to the silicon with care imho.
Either way, at $3K this is the ultimate high-end mini-PC desktop box at the moment. I really want one (no, I lie, I want 2), but is it worth $3K from a performance-per-dollar standpoint? Very unlikely.
They look fucking amazing! Gold computers all of a sudden make the most sense vs silver computers. You get it Apple?
Are you better off getting a 5090 at the MSRP of $1,999? That is a really good question, and it hinges on the DIGITS memory bandwidth that we have yet to get numbers on.
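While we wait on bandwidth, the capacity side of that question is easy to sketch. This uses the prices quoted in this post and the 128GB figure for DIGITS. The 16GB and 32GB VRAM sizes for the 5070ti and 5090 are my assumption from Nvidia's announced specs, and the helper name is made up for the example.

```python
# Dollars per GB of memory at the prices quoted in this post.
# GPU VRAM sizes are assumed from Nvidia's announced specs, not from this post.
options = {
    "5070ti": (749, 16),
    "5090": (1999, 32),
    "DIGITS": (3000, 128),
}

def dollars_per_gb(price_usd: float, memory_gb: float) -> float:
    """Price divided by memory capacity; says nothing about memory speed."""
    return price_usd / memory_gb

for name, (price, mem) in options.items():
    print(f"{name}: ${dollars_per_gb(price, mem):.2f} per GB")
```

On capacity alone DIGITS looks far cheaper per GB, but capacity is not speed. If its memory runs at system RAM speeds, the cheaper GBs buy you a model that fits but crawls.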
How Do NVIDIA 50 Series GPUs compare to 40X0 and 30X0?
What does this mean for used 40 series cards?
On prices: the top-end 4090 is going to be under initial price pressure, both used and new, in the lead-up to launch, at least until the reality sets in that the 5090s somehow all got snagged by scalpers. Expect a 20–35% reduction off current prices depending on card quality and manufacturer, which trends back toward MSRP given currently inflated prices. The 4090 gets hit the hardest. It feels bad as an owner of 2x 4090s who didn't sell before this announcement, but even if I had, it likely wouldn't have been enough to buy 1 scalped 5090.
The 4080 and 4070 non-ti are not the typical market for home AI inference, due to $/GB of VRAM or available VRAM. Basically that is gamer stuff and I have no opinion.
The 4070ti 16GB is going to be under slight sales pressure until the Feb launch of the 5070ti, but likely mostly on new stock. Expect up to 10% reductions on new stock. Owners really love their 4070ti cards, but used markets will be interesting to watch.
The 4060ti 16GB is a very loved card at the prices they have been at recently. It is also still one of the cheapest modern rides to 16GB VRAM land, and it likely holds fairly firm in its pricing as a result.
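If you want to turn those percentages into a price range to watch for, a tiny sketch like this works. The current prices below are hypothetical placeholders (swap in whatever listings you actually see), and the function name is made up for the example.

```python
def discounted_range(current_price: float, min_cut: float, max_cut: float):
    """Return (low, high) expected prices after a percentage cut range."""
    low = current_price * (1 - max_cut)
    high = current_price * (1 - min_cut)
    return round(low), round(high)

# Hypothetical current street prices, for illustration only.
print("4090:", discounted_range(2200, 0.20, 0.35))
print("4070ti 16GB (new):", discounted_range(800, 0.00, 0.10))
```

These are my guesses from above expressed as arithmetic, nothing more scientific than that.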
What does this mean for used 30 series cards?
3090s will be under some pressure, but they are still, and will remain, the cheapest modern-ish way to hit 24GB VRAM. I'd expect to see them return to the $800 price point used and refurbished (used with extra words), but for inference they hold a lot of value. Gamers selling off their 3090s is where I think we see supply pressure, likely as singles. Since I own 4, I can say I have no intention of getting rid of mine unless I can somehow snag multiple 5090s at or near MSRP, which is a fun dream but not reality.
What does this mean for all other cards?
Inference is in the tokenizer of the beholder. This is why I consistently talk about my expectations for performance when I am running inference workloads. Some things are just too slow for me, other things are faster than I need, but that's always fine unless I am paying a premium for it. We will see more selloff, but small and old can and does work fairly decently if it's up to your expectations. We will see normal used sales of stock that has already been depleting over time.