5090 for AI Workloads – Meta Analysis

When you click on links to various merchants on this site and make a purchase, this can result in this site earning a commission. Affiliate programs and affiliations include, but are not limited to, the eBay Partner Network. As an Amazon Associate I earn from qualifying purchases. #ad #promotions

These are my notes and takeaways from the Nvidia 5090 local AI meta-analysis video, geared specifically to use of the Nvidia RTX 5090 for local AI hosting. I watched all the embargo-lift day videos so you don’t have to. Specifically, we are only interested in image, video, and text generation; levels of precision and acceleration at those precisions; and use-case workloads. This review also sprinkles in my thoughts on strengths and weaknesses, so these are not 100% directly correlated to the videos; rather, much is inferred from my knowledge of existing locally hosted AI rigs. Companion video for article.

5090 for AI Top 10 Points

  1. It is not 30% faster in all tasks, particularly in higher-precision AI workloads.
  2. Excellent density, cooling, and thermal performance, but stacking 32GB 5090 FEs is going to heat the next card way too much. Spacing will be needed, but it is still very innovative cooling.
  3. INT4 HAS ISSUES with text generation quality. I recently demonstrated this with Deepseek R1.
  4. Cost per GB of VRAM is $62.47 at the 5090’s MSRP. (Actually much higher, as it is currently selling over MSRP.)
  5. Cost per GB of VRAM was $66.63 at the 4090’s MSRP, for reference.
  6. Renderer workflows for video creators and streamers are huge. The renderer uplift for 3D artists, modelers, and designers is very solid.
  7. NO reviewer had a test platform that fully unbound the card outside gaming. A true shame we didn’t get a real high-end system reviewing it from a VRAM/RAM standpoint.
  8. Liquid metal cooling is going to be scary for DIY repairs in instances where tolerances are strict.
  9. All of the embargoed videos may or may not have had restrictions in exactly what was agreed to in order to get that “first” 5090 video. We simply don’t know, as there are contracts that cannot be disclosed and/or creators do not want to disclose.
  10. The 5090 is a solid performance increase for gamers looking for 4K gaming, and a decent improvement on top-end AI workloads, with gains of roughly 10% at FP16 up to 40% at FP8.
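
The cost-per-GB figures above are simple arithmetic; a quick check using the $1,999 / 32 GB and $1,599 / 24 GB MSRP configurations lands on the $62.47 and $66.63 cited:

```python
# Cost per GB of VRAM at MSRP for each card.
cards = {
    "RTX 5090": (1999, 32),  # (MSRP in USD, VRAM in GB)
    "RTX 4090": (1599, 24),
}

for name, (msrp, vram_gb) in cards.items():
    print(f"{name}: ${msrp / vram_gb:.2f}/GB")
```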

LTT had the best dedicated AI TOPS review but…

  • Text generation with Llama 2 was super disappointing
  • Phi 3.5 Mini (3.8B), which I have reviewed previously
  • The ONNX DirectML runtime may heavily favor specific uplifts
    • We may not see this level of performance in other runtimes like llama.cpp
  • Time to First Token doesn’t show the 50% gain I had hoped for and expected from the gen5 upgrade. This is baffling, but likely CPU- and memory-channel-restricted.
  • Flux.1-dev FP8 differential is a 1.4x gain. A respectable image-gen uplift.
  • Flux.1-dev at FP4 shows a 4x gain.
  • It seemed like a very restrictive set of benchmarks, run from one specific testing tool, UL’s, which has not fully integrated the 5090 yet.
  • The reviewer didn’t seem pleased, and I sense there was an additional undercurrent as to why, beyond what was stated.
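
Since Time to First Token is a headline number in these reviews, it is worth measuring yourself once the hardware is in hand. A minimal, runtime-agnostic sketch (the `fake_stream` generator is a stand-in; in practice you would wire this to the streaming token output of llama.cpp or whatever local runtime you use):

```python
import time

def time_to_first_token(stream):
    """Measure TTFT and total wall time for any iterable token stream."""
    start = time.perf_counter()
    ttft = None
    count = 0
    for _ in stream:
        if ttft is None:
            ttft = time.perf_counter() - start  # first token arrived
        count += 1
    total = time.perf_counter() - start
    return ttft, count, total

# Stand-in stream that "decodes" 5 tokens with a small delay each.
def fake_stream(n=5, delay=0.01):
    for i in range(n):
        time.sleep(delay)
        yield f"tok{i}"

ttft, n_tokens, total = time_to_first_token(fake_stream())
print(f"TTFT: {ttft*1000:.1f} ms, {n_tokens} tokens in {total*1000:.1f} ms")
```

The same harness works for comparing runtimes (ONNX DirectML vs llama.cpp) on identical prompts, which is exactly the cross-check the LTT numbers leave open.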

Serious doubts about the rest of the 5XXX series, especially the 5070 Ti, are starting to form for me now.

GN has the best systems review on thermal and electrical performance characteristics

  • Thermals look great for a single-GPU setup with the innovative cooling design the FE provides.
  • Other 3rd party cards will have much more traditional designs
  • Liquid metal with surrounding gaskets = well, shit, if it fails out of warranty
    • Re-applying liquid metal (“re-liquiding”, a new term of danger) in the future will be directly challenging, as a very precise amount is needed. More and it will pressure the gaskets and leak. Less and you cavitate.
    • Several of the 3rd-party partners are expected to also use liquid metal.
    • I hope NVIDIA puts out guidance on this for the FE specifically. We need to be certain of the lifespan, and liquid metal is a new variable.
  • Respectable uplift for gaming, but DLSS accounts for that uplift in large part.

Notable other Creator Reviews

I was hoping L1T would have much more on the professional-workload impact, but sadly he did not. He is one of the few creators with a system capable of being truly unbound when run against the 5090 in this regard. JZ2C didn’t add what I was hoping for on the insane top-end side, but did reveal that the new 16-pin plug is vastly better than the 4090’s somewhat sketchy one. Notable.

Digital Spaceport’s Weaknesses and Strengths of the 5090

I did a lot of reading between the lines. There is certainly a possibility that some of this is not correct, but to the best of my abilities I am looking to uncover the relative value of a 5090, specifically the FE, for a local AI user.

Top 10 5090 Weaknesses

  1. We have a problem, and it is INT4-based. While you can absolutely experience a doubling of performance just by changing from an INT8 to an INT4 model, we do not see that presented as an across-the-board gain relative to the announced INT4 numbers. This is interesting, and the bottleneck is not disclosed, as no one launched a flamegraph and ran this against various engines.
  2. I suspect we have early optimization problems, many of which will be corrected, but it would be interesting to approach this from a system perspective.
  3. The liquid metal is NOT something I am excited about. Like, at all. The stuff is amazing, but it needs to be changed out eventually. This could spell disaster for any DIY re-liquiding in, say, 3 years. What is the lifespan and degradation gradient? Can I send it in to NVIDIA to get fixed? What are the qualifications around that, if that is a thing?
  4. The ability to apply corrective measures to a GPU or anything else extends greatly the lifespan of that hardware. This is both economically and environmentally important. We may have just entered the era of disposable GPUs.
  5. Gains of 10% on FP16 inference are not major.
  6. Thermal cooling patterns expertly revealed by Gamers Nexus will destroy a stacked 2 GPU setup, but be amazing for a 1 GPU setup.
  7. Wattage draws at peak can slightly exceed spec and are already very high. Running 2 GPUs and a high-end rig on a 1500w PSU is unlikely.
  8. Performance per watt appears to be one of the biggest non-DLSS4 scaling factors, aside from the revolutionary INT4 acceleration.
  9. Dramatically improved 16-pin plug fitting and layout in the FE. While I love my 8-pin plugs, the future at least poses less of a threat of meltdown now.
  10. NO ONE TESTED WHAT NEEDED TO BE TESTED FOR LOCAL AI YET! We still have a lot of unknowns that we should not have. They missed out by not sending me a testing unit, but you can at least be assured I don’t have any conflict of interest in what I write and record.
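
The PSU concern above checks out with back-of-the-envelope math (the 575 W figure is the 5090’s rated board power; the platform draw and transient headroom are my assumptions, not measured values):

```python
# Rough steady-state power budget for a dual-5090 rig (illustrative).
GPU_BOARD_POWER = 575      # watts, rated 5090 board power
CPU_AND_PLATFORM = 350     # watts, assumed high-end CPU, RAM, drives, fans
TRANSIENT_HEADROOM = 0.20  # assumed margin for transient power spikes

steady_state = 2 * GPU_BOARD_POWER + CPU_AND_PLATFORM
recommended_psu = steady_state * (1 + TRANSIENT_HEADROOM)

print(f"steady state: {steady_state} W")          # already consumes a 1500w PSU
print(f"with headroom: {recommended_psu:.0f} W")  # what you'd actually want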

Top 10 5090 Strengths

  1. Cost per GB of VRAM at MSRP is down from $66.63 on the 4090 to $62.47 on the 5090, a savings of $4.16 per GB. VRAM amount is one of the most important factors for local AI.
  2. You will be able to fit many more models into that additional 8GB of VRAM. That alone makes this a good card if, big if, you can get it at MSRP.
  3. Gains of 1.3x to 1.4x on INT8 are major if we see them manifest for local AI.
  4. Gains of 4x on INT4 are major, but can we get models that are tuned to this and are not crappy?
  5. Massive uplift for image generation.
  6. Likely massive uplift for video generation expected. We could finally get decent results in reasonable timeframes.
  7. If you have a multi-use machine that is an all-in-one for your local AI tasks as well as a daily driver, the 5090 is a must-get if you can find it at MSRP.
  8. Improved CUDA levels often bring later-than-launch improvements to software that uses CUDA as the authors take time to adapt and update. We should see things get better from here on out.
  9. Revolutionary small size for so much power, and an amazing new cooling solution. It appears a 1-slot offset would be the minimum clearance for clear upward exhaust airflow.
  10. It could have been priced higher at MSRP based on the VRAM value. We now conceivably have a compelling use for 8K, and as one of the very few creators who has an 8K pipeline, I fully welcome this. It is what is next, and this GPU makes it a reality.
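
A rough rule of thumb for what the 32 GB buys at each precision (weights only; KV cache and runtime overhead are ignored, so real-world fits are tighter, and the 32B model size is just an illustrative pick):

```python
def weight_vram_gb(params_billion, bits_per_weight):
    """Approximate weight memory in GB (1 GB = 1e9 bytes), ignoring
    KV cache and runtime overhead."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

VRAM_GB = 32  # 5090
for bits in (16, 8, 4):
    gb = weight_vram_gb(32, bits)  # hypothetical 32B-parameter model
    print(f"32B model @ {bits}-bit: {gb:.0f} GB of weights "
          f"({'fits' if gb < VRAM_GB else 'does not fit'} in {VRAM_GB} GB)")
```

This is the INT4 tension in one picture: halving the bits is what makes a 32B-class model fit at all, which is why quantization quality matters so much here.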