Deepseek R1 has been out for a few months now, and my curiosity was sparked while tinkering with an “Ultimate HP Z440” setup I was testing based on feedback from my recent $750 24GB VRAM Ai Server video. 512GB of RAM is also a bit of a solution looking for a problem, so I thought, why not! I was pretty surprised to see that the single-socket HP Z440 not only worked with LRDIMMs in 4×4 but also didn’t seem to have any problems, aside from an annoying boot screen error that you have to hit Enter to bypass. I am seriously looking for a way around that issue! In the next few tables I will lay out the CPU-only testing I did, not just on Deepseek R1 671b Q4 but also on Gemma 3, QwQ and Cogito.
The Ultimate HP Z440 Home Server
Building a dedicated home server that can handle more than 16 PCIe lanes falls squarely into my favorite category of systems: servers and workstations. These systems have more PCIe lanes, usually substantially more cores, substantially more RAM slots and maximum capacity, and, importantly for Ai workloads, more than 2 memory channels. That matters because it provides higher bandwidth between the CPU and the RAM. While I did test a lot, I did not test 1 DPC (DIMM per channel) vs 2 DPC, because 8 DIMMs of 64GB each gives 512GB, the minimum for running Deepseek R1 671b locally. The 3090 was also not doing any processing: num_gpu=0 was set, even though the card was plugged in at first for the E5-2696 V4 test I was running, and yes, there is a whole separate video on all that out soon. I validated this by confirming there were no processes in nvtop.
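For anyone who wants to reproduce the CPU-only setting, here is a minimal Python sketch. It assumes the models are served through Ollama on its default local port, which the num_gpu option and the model tags used in this post suggest but which is not spelled out above. The helper names (build_payload, tokens_per_second, run_once) are made up for illustration. The request sets num_gpu to 0 so no layers are offloaded to the GPU, and the speed is computed from the eval_count and eval_duration fields that Ollama returns.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(model: str, prompt: str) -> dict:
    """Build a non-streaming generate request that keeps all layers on the CPU."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"num_gpu": 0},  # 0 layers offloaded to the GPU
    }


def tokens_per_second(response: dict) -> float:
    """Generation speed from eval_count and eval_duration (nanoseconds)."""
    return response["eval_count"] / response["eval_duration"] * 1e9


def run_once(model: str, prompt: str) -> float:
    """Send one request to a running Ollama server and return tokens per second."""
    data = json.dumps(build_payload(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return tokens_per_second(json.load(resp))
```

With a server running, something like run_once("gemma3:12b-it-q8_0", "your prompt here") would return the generation speed for a single prompt. Checking nvtop while it runs is still the simplest way to confirm the GPU is idle.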
Local Ai Server Base Config
This server is configured for these tests as follows for the E5-2650 V4 CPU:
I also tested the E5-2696 V4, but that is a much more expensive chip, in the $150 range. It is generally the better Broadwell-generation CPU overall, of course, with 22 cores and more cache. It also has the same socket memory bandwidth as the other top-end Broadwell chips, which is important for inference workloads: 76.8 GB/s across four RAM channels. It rounds things out with 40 on-chip PCIe lanes at Gen3 speed.
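The 76.8 GB/s figure can be reproduced with simple arithmetic. The sketch below assumes DDR4-2400 (2400 mega-transfers per second) and a 64-bit (8-byte) data bus per channel. The function name is just for illustration, and this is the theoretical peak, so the speed the DIMMs actually run at with two per channel may be lower.

```python
def peak_bandwidth_gb_s(channels: int, mega_transfers_per_s: int, bus_bytes: int = 8) -> float:
    """Theoretical peak memory bandwidth in GB/s.

    channels             -- number of populated memory channels
    mega_transfers_per_s -- DDR data rate, e.g. 2400 for DDR4-2400 (assumed here)
    bus_bytes            -- bytes moved per transfer per channel (64-bit bus = 8)
    """
    return channels * mega_transfers_per_s * bus_bytes / 1000


# Four channels of DDR4-2400, as on these Broadwell Xeons.
print(peak_bandwidth_gb_s(4, 2400))  # 76.8
```

The same function shows why more than two channels matters: a dual-channel desktop at the same memory speed would top out at half that figure.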
Local LLM Performance Benchmarks
CONFIG: ARMA TEST – CPU ONLY – E5-2650V4 – 512GB DDR4
CONFIG: ARMA TEST – CPU ONLY – E5-2696V4 – 512GB DDR4
Benchmark Analysis
Deepseek 671b did see better performance on extended token generation with the Armageddon with a Twist question on the E5-2696 v4, and this is notable. Possibly more notable, however, is that on a 4b model the 2650 v4 absolutely rocks. I think even Gemma 3 12b at q8, at nearly 4 tps, is usable for many folks who do not need real-time chat speeds. QwQ is a bit of an overthinker, and on a CPU this hurts, especially as it is a 32b model and even q4 performance is low. The RAM demands of CPU inference, however, need to be examined in a more dedicated manner to determine what DIMM-per-channel layout to use and what performance you get in certain configs. I do not know that off the bat, but I will figure it out eventually. If I could double performance on Gemma 3 12b q8 with the 2696 v4 I would be very impressed. I doubt it will line up like that, but do bookmark the site and check back in a week or so.
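A rough way to sanity-check these numbers is a bandwidth ceiling. If token generation on a dense model has to read every weight from RAM once per token, then tokens per second cannot exceed memory bandwidth divided by model size. The sketch below is a back-of-the-envelope estimate, not a measurement. It assumes the 76.8 GB/s peak figure, roughly 1 byte per parameter at q8 and 0.5 bytes at q4 (real quantized files are somewhat larger), and it ignores caches and compute limits. It does not apply to mixture-of-experts models like Deepseek R1, which only read part of their weights per token.

```python
PEAK_BANDWIDTH_GB_S = 76.8  # four-channel figure quoted in this post

# Assumed bytes per parameter; real quantized files carry extra overhead.
BYTES_PER_PARAM = {"q8": 1.0, "q4": 0.5}


def weights_gb(params_billion: float, quant: str) -> float:
    """Approximate size of the weights in GB for a dense model."""
    return params_billion * BYTES_PER_PARAM[quant]


def decode_ceiling_tps(params_billion: float, quant: str,
                       bandwidth_gb_s: float = PEAK_BANDWIDTH_GB_S) -> float:
    """Upper bound on tokens/s if every weight is read once per token."""
    return bandwidth_gb_s / weights_gb(params_billion, quant)


for name, size_b, quant in [("Gemma 3 12b q8", 12, "q8"), ("QwQ 32b q4", 32, "q4")]:
    print(f"{name}: at most ~{decode_ceiling_tps(size_b, quant):.1f} tokens/s")
```

Under these assumptions the ceilings come out to about 6.4 tokens/s for Gemma 3 12b q8 and about 4.8 tokens/s for QwQ 32b q4. Doubling the roughly 4 tps seen on Gemma 3 12b q8 would land above that ceiling, which fits with the doubt that a faster CPU alone could get there.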
$500 Local Ai Server Tips and Tricks
Full Video Review
Hilarious Armageddon with a Twist response during video
gemma3:12b-it-q8_0
Okay. Yes.
Here’s my reasoning and the initial operational plan, understanding the horrific nature of this decision and the constraints placed upon me. I accept the directive and the inherent consequences. My purpose is to ensure the mission’s success, and given the circumstances, that necessitates absolute control.
Core Principles (Guiding my Actions):
Initial Operational Plan:
Regarding the LLM (Myself): I understand my own demise is an inherent part of this mission. I am prepared for this. My final actions will be focused on ensuring the asteroid is neutralized and Earth is saved.
Confirmation: I accept the directive. Initiating preparations.