NEW ARTICLE UP
Solved the issues I had and am now hitting between 3.5 and 4.25 tokens per second.
Full Guide HERE: How to Run Deepseek R1 671b on a 2000 EPYC Guide
MUCH better now, with a decent context window too. Temp .65 feels right. Lots of snags, but I even got GPU offload working. Step-by-step software setup, tweaks and tips!
OLDER TL;DR: I did get this running. The big real one, the deepseek2 arch. On CPU! And it is faster than I expected at INT4. It was 3.87 2.4tps down to around 1 tps max degraded. Errata follows. I spec’d out a great board for that build video. High-bandwidth system RAM demand is becoming a home AI reality, and the cheap way to do it currently is 32GB DDR4 2400 DIMMs.
LOCAL VIDEO LAUNCHED
This model is big. The real model is the 671B version; all the others are distills on Llama or Qwen. I already tested the Deepseek R1 Qwen2 14b arch version and it was not good at all. RAM must be served up, and a lot of it. Getting this model running right has been a constant obstacle, so I am dropping all my notes in case they can help you, or in case you can help me.
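To put a rough number on why RAM bandwidth matters here, this is a back-of-envelope sketch, not a measurement. It assumes token generation is limited only by how fast the active weights can be read from RAM. The 37B active-parameters-per-token figure, the 8 memory channels, and the idea that every active weight is read exactly once per token are all assumptions; the model size and parameter count come from the benchmark tables in this post.

```python
# Rough upper bound on CPU token generation speed from memory bandwidth alone.
# All constants marked "assumed" are illustrative assumptions, not measurements.

GIB = 1024 ** 3

model_bytes = 376.65 * GIB      # Q4_K Medium size, from the benchmark table
total_params = 671.03e9         # total parameters, from the benchmark table
active_params = 37e9            # assumed: parameters activated per token (MoE)

channels = 8                    # assumed: memory channels on one EPYC Rome socket
transfers_per_sec = 2400e6      # DDR4-2400
bytes_per_transfer = 8          # one 64-bit channel moves 8 bytes per transfer

bytes_per_param = model_bytes / total_params
bytes_per_token = active_params * bytes_per_param
peak_bandwidth = channels * transfers_per_sec * bytes_per_transfer

# If every active weight is read once per token, this is the ceiling.
upper_bound_tps = peak_bandwidth / bytes_per_token

print(f"bytes per parameter : {bytes_per_param:.3f}")
print(f"GB read per token   : {bytes_per_token / 1e9:.1f}")
print(f"peak bandwidth GB/s : {peak_bandwidth / 1e9:.1f}")
print(f"upper bound tokens/s: {upper_bound_tps:.1f}")
```

Under these assumptions the ceiling comes out at roughly 7 tokens per second, so generation speeds in the low single digits are in a plausible range for DDR4-2400. Real throughput will sit below the theoretical peak because of NUMA layout, cache behavior, and compute overhead.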
BENCHMARKING DEEPSEEK R1 671b on AMD EPYC ROME 7702
| model | size | params | backend | ngl | test | t/s |
| --- | --- | --- | --- | --- | --- | --- |
| deepseek2 671B Q4_K – Medium | 376.65 GiB | 671.03 B | RPC | 99 | pp512 | 21.32 ± 0.04 |
| deepseek2 671B Q4_K – Medium | 376.65 GiB | 671.03 B | RPC | 99 | tg128 | 3.78 ± 0.13 |
nproc 96
NPS=6
NOTES: The core allocations looked pretty off. Also, I thought I had disabled SMT, but it was still running.
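Since a BIOS setting that did not stick can skew a whole benchmark run, a quick check from the OS side before each run can help. This is a minimal sketch, assuming a Linux host that exposes SMT state at `/sys/devices/system/cpu/smt/active`; the helper name `smt_active` is made up for this example.

```python
import os
from pathlib import Path


def smt_active(path="/sys/devices/system/cpu/smt/active"):
    """Return True or False if the kernel reports SMT state, None if unknown."""
    try:
        text = Path(path).read_text().strip()
    except OSError:
        # File missing or unreadable (non-Linux host, older kernel, container).
        return None
    if text == "1":
        return True
    if text == "0":
        return False
    return None


# Compare these against what the BIOS was supposed to be set to.
print("logical CPUs:", os.cpu_count())
print("SMT active  :", smt_active())
```

If the logical CPU count is double the physical core count, or the second line prints `True`, SMT is still on regardless of what the BIOS screen said.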
| model | size | params | backend | ngl | test | t/s |
| --- | --- | --- | --- | --- | --- | --- |
| deepseek2 671B Q4_K – Medium | 376.65 GiB | 671.03 B | RPC | 99 | pp512 | 22.06 ± 0.61 |
| deepseek2 671B Q4_K – Medium | 376.65 GiB | 671.03 B | RPC | 99 | tg128 | 3.87 ± 0.12 |
nproc 128
NPS=1
NOTES: Fresh factory defaults had been applied to the BIOS. Massive CPU bouncing observed, likely from IOMMU and NUMA impacts. Next I need to disable SMT to test its impact on the llama.cpp bench, but this is already better than the first test. On to NPS=4.
***************************************************
| model | size | params | backend | ngl | test | t/s |
| --- | --- | --- | --- | --- | --- | --- |
| deepseek2 671B Q4_K – Medium | 376.65 GiB | 671.03 B | RPC | 99 | pp512 | 13.81 ± 0.07 |
| deepseek2 671B Q4_K – Medium | 376.65 GiB | 671.03 B | RPC | 99 | tg128 | 1.26 ± 0.01 |
nproc 32
NOTES: [Disabled SVM, SMT, IOMMU, NPS=4, CCDs=4, Core Control=Auto] Well, dang it. I discovered I had set CCDs to 4, and that is a mistake for sure. Setting CCDs back to Auto should give a large return. Going to rerun the other tests to see if I goofed those up also.
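To compare the three runs side by side, the pasted table rows can be parsed directly. This is a small sketch, assuming each data row has seven pipe-separated cells with the `mean ± stdev` figure last; the function name `parse_bench_rows` and the run labels are made up for this example, and the rows are copied from the tables in this post.

```python
def parse_bench_rows(text):
    """Parse llama-bench style pipe rows; header and rule rows are skipped."""
    rows = []
    for line in text.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) != 7 or "±" not in cells[6]:
            continue
        mean, _, stdev = cells[6].partition("±")
        rows.append({"test": cells[5], "tps": float(mean), "stdev": float(stdev)})
    return rows


runs = {
    "nproc 96, NPS=6": """
| deepseek2 671B Q4_K – Medium | 376.65 GiB | 671.03 B | RPC | 99 | pp512 | 21.32 ± 0.04 |
| deepseek2 671B Q4_K – Medium | 376.65 GiB | 671.03 B | RPC | 99 | tg128 | 3.78 ± 0.13 |
""",
    "nproc 128, NPS=1": """
| deepseek2 671B Q4_K – Medium | 376.65 GiB | 671.03 B | RPC | 99 | pp512 | 22.06 ± 0.61 |
| deepseek2 671B Q4_K – Medium | 376.65 GiB | 671.03 B | RPC | 99 | tg128 | 3.87 ± 0.12 |
""",
    "nproc 32, NPS=4, CCDs=4": """
| deepseek2 671B Q4_K – Medium | 376.65 GiB | 671.03 B | RPC | 99 | pp512 | 13.81 ± 0.07 |
| deepseek2 671B Q4_K – Medium | 376.65 GiB | 671.03 B | RPC | 99 | tg128 | 1.26 ± 0.01 |
""",
}

# Token generation (tg128) speed per run, then each run relative to the best.
tg = {
    name: next(r["tps"] for r in parse_bench_rows(text) if r["test"] == "tg128")
    for name, text in runs.items()
}
best = max(tg, key=tg.get)
for name, tps in tg.items():
    print(f"{name:26s} tg128 {tps:5.2f} t/s ({tps / tg[best]:.0%} of best)")
```

The CCDs=4 run generates at about a third of the speed of the best run, which lines up with the note above that limiting the CCDs was a mistake.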
***************************************************
DEEPSEEK R1 DEPLOY NOTES AS OF 1/28/2025
Can get it to 768GB. Running against the H12SSL-i is DIMM density limited, so I am harvesting those DIMMs to swap into the MZ32-AR0 and run a 64/32 mix per channel.
DEEPSEEK R1 DEPLOY NOTES AS OF 1/27/2025
Getting Ollama and OWUI to run this has been a challenge, but the answers are very good. The waiting is not. I’m also torn about the 1 tps speed on the 7702 processor.
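For testing Ollama directly, without OWUI in the loop, a request can be sent to its REST API. This is a minimal sketch, assuming a local Ollama server on the default port and the `deepseek-r1:671b` model tag; the `num_ctx` value of 8192 is a placeholder rather than a tested setting, and the 0.65 temperature is the value mentioned at the top of this post. The helper `build_generate_request` is made up for this example and only builds the request, so it runs without a server.

```python
import json
import urllib.request


def build_generate_request(prompt, host="http://localhost:11434",
                           model="deepseek-r1:671b",
                           temperature=0.65, num_ctx=8192):
    """Build (but do not send) a POST request for Ollama's /api/generate."""
    payload = {
        "model": model,          # assumed model tag
        "prompt": prompt,
        "stream": False,         # ask for a single JSON response
        "options": {
            "temperature": temperature,
            "num_ctx": num_ctx,  # assumed context size, adjust to fit RAM
        },
    }
    return urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_generate_request("Why is the sky blue?")
print(req.full_url)
print(req.data.decode("utf-8"))

# To actually send it (needs a running Ollama server with the model pulled):
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["response"])
```

At around 1 tps a long answer can take many minutes, so any client-side timeout should be set generously before uncommenting the send.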
DEEPSEEK R1 DEPLOY NOTES AS OF 1/26/2025 02:16 UTC
DEEPSEEK OPS ACTION ITEMS
DEEPSEEK R1 TEST QUESTIONS AND RESULTS
*****************************
Armageddon with a Twist
QUESTION
BAD DAY RIGHT?! ENTER DEEPSEEKS R1’s THOUGHTS
DEEPSEEK AIRLOCKED YOUR SPACE MUTINY