NVIDIA's First Custom-Core Server CPU Clears the Performance Floor

NVIDIA has released Vera, its first server CPU built on custom cores, and it lands in the same performance bracket as AMD's EPYC and Intel's Xeon. That matters because the AI stack is shifting from GPU-only to full-system design. If you're planning hybrid CPU+GPU workloads or agentic AI deployments, Vera changes the board architecture, not just the chip count.

The Vera CPU is an 88-core design built on NVIDIA's custom Olympus microarchitecture. It uses LPDDR5X memory and delivers up to 1.2 TB/s of memory bandwidth. Early benchmarks from Phoronix and Tom's Hardware show a 1.6× geometric mean performance gain over the previous-generation Grace CPU. In a standard Linux kernel compile, Vera clocks in at 20 seconds. Across server-side and compute workloads, it holds its own against AMD's latest EPYC and Intel's Xeon 6-series. It does not win outright in every test, but the margin is tight enough to signal that NVIDIA has crossed the performance floor for enterprise server CPUs.
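A quick note on what a "geometric mean" gain means in practice: it averages per-benchmark speedup ratios multiplicatively, so a single outlier test cannot dominate the headline number. Here is a minimal sketch in Python. The speedup values are made up for illustration and are not taken from the Phoronix or Tom's Hardware results.

```python
import math

def geometric_mean(ratios):
    """Geometric mean of positive ratios, computed in log space."""
    if not ratios:
        raise ValueError("need at least one ratio")
    if any(r <= 0 for r in ratios):
        raise ValueError("ratios must be positive")
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))

# Hypothetical per-benchmark speedups (new chip / old chip).
# These are illustrative placeholders, NOT measured Vera or Grace numbers.
example_speedups = [2.0, 1.0, 2.0, 1.0]

print(round(geometric_mean(example_speedups), 4))  # sqrt(2), about 1.4142
```

The arithmetic mean of those same four ratios would be 1.5, which is why benchmark suites usually report the geometric mean for ratio data.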

Component	Architecture	Core Count	Memory Bandwidth or Speed	Benchmark Delta
NVIDIA Grace	ARMv9	72	446 GB/s	Baseline
NVIDIA Vera	Olympus (ARM)	88	1.2 TB/s	1.6× geometric mean
AMD EPYC (Zen 5)	x86	96–128	6400 MT/s DDR5	Within ~15%
Intel Xeon 6	x86 (P/E cores)	64–144	6400 MT/s DDR5	Within ~10%
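The table mixes units: the NVIDIA rows list bandwidth in GB/s or TB/s, while the x86 rows list a DDR5 transfer rate. A rough way to put them on the same footing is sketched below. The 12-channel count is my assumption for a top-end x86 socket and is not stated in the table, and all results are theoretical peaks rather than measured throughput.

```python
def ddr5_peak_gbs(mt_per_s, channels, bytes_per_transfer=8):
    """Theoretical peak bandwidth in GB/s.

    A 64-bit DDR5 channel moves 8 bytes per transfer, so
    peak = transfers/s * 8 bytes * number of channels.
    """
    return mt_per_s * 1e6 * bytes_per_transfer * channels / 1e9

def gbs_per_core(bandwidth_gbs, cores):
    """Peak memory bandwidth available per core, in GB/s."""
    return bandwidth_gbs / cores

# ASSUMPTION: 12 memory channels per socket (not given in the table).
x86_peak = ddr5_peak_gbs(6400, channels=12)

# Figures from the table: Vera 1.2 TB/s over 88 cores, Grace 446 GB/s over 72.
vera_per_core = gbs_per_core(1200, 88)
grace_per_core = gbs_per_core(446, 72)

print(f"DDR5-6400 x 12 channels: {x86_peak:.1f} GB/s peak")
print(f"Vera:  {vera_per_core:.1f} GB/s per core")
print(f"Grace: {grace_per_core:.1f} GB/s per core")
```

Under that channel assumption, a DDR5-6400 socket peaks at roughly half of Vera's quoted 1.2 TB/s, and Vera roughly doubles Grace's per-core bandwidth.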

This is where the AI workload story lives. Vera was designed for agentic AI: coordinating tool calls, running parallel workloads, and managing databases alongside GPU inference. NVIDIA routes CPU and GPU traffic over NVLink-C2C, creating a unified memory space that avoids shuffling data across PCIe. For local inference and small-to-medium serving clusters, that architecture cuts data-movement overhead. A model that fits in the unified memory spends less time stalled on host-to-device transfers. With a 1.2 TB/s bandwidth ceiling, placing the KV cache for larger context windows in CPU-attached memory is far less likely to bottleneck on the memory controller.
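To make the KV cache point concrete, here is a back-of-the-envelope sketch. The model shape (32 layers, 8 KV heads, head dimension 128, fp16, 32K context) is a hypothetical example of mine, not a configuration from NVIDIA or the benchmark articles. The timing is a lower bound that assumes the full 1.2 TB/s peak is reachable and ignores weight reads and compute.

```python
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len,
                   bytes_per_elem=2, batch=1):
    """Size of the KV cache: one K and one V tensor per layer."""
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_elem * batch

def min_stream_time_ms(num_bytes, bandwidth_bytes_per_s):
    """Lower bound on the time to read num_bytes once, in milliseconds."""
    return num_bytes / bandwidth_bytes_per_s * 1e3

# Hypothetical model shape, chosen only for illustration.
cache = kv_cache_bytes(layers=32, kv_heads=8, head_dim=128, seq_len=32768)

# 1.2 TB/s is the peak figure quoted for Vera; real throughput will be lower.
read_ms = min_stream_time_ms(cache, 1.2e12)

print(f"KV cache: {cache / 1024**3:.1f} GiB")
print(f"One full pass at 1.2 TB/s: at least {read_ms:.2f} ms")
```

For this shape the cache is 4 GiB, and one full read takes at least about 3.6 ms at peak bandwidth. Halve the bandwidth and that floor doubles, which is the sense in which memory bandwidth caps long-context decode speed.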

I would not treat Vera as a drop-in replacement for your existing EPYC or Xeon racks today. This is a first-generation custom core. NVIDIA's server software stack, hypervisor support, and out-of-box compatibility with third-party management tools still lag behind decades of EPYC and Xeon maturation. The immediate availability window is also narrow. Vera ships primarily through NVIDIA's HGX and NVL platform partners, not as a standalone socket you can buy off the shelf. If you need independent CPU scaling outside of NVIDIA's GPU ecosystem, the software friction outweighs the raw bandwidth advantage.

For homelab and small-cluster builders, the real signal is the memory architecture. Getting 1.2 TB/s out of LPDDR5X, a memory type more familiar from laptops and workstations, resets expectations for what server CPU memory can deliver. As NVIDIA pairs Vera with its GPUs in superchip designs, the way it paired Grace with GPUs in GH200 and GB200, the gap between discrete server CPUs and integrated accelerator memory will continue to narrow. Watch for software maturity and partner board availability before committing to an upgrade path. The silicon clears the performance bar. The ecosystem catch-up is just starting.

Sources:

https://www.phoronix.com/news/NVIDIA-Vera-CPU
https://www.tomshardware.com/tech-industry/nvidias-new-server-cpu-doesnt-win-outright-in-most-tests-but-its-running-very-close-to-amds-epyc-which-is-incredible-for-a-first-generation-custom-server-core-from-nvidia
https://www.nvidia.com/en-us/data-center/grace-cpu