Earlier this year, Google announced Axion as their first Arm-based CPU for Google Cloud. Today they are already bringing Axion to general availability with the new C4A instances. These new C4A instances are advertised as offering up to 50% better performance and up to 60% better energy efficiency than comparable current-generation x86-based instances. This article provides some of the first public, independent performance benchmarks of the Google Axion CPU, along with a comparison against existing GCE Arm and x86_64 instance types.
Google was kind enough to grant me free access to the C4A instances in the Google Cloud / Google Compute Engine over the past few weeks to run some benchmarks of their first internal Arm data center processor. The first generation of Google Axion processors use Arm Neoverse-V2 cores with up to 72 cores per processor.
The Armv9-based Axion processors enable SVE2, BTI, BF16, I8MM, PAC, and PMU among their more prominent additional features. Google promotes their Axion C4A instances as great for general-purpose workloads, containerized microservices, open source databases/in-memory stores, data analytics, CPU-based AI inference, and similar workloads.
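For those wanting to double-check these CPU features from within a guest, below is a minimal Python sketch of my own (assuming a Linux aarch64 guest) that reads the feature flags out of /proc/cpuinfo. The flag names checked, such as "paca"/"pacg" for pointer authentication, follow the usual Linux hwcap naming and are an assumption about how the kernel reports them; PMU availability is not surfaced through this interface.

```python
# Minimal sketch: report whether the Armv9 features mentioned above are
# advertised by the kernel on an aarch64 Linux guest via /proc/cpuinfo.
from pathlib import Path

def cpu_features() -> set[str]:
    """Return the feature flags from the first 'Features' line of /proc/cpuinfo."""
    for line in Path("/proc/cpuinfo").read_text().splitlines():
        if line.startswith("Features"):
            return set(line.split(":", 1)[1].split())
    return set()

if __name__ == "__main__":
    flags = cpu_features()
    # "paca"/"pacg" cover pointer authentication (PAC); flag names are assumptions.
    for feature in ("sve2", "bti", "bf16", "i8mm", "paca", "pacg"):
        print(f"{feature}: {'yes' if feature in flags else 'no'}")
```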
In addition to the new Axion processors, the C4A instances feature local SSD storage, networking up to 100G, Titanium networking and storage offloads, and various sizes from 1 to 72 vCPUs. Both standard and high-mem instances are available, with the high-mem instances providing 8 GB of RAM per vCPU rather than the 4 GB per vCPU of the standard sizes.
The Google C4A instances are supported by all major ARM64 enterprise Linux distributions, from RHEL to SUSE, Ubuntu, Rocky Linux and others. For my testing, I used Ubuntu 24.04 LTS for all tested instance types.
Google’s Axion is following the path of Amazon Graviton and Microsoft Azure Cobalt for the public clouds and hyperscalers by coming up with their own in-house Arm processor designs. Google Compute Engine has offered Arm-based instances with Google Tau VMs powered by Ampere Altra, but now they’ve taken their Arm processor needs in-house with Axion.
Considering previous Neoverse-V2 testing at Phoronix with the likes of Graviton4 and NVIDIA GH200 Grace, it was a given that it was going to be quite a performant experience… For these launch day tests, I compared the standard-memory Google C4A 48 vCPU instance against other GCE 48 vCPU instances, including the C4 instance with Intel Xeon Emerald Rapids (Xeon Platinum 8581C) and the Tau T2A Ampere Altra 48 vCPU instance. Each tested 48 vCPU instance had 180~192 GB of memory and was tested using Ubuntu 24.04 LTS with the Linux 6.8 kernel. No AMD EPYC instances were tested in this comparison because Google Compute Engine does not currently offer a current-generation AMD "C4"-class instance type. And because Google only covered free access for the C4A instance types, the number of instances tested outside of C4A was kept limited to control costs given today's very challenging environment for web publishers.
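As a quick sanity check before benchmarking, here is a small sketch (again assuming a Linux guest) for confirming the vCPU count and how much memory the guest actually sees; the latter is why the 48 vCPU instances report roughly 180~192 GB rather than one exact figure.

```python
# Minimal sketch: confirm the vCPU count and guest-visible memory of an
# instance before benchmarking. MemTotal in /proc/meminfo is reported in kB.
import os

def mem_total_gib() -> float:
    """Read MemTotal from /proc/meminfo and convert from kB to GiB."""
    with open("/proc/meminfo") as f:
        for line in f:
            if line.startswith("MemTotal:"):
                return int(line.split()[1]) / 1024 / 1024
    raise RuntimeError("MemTotal not found in /proc/meminfo")

if __name__ == "__main__":
    print(f"vCPUs: {os.cpu_count()}")
    print(f"Memory: {mem_total_gib():.1f} GiB")
```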
As for Axion's pricing, the T2A Ampere Altra 48 vCPU size with 192 GB of memory runs $1,349.04 per month or about $1.85 per hour, while the new C4A instance type with 48 vCPUs and 192 GB of memory is $1,573.30 per month or approximately $2.16 per hour. So Axion costs more than the legacy Ampere Altra instances but much less than the comparable C4 Intel instance type. Prices are based on US-Central1 rates at the time of writing. Within the benchmark results shown in this article, you will also find performance-per-dollar metrics.
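For those curious about the arithmetic, the hourly figures above simply assume GCE's usual 730-hour billing month, and one straightforward way to derive a performance-per-dollar number is dividing a higher-is-better benchmark score by the hourly cost. The sketch below illustrates that math; the instance labels in the dictionary are just descriptive strings, not official GCE SKU names.

```python
# Minimal sketch of the pricing arithmetic: convert GCE monthly list prices to
# approximate hourly rates (assuming a 730-hour month) and define a simple
# performance-per-dollar helper for higher-is-better benchmark scores.
MONTHLY_PRICE_USD = {
    "C4A Axion 48 vCPU / 192 GB": 1573.30,
    "T2A Ampere Altra 48 vCPU / 192 GB": 1349.04,
}

HOURS_PER_MONTH = 730  # GCE's usual monthly billing assumption

def hourly_rate(monthly_usd: float) -> float:
    """Approximate hourly cost from a monthly list price."""
    return monthly_usd / HOURS_PER_MONTH

def perf_per_dollar(score: float, monthly_usd: float) -> float:
    """Higher-is-better benchmark score per dollar of hourly instance cost."""
    return score / hourly_rate(monthly_usd)

if __name__ == "__main__":
    for name, monthly in MONTHLY_PRICE_USD.items():
        print(f"{name}: ${hourly_rate(monthly):.2f}/hour")
```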
Because the Google Axion instances (and the other tested GCE VMs) do not expose CPU power metrics that can be systematically queried, there is no CPU power consumption / performance-per-Watt data to share in this article. So the focus of my launch day testing for the Google C4A Axion instance family is on raw performance and performance per dollar against the C4 Intel and T2A Ampere Altra instances in Google Cloud. For those curious how Google Axion compares to AWS Graviton4, that will come in a separate article on Phoronix in the coming days.