80 Cores, up to 3.3 GHz at 250 W; 128 Cores in Q4

With the advent of higher performance Arm-based cloud computing, a lot of focus is being put on what the various competitors can do in this space. We’ve covered Ampere Computing’s previous eMag products, which actually came from the acquisition of Applied Micro, but the next generation hardware is called Altra. After a few months of teasing some high performance compute, the company is finally announcing its product list, as well as an upcoming product due for sampling this year.

Ampere’s Altra is a realized version of Arm’s Neoverse N1 enterprise core, much like Amazon’s Graviton2, but this time in an 80-core arrangement. Where Graviton2 is designed to suit Amazon’s needs for Arm-based instances, Ampere’s goal is essentially to supply a better-than-Graviton2 solution to the rest of the big cloud service providers (CSPs). Of the companies that have committed to an N1-based design, Ampere’s is so far, on paper, the biggest and fastest to be announced publicly.

The Ampere Altra range, as part of today’s release, will offer parts from 32 cores up to 80 cores, at up to 3.3 GHz, with a variety of TDPs up to 250 W. As we’ve described in our previous news items on the chip, this is an Arm v8.2 core with a few v8.3 and v8.5 features. It offers support for FP16 and INT8, supports 8 channels of DDR4-3200 ECC at 2 DIMMs per channel, and takes up to 4 TiB of memory per socket in a 1P or 2P configuration. Each CPU will offer 128 PCIe 4.0 lanes, 32 of which can be used for socket-to-socket communications implemented with the CCIX protocol over PCIe. This means 50 GB/s in each direction between sockets, and 192 PCIe 4.0 lanes left in a dual socket system for add-in cards. PCIe links can bifurcate down to x2.
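
The platform numbers above can be cross-checked with a little arithmetic. The sketch below is only that: the constants are the figures quoted in this article, the function names are made up for illustration, and the peak memory bandwidth assumes a standard 64-bit (8-byte) DDR4 data path per channel, which Ampere did not spell out in the briefing.

```python
# Figures quoted in the article for one Altra socket.
LANES_PER_SOCKET = 128
LANES_FOR_SOCKET_LINK = 32   # lanes that can carry CCIX between two sockets
MEM_CHANNELS = 8
DIMMS_PER_CHANNEL = 2
MAX_MEMORY_TIB = 4
DDR4_MEGATRANSFERS = 3200    # DDR4-3200
BYTES_PER_TRANSFER = 8       # assumption: 64-bit data bus per channel, ECC bits excluded


def lanes_for_add_in_cards(sockets: int) -> int:
    """PCIe 4.0 lanes left for add-in cards in a 1P or 2P system."""
    if sockets == 1:
        return LANES_PER_SOCKET  # no socket-to-socket link needed
    if sockets == 2:
        return 2 * (LANES_PER_SOCKET - LANES_FOR_SOCKET_LINK)
    raise ValueError("Altra is described as 1P or 2P only")


def dimm_slots_per_socket() -> int:
    """Total DIMM slots per socket."""
    return MEM_CHANNELS * DIMMS_PER_CHANNEL


def gib_per_dimm_for_max_memory() -> float:
    """DIMM size needed to reach the quoted maximum capacity."""
    return MAX_MEMORY_TIB * 1024 / dimm_slots_per_socket()


def peak_memory_bandwidth_gb_s() -> float:
    """Theoretical peak DRAM bandwidth per socket, in GB/s."""
    return MEM_CHANNELS * DDR4_MEGATRANSFERS * BYTES_PER_TRANSFER / 1000


if __name__ == "__main__":
    print("2P add-in lanes:", lanes_for_add_in_cards(2))
    print("DIMM slots per socket:", dimm_slots_per_socket())
    print("GiB per DIMM for 4 TiB:", gib_per_dimm_for_max_memory())
    print("Peak DRAM bandwidth (GB/s):", peak_memory_bandwidth_gb_s())
```

Under those assumptions, hitting 4 TiB per socket would need 256 GiB modules in all 16 slots, and the theoretical DRAM peak works out to roughly 205 GB/s per socket.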

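Ampere 1st Gen Altra 'QuickSilver' Product List

| SKU | Cores | Frequency | TDP | PCIe | DDR4 | Price |
|---|---|---|---|---|---|---|
| Q80-33 | 80 | 3.3 GHz | 250 W | 128x G4 | 8 x 3200 | ? |
| Q80-30 | 80 | 3.0 GHz | 210 W | 128x G4 | 8 x 3200 | ? |
| Q80-26 | 80 | 2.6 GHz | 175 W | 128x G4 | 8 x 3200 | ? |
| Q80-23 | 80 | 2.3 GHz | 150 W | 128x G4 | 8 x 3200 | ? |
| Q72-30 | 72 | 3.0 GHz | 195 W | 128x G4 | 8 x 3200 | ? |
| Q64-33 | 64 | 3.3 GHz | 220 W | 128x G4 | 8 x 3200 | ? |
| Q64-30 | 64 | 3.0 GHz | 180 W | 128x G4 | 8 x 3200 | ? |
| Q64-26 | 64 | 2.6 GHz | 125 W | 128x G4 | 8 x 3200 | ? |
| Q64-24 | 64 | 2.4 GHz | 95 W | 128x G4 | 8 x 3200 | ? |
| Q48-22 | 48 | 2.2 GHz | 85 W | 128x G4 | 8 x 3200 | ? |
| Q32-17* | 32 | 1.7 GHz | 58 W | 128x G4 | 8 x 3200 | ? |
| Q32-17 | 32 | 1.7 GHz | 45 W | 128x G4 | 8 x 3200 | ? |

*With 4 TiB DRAM Installed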

I must credit Ampere here. This is by far the easiest product naming scheme I’ve ever seen. Intel could learn a million things from this naming scheme alone. The ‘Q’ stands for QuickSilver, the codename of the underlying SoC, and is followed by the core count and then the frequency in tenths of a GHz, so the Q80-33 is 80 cores at 3.3 GHz.
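
To show just how mechanical the scheme is, here is a minimal sketch of a parser for it. The `AltraSku` type and `parse_sku` function are hypothetical names of our own, and the rule that the last two digits are tenths of a GHz is inferred from the product list rather than from any Ampere documentation.

```python
import re
from typing import NamedTuple


class AltraSku(NamedTuple):
    """Hypothetical container for the fields encoded in an Altra SKU name."""
    cores: int
    ghz: float


# 'Q' for QuickSilver, then the core count, a dash, and two digits of frequency.
_SKU_PATTERN = re.compile(r"^Q(\d+)-(\d{2})$")


def parse_sku(name: str) -> AltraSku:
    """Split a name such as 'Q80-33' into core count and frequency in GHz."""
    match = _SKU_PATTERN.match(name.strip())
    if match is None:
        raise ValueError(f"not a QuickSilver SKU name: {name!r}")
    cores = int(match.group(1))
    # Assumption: the two trailing digits are tenths of a GHz (33 -> 3.3 GHz).
    ghz = int(match.group(2)) / 10
    return AltraSku(cores=cores, ghz=ghz)


if __name__ == "__main__":
    for sku in ("Q80-33", "Q64-24", "Q32-17"):
        print(sku, parse_sku(sku))
```

Note that the name alone does not carry the TDP, which is why the list has two Q32-17 entries at different power levels.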

Previously Ampere had stated they were going for 80 cores at 3.0 GHz at 210 W; however, the Q80-33 pushes that frequency another 300 MHz for another 40 W. We understand that the silicon that came back from TSMC performed better than expected, hence this new top processor.

It’s worth doing some basic metrics on power efficiency. If we take the TDP as solely the power for the cores, work out Watts per core, and then divide the frequency by that figure (GHz per Watt-per-core), the top Q80-33 SKU scores 1.06, around the middle of the pack (most CPUs score 0.95-1.25). The highlight of the list by this metric is the Q64-24, offering the most frequency for the least power with a score of 1.62.
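
For anyone who wants to reproduce that math, the sketch below runs the metric over the product list. It is a back-of-the-envelope calculation with the same simplification as the text (the whole TDP is attributed to the cores); the function names are ours, and the 58 W Q32-17 entry is labeled with an asterisk to keep the two Q32-17 rows apart.

```python
# (name, cores, frequency in GHz, TDP in W), copied from the product list.
SKUS = [
    ("Q80-33", 80, 3.3, 250),
    ("Q80-30", 80, 3.0, 210),
    ("Q80-26", 80, 2.6, 175),
    ("Q80-23", 80, 2.3, 150),
    ("Q72-30", 72, 3.0, 195),
    ("Q64-33", 64, 3.3, 220),
    ("Q64-30", 64, 3.0, 180),
    ("Q64-26", 64, 2.6, 125),
    ("Q64-24", 64, 2.4, 95),
    ("Q48-22", 48, 2.2, 85),
    ("Q32-17*", 32, 1.7, 58),  # the entry with 4 TiB DRAM installed
    ("Q32-17", 32, 1.7, 45),
]


def watts_per_core(cores: int, tdp_w: float) -> float:
    """TDP divided evenly across cores (ignores uncore, memory and I/O power)."""
    return tdp_w / cores


def ghz_per_watt_per_core(cores: int, ghz: float, tdp_w: float) -> float:
    """Frequency divided by per-core power; higher means more clock per Watt."""
    return ghz / watts_per_core(cores, tdp_w)


scores = {
    name: round(ghz_per_watt_per_core(cores, ghz, tdp), 2)
    for name, cores, ghz, tdp in SKUS
}
best = max(scores, key=scores.get)

if __name__ == "__main__":
    for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{name:8s} {score:.2f}")
    print("Most frequency per Watt-per-core:", best)
```

Because TDP also covers the mesh, memory controllers and I/O, this is a rough ranking tool rather than a measurement of core efficiency.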

Also, just because we have the numbers, AMD’s big Rome CPUs consume about 3 W per core at full load, and run at approximately 3.0 GHz on all cores. Altra, by comparison, uses 2.6 W per core on the 3.0 GHz Q80-30. These Altra CPUs have no turbo mechanism, and thus the TDP figures being given by Ampere are the literal peak power consumption numbers. What is listed above is therefore a design point for chassis building, rather than a full representation of power consumption when deployed in the cloud.

Ampere states they have a number of ODMs on board that will be ready to provide Altra systems, including Gigabyte and Wiwynn, with a couple of second tier players also in the mix. These systems should be more readily available in August and September.

When we asked Ampere about the interest in these chips, the company stated that most of the interest from CSPs was actually in high-end dual socket deployments, with the highest core counts and the highest frequencies. Even though Ampere isn’t announcing pricing publicly, the company states that its pricing has not been an obstacle for CSP deployments, with major customers testing the hardware for up to 2 months already. Current announced customers include Packet and CloudFlare, with Packet offering early access for its key clients.

Ampere is also one of the lead partners for CUDA on Arm, and is set to offer full CUDA support for Altra when paired with NVIDIA graphics accelerators.

Altra Max

If that wasn’t enough, Ampere dropped a sizeable nugget into our pre-announcement briefing. The company is set to launch a 128-core version of Altra later this year. 

This will be a new silicon design, beyond Ampere’s initial 80-core layout for Altra. Ampere states that while Altra Max uses the same platform as the regular Altra, the company has done extensive tweaking and optimization within the mesh interconnect to hide the additional contention that might occur with 128 cores running on the same main memory speeds.

Altra Max will be socket- and pin-compatible with Altra, and will also support dual socket deployments. Ampere states that the silicon will be ready for early sampling with partners in Q4, and is looking to move into high volume in mid-2021. The 128-core design was given the code-name Mystique, and so we might expect to see these CPUs start with the letter M.

Update on 5nm Siryn

The next generation of Ampere’s product line, as previously reported, is going to use the codename Siryn (sire-inn) and be built on TSMC’s 5nm process, set for sampling in late 2021. Ampere stated in our briefing that test chips that use IP meant to be adopted in Siryn have already taped out - the actual Siryn chip will tape out sometime in the next year.

Siryn will likely be marketed as ‘2nd Generation Altra’, and if the naming convention of CPUs stays the same, these will start with ‘S80’ etc. Ampere has stated that the Siryn platform will be new, especially because of new technologies (PCIe 5.0 and DDR5 were mentioned, but not confirmed for Siryn).
