ARM’s new processors designed to power the machine-learning machines
On the eve of Computex, Taiwan’s big showpiece event where PC makers roll out their latest and best implementations of Intel CPUs, mobile rival ARM is announcing
its own big news with the unveiling of a new generation of ARM CPUs
and GPUs.
Official today, the ARM Cortex-A75 is the new flagship-tier
mobile processor design, with a claimed 22 percent improvement in performance
over the incumbent A73. It’s joined by the new Cortex-A55, which has the highest power
efficiency of any mid-range CPU ARM’s ever designed, and the Mali-G72 graphics processor, which also comes
with a 25 percent improvement in efficiency relative to its predecessor G71.
The efficiency improvements are evolutionary and predictable, but the revolutionary
aspects of this new lineup relate to artificial intelligence: this is the first
set of processing components designed specifically to tackle the challenges of
onboard AI and machine learning. Plus, last year’s updates to improve
performance in the power-hungry tasks of augmented and virtual reality are being
extended and elaborated.
Before we dive into the detail of this year’s changes, it’s worth recapping what ARM
does and why it’s important. This English company, now owned by Japan’s SoftBank, is responsible
for designing the processor architecture of practically every mobile device —
you’ll have heard of Qualcomm’s Snapdragon, Samsung’s Exynos, and Apple’s
A-series of mobile chips, all of which are built using ARM’s instruction sets
and based on ARM’s design blueprints. When we talk about the oncoming wave of
mobile AI, mobile VR, and smartphones that can perform machine-learning tasks
without sending them off to processor farms up in the cloud, developing the
capabilities for those tasks starts with ARM.
The new Cortex-A75 and A55 are the first Dynamiq CPUs from ARM. Dynamiq is the
branding chosen to describe a much more flexible set of design options for
silicon vendors like Qualcomm. Where previously ARM allowed for designs that
paired a cluster of so-called big CPUs (from its A7x class) and a matched
number of little CPUs (from the A5x series), the new design makes it possible
to spec a single mixed cluster composed of both big and little CPUs, to a
maximum of eight. Chip makers can now pair, for example, seven little A55
cores with just one big A75 core for a favorable mix of long battery life, cost
efficiency, and a high ceiling of single-threaded performance when it’s called
for.
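To make that flexibility concrete, here is a minimal Python sketch that enumerates the big/little core mixes a single cluster could hold. It assumes only the eight-core ceiling described above; the function name `cluster_mixes` is hypothetical, and real Dynamiq designs may carry further constraints that this article doesn't spell out.

```python
from itertools import product

MAX_CORES = 8  # per-cluster ceiling as reported in this article


def cluster_mixes(max_cores=MAX_CORES):
    """Return every (big, little) core count whose total is 1..max_cores.

    Assumption for illustration: the only rule is the total core limit.
    """
    return [
        (big, little)
        for big, little in product(range(max_cores + 1), repeat=2)
        if 1 <= big + little <= max_cores
    ]


mixes = cluster_mixes()
print(len(mixes))       # 44 possible mixes under this simplified rule
print((1, 7) in mixes)  # True: one big A75 plus seven little A55 cores
```

Under this simplified rule, the one-plus-seven example from the text is just one of 44 possible mixes.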
ARM marketing chief John Ronco says he anticipates a "50x improvement in AI
performance over the next three to five years thanks to better architecture,
micro-architecture, and software optimizations." ARM’s Dynamiq changes
include a redesigned memory subsystem and tweaks to how CPU caches work, which
have led to a doubling of memory streaming performance on the A55 relative to
the A53 preceding it. Given that the A53 has shipped on 1.7 billion devices
over the past three years, it’s truly the A55 that will make the biggest
difference in achieving Ronco’s ambitious forecast. In most applications, the
new mid-range core will be 10 to 30 percent better than previously, offering up
to 15 percent better power efficiency and 18 percent better single-thread
performance. But it’s the fact that the new chip designs will be 10 times more
configurable, with up to 3,000 different configurations, that will allow
chipmakers far greater flexibility to make the most of them by tailoring them
to specific tasks.
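As a rough sanity check on Ronco's forecast, the sketch below converts a 50x gain over three or five years into the equivalent compound yearly factor. The helper name `annual_factor` is purely illustrative, and the even year-over-year growth it assumes is a simplification rather than anything ARM has claimed.

```python
def annual_factor(total_gain, years):
    """Yearly multiplier that compounds to total_gain after `years` years.

    Assumption for illustration: growth is spread evenly across the years.
    """
    return total_gain ** (1.0 / years)


for years in (3, 5):
    print(f"{years} years: about {annual_factor(50, years):.2f}x per year")
```

Under that assumption, 50x in three years works out to roughly 3.7x per year, and 50x in five years to roughly 2.2x per year.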
Interestingly, ARM won’t just be powering machine learning with its new chips; it’ll benefit
from ML too. The new designs benefit from an improved branch predictor that
uses neural network algorithms to improve data prefetching and overall
performance.
The Cortex-A75 makes double-figure performance improvements across the board, with
ARM claiming it’s on average 22 percent better than the A73, with 16 percent
higher memory throughput, and a 34 percent improvement in its Geekbench score.
Single-threaded performance, according to ARM’s Ronco, is up by 20 percent,
purely by improving the instructions-per-clock efficiency. The A75 chip is
roughly 2.5x the size of the A55, and its intended uses are for infrastructure,
automotive, and rich mobile applications. Yes, that means VR, AR, and
high-fidelity games, the latter of which ARM’s research has shown have been
rapidly increasing in popularity.
A major architectural change with the A75 is the opening of a larger power envelope
for chips using this core, scaling up to 2W of power consumption, and thus
offering up to 30 percent of extra performance on larger-screen devices. This
is entirely targeted at the upcoming Windows on ARM reboot, expected later this year. It’s worth noting that in
ARM’s world a "large" screen basically amounts to a laptop — and the
company set up a dedicated Large Screen Compute division a year and a half ago
to more aggressively target the clamshell devices that Intel has been dominant
in.
As for the new Mali GPU, it has 32 shader cores, 25 percent higher energy
efficiency, and a 20 percent better performance density (aka performance per
mm² of space). The Mali-G72 is at the heart of ARM’s push toward improving
machine learning efficiency, and ARM claims it’s showing itself to be 17
percent better than the G71 in ML benchmarks. The design optimizations from the
company are tailored to accelerate inference engines rather than training
engines — that’s to say ARM chips will be best at using accumulated ML
capabilities rather than developing them, which makes perfect sense for mobile
applications. Training AI will be a task better left to Nvidia and AMD graphics cards or Google’s custom Tensor Processing Units (TPUs).
The Cortex-A75 and A55 designs were released to ARM’s partners at the end of 2016,
so by this point, they’ve all had a few months to decide what to do with them.
ARM says a "realistic time window" for new mobile devices powered by
its latest designs would be the first quarter of 2018 — though the company is
also conscious of a new phenomenon it describes as "China speed,"
where Chinese phone vendors will put its designs into products almost
immediately. The Huawei Mate 9, for example, was released just eight months
after ARM distributed the Mali-G71 to partners. This faster Chinese cadence
could lead to some A75- and A55-based designs this year, but the bulk of them
are likely to arrive with the usual smartphone refresh cycle early next year.