ARM Cortex-X1 Mega-core Evolution
Last year, ARM officially released its Cortex-A78 and Cortex-X1 architectures; the former is positioned as a large core, the latter as a mega-core. Cortex-X is ARM’s new high-performance core series, and its first product, the Cortex-X1, delivers 30% higher performance than the A77, 22% higher than the A78, and a 100% improvement in machine learning capability.
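As a quick sanity check on the figures above, the two percentages imply a rough gap between the A78 and the A77. The sketch below assumes both figures were measured under the same conditions, which the announcement does not state; the variable names are only illustrative.

```python
# Quoted figures: X1 is 30% faster than the A77 and 22% faster than the A78.
x1_vs_a77 = 1.30
x1_vs_a78 = 1.22

# Assumption: both ratios share the same test conditions, so they can be divided.
a78_vs_a77 = x1_vs_a77 / x1_vs_a78

print(f"Implied A78 gain over A77: {(a78_vs_a77 - 1) * 100:.1f}%")
```

Under that assumption, the A78 comes out roughly 6 to 7% ahead of the A77, so most of the X1's 30% lead comes from the mega-core design itself.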
Cortex-X1 also allows customers to customize the core and build more differentiated features, but this requires customer involvement in the early development stage. Today, Xiaomi shared some detailed highlights of the Cortex-X1 mega-core used in the Snapdragon 888 SoC of the Xiaomi 11.
Xiaomi phones have always been the first to carry the Snapdragon 8 series flagship mobile platform, and this time the Xiaomi 11 is the world’s first phone powered by the Snapdragon 888 flagship processor, bringing the most cutting-edge mobile technology to the majority of consumers.
The most significant enhancement of Snapdragon 888 is the introduction of the ARM Cortex-X1 mega-core architecture, an architecture built purely in pursuit of performance. The X1 is about 2.3 times larger than the A78 large core, and its giant size brings giant energy: its peak performance is 30% higher than that of the previous-generation A77. That is an unprecedented performance jump, truly opening up the era of the mega-core in the Android camp.
The working process of a cell phone processor looks a bit complicated, but it is much simpler to understand if we compare it to a highway transportation system. A transportation task begins with the dispatching of vehicles. The vehicles that are needed are brought from various locations to stand by and wait to be dispatched. The L2 cache, an important part of the processor used to store instructions and data, is equivalent to the parking area for these standby vehicles.
In the processor, this corresponds to the front-end instruction prediction and prefetching stage. The processor first predicts which instructions and data will be needed, reads them from the Level 3 cache or external memory into the Level 2 cache, and then reads them from the Level 2 cache into the Level 1 instruction cache and Level 1 data cache, respectively.
Compared to the Cortex-A78, the Cortex-X1 directly doubles the L2 cache capacity. The increase in L2 cache capacity means that more instructions and data can be prefetched for backup, resulting in a higher instruction and data prediction hit rate and a lower impact on execution efficiency from re-reading resources due to prediction errors.
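The effect of doubling cache capacity can be illustrated with a toy simulation. This is a minimal sketch, not a model of the Cortex-X1: the cache sizes (512 and 1024 lines), the looping workload of 768 lines, and the fully associative LRU replacement policy are all assumptions chosen to make the effect easy to see.

```python
from collections import OrderedDict

def lru_hit_rate(trace, capacity):
    """Replay cache-line addresses through a fully associative LRU cache
    and return the fraction of accesses that hit."""
    cache = OrderedDict()
    hits = 0
    for line in trace:
        if line in cache:
            hits += 1
            cache.move_to_end(line)        # mark as most recently used
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict the least recently used line
            cache[line] = True
    return hits / len(trace)

# Hypothetical workload: loop 10 times over a working set of 768 cache lines.
WORKING_SET = 768
trace = list(range(WORKING_SET)) * 10

small = lru_hit_rate(trace, capacity=512)   # toy baseline cache
large = lru_hit_rate(trace, capacity=1024)  # same cache with doubled capacity

print(f"512-line cache hit rate:  {small:.2f}")
print(f"1024-line cache hit rate: {large:.2f}")
```

In this deliberately extreme pattern the working set does not fit in the smaller cache, so every access misses, while the doubled cache misses only on the first pass. Real workloads fall somewhere in between, but the direction of the effect is the same: more of the parked "vehicles" are already on hand when they are called for.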
At the same time, cache bandwidth has also been doubled: the Cortex-X1 doubles the bandwidth of both the Level 1 data cache and the Level 2 cache, preventing bandwidth from becoming a bottleneck for data transfers. This is like turning a two-way two-lane highway into a two-way four-lane highway, greatly increasing traffic capacity and allowing more vehicles to travel unimpeded.
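The highway analogy can be put into numbers with a back-of-the-envelope calculation. The figures below (64-byte cache lines, 32 versus 64 bytes per cycle) are hypothetical values picked for illustration, not published Cortex-X1 specifications.

```python
def cycles_to_transfer(num_bytes, bytes_per_cycle):
    """Cycles needed to move num_bytes over a link of the given width (ceiling division)."""
    return -(-num_bytes // bytes_per_cycle)

LINE_BYTES = 64              # assumed cache-line size
burst = 256 * LINE_BYTES     # a hypothetical burst of 256 cache lines

base_cycles = cycles_to_transfer(burst, bytes_per_cycle=32)     # "two lanes"
doubled_cycles = cycles_to_transfer(burst, bytes_per_cycle=64)  # "four lanes"

print(f"Baseline bandwidth: {base_cycles} cycles")
print(f"Doubled bandwidth:  {doubled_cycles} cycles")
```

With twice the bytes moved per cycle, the same burst completes in half the cycles, which is the sense in which the wider road keeps traffic from backing up.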