ARM Cortex-X1 Mega-core Evolution
Last year, ARM officially released its Cortex-A78 and Cortex-X1 architectures, positioning the former as a large core and the latter as a mega-core. Cortex-X is ARM’s new high-performance core series, and its first product, Cortex-X1, delivers 30% higher performance than the A77, 22% higher than the A78, and a 100% improvement in machine learning capability.
Cortex-X1 also allows customers to customize the core and build in more differentiated features, but this requires the customer to be involved at an early stage of development. Today, Xiaomi shared some detailed highlights of the Cortex-X1 mega-core used in the Snapdragon 888 SoC on the Xiaomi 11.
Xiaomi phones have always been the first to carry the Snapdragon 8 series flagship mobile platform, and this time the Xiaomi 11 is the world’s first flagship phone to carry the Snapdragon 888 processor, bringing cutting-edge mobile technology to a broad range of consumers.
The most significant enhancement of Snapdragon 888 is the introduction of the ARM Cortex-X1 mega-core architecture, which is designed purely in pursuit of performance. X1 is about 2.3 times larger than the A78 large core, and that size brings matching power: its peak performance is 30% higher than the previous-generation A77, an unprecedented leap that truly opens the mega-core era in the Android camp.
Instructions read into the first-level instruction cache still need to be decoded into microinstructions (μOP) before the execution units can process them. To improve decoding efficiency, Cortex-X1 increases the number of instruction decoders, giving it 1.25 times the decoding capacity of Cortex-A78. This is like adding toll booths on a highway to raise capacity and reduce congestion.
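As a rough illustration of what a 1.25x decode capacity means, the sketch below counts the cycles needed to decode a straight-line stream of instructions at two decoder widths. The widths 4 and 5 are assumptions picked only because they give the 1.25x ratio quoted above; the function name and the idealized "decoders are always fed" model are hypothetical and ignore cache misses, branches, and stalls.

```python
import math


def decode_cycles(num_instructions: int, decode_width: int) -> int:
    """Cycles to decode a stream when `decode_width` instructions are
    decoded per cycle and the decoders never starve (idealized model)."""
    if decode_width <= 0:
        raise ValueError("decode_width must be positive")
    return math.ceil(num_instructions / decode_width)


# Assumed widths, chosen only to match the 1.25x ratio in the text.
BASELINE_WIDTH = 4  # stand-in for the narrower core
WIDER_WIDTH = 5     # stand-in for the wider core (4 * 1.25)

baseline_cycles = decode_cycles(1000, BASELINE_WIDTH)
wider_cycles = decode_cycles(1000, WIDER_WIDTH)

# Ratio of cycles taken, i.e. the idealized decode speedup.
speedup = baseline_cycles / wider_cycles
print(baseline_cycles, wider_cycles, speedup)
```

In this idealized model the speedup equals the width ratio; on real workloads the gain is smaller, because the decoders are not the only bottleneck.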
After that, the decoded macro instructions (MOP) are sent to the reorder buffer, where they are split into smaller microinstructions (μOP) and wait for centralized scheduling and final execution by the execution units. An instruction that depends on other instructions or data must wait in the reorder buffer until they are ready. Instructions that need to be reused are temporarily stored in the MOP buffer.
The higher decoding performance allows more instructions to be processed in parallel, so the MOP buffer and reorder buffer of Cortex-X1 are significantly enlarged to hold more microinstructions (μOP): the MOP buffer is 100% larger than that of the A78, and the reorder buffer is 1.4 times the size. The MOP buffer and reorder buffer are like a service area on a highway, where a larger service area lets more vehicles wait to be dispatched. As a result, the processor’s ability to process instructions is greatly increased.
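The "service area" effect can be sketched with a toy model of a reorder buffer that retires instructions in program order. While a slow instruction sits at the head of the buffer, younger instructions keep entering until the buffer is full, so a larger buffer lets more long-latency operations overlap. Everything in this sketch is an assumption made for illustration: the baseline of 160 entries, the dispatch width, the rate of slow instructions, and their latency are hypothetical parameters, and only the 1.4x ratio comes from the text above.

```python
from collections import deque


def cycles_to_retire(num_instructions: int, rob_entries: int,
                     dispatch_width: int = 4, slow_every: int = 50,
                     slow_latency: int = 100) -> int:
    """Toy model: cycles needed to retire a stream of independent
    instructions through a reorder buffer with in-order retirement.

    Every `slow_every`-th instruction takes `slow_latency` cycles
    (think of a cache miss); all others take 1 cycle.
    """
    rob = deque()      # completion cycle of each in-flight instruction
    next_index = 0     # next instruction to dispatch
    retired = 0
    cycle = 0
    while retired < num_instructions:
        cycle += 1
        # Retire from the head, in program order, while results are ready.
        while rob and rob[0] <= cycle:
            rob.popleft()
            retired += 1
        # Dispatch new instructions while there is room in the buffer.
        dispatched = 0
        while (dispatched < dispatch_width
               and len(rob) < rob_entries
               and next_index < num_instructions):
            latency = slow_latency if next_index % slow_every == 0 else 1
            rob.append(cycle + latency)
            next_index += 1
            dispatched += 1
    return cycle


BASELINE_ROB = 160                      # hypothetical baseline size
LARGER_ROB = round(BASELINE_ROB * 1.4)  # 1.4x, the ratio from the text

small = cycles_to_retire(10_000, BASELINE_ROB)
large = cycles_to_retire(10_000, LARGER_ROB)
print(f"baseline buffer: {small} cycles, 1.4x buffer: {large} cycles")
```

With these assumed parameters the larger buffer finishes the same stream in fewer cycles, because more slow instructions are in flight at once. The exact figures depend entirely on the made-up workload and should not be read as measurements of either core.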