Huawei Kirin 9000 ISP+NPU Fusion Architecture: The Power Everyone Overlooked
Over the past two years the smartphone SoC field has been full of excitement. In 2020 alone we witnessed the debut of the Huawei Kirin 9000, Apple A14, Qualcomm Snapdragon 888, and other heavyweights in turn, among which the Kirin 9000 is arguably the most distinctive and the most worth studying.
As the world’s first 5nm 5G SoC, the Kirin 9000 achieved breakthroughs in performance, connectivity, AI, imaging, security, and other areas. Imaging stands out in particular: the Huawei Mate 40 Pro+ and Mate 40 Pro that it powers ranked first and second on the DxOMark list with 139 and 136 points respectively.
In the past, when talking about a phone’s imaging capability, we tended to focus on the CMOS sensor and camera configuration and to overlook the ISP (Image Signal Processor) working quietly behind the scenes. In fact, in many scenarios the ISP matters more to a phone’s imaging system than the camera itself.
To use an analogy: if the camera is a soldier who can fight, the ISP is the officer who commands the battle, and without sound command even the strongest soldier is a headless fly. If the camera is the “eye” that sees the world, the ISP is the “brain” that controls everything.
We also keep saying “imaging” rather than “photos” because users expect more and more: not only good photos but also good video. This is, after all, the era of video, and moving pictures are more exciting.
One of the distinctive features of the Kirin 9000 is that it takes the ISP to a whole new level, above all through the industry’s first ISP+NPU fusion architecture, which not only delivers outstanding photos but also gives video capture a new look.
Examples include superb detail reproduction, significant noise reduction, and very high energy efficiency. In dark environments in particular it amounts to an upgraded “night vision”, once again putting Huawei at the forefront and leaving competitors far behind.
It is on this strong foundation that the Huawei Mate 40 Pro series not only tops the rankings for photos but also leads in video capture, taking the top two places on the DxOMark list.
The low-profile model worker: the ISP has too much to do
To understand the subtleties of the Kirin 9000 ISP+NPU fusion architecture, we need to turn back the clock a little and go over a few basic concepts to see how demanding the ISP’s job is.
As we all know, the image sensor is the “eye” with which digital cameras and smartphones take pictures and videos (CCD in cameras, CMOS in phones). The final color and detail depend on it, and it works by sampling and quantizing light at each individual light-sensitive point.
But many people may not know that the image sensor is actually “color blind”: on its own it can only take black-and-white pictures, and it needs a color filter array (CFA) to obtain color information.
In 1976, Bryce Bayer of Kodak invented the RGB CFA, which can be understood as a two-layer structure. The upper layer of red (R), green (G), and blue (B) blocks is the color filter, which separates white light into the three primary colors. The lower layer of gray blocks is the light-sensitive photodiode (PD) part, which converts the light passed by the filter into an electrical signal; that signal is then processed by various downstream algorithms to form the final image.
It can be said that the performance of the filter is the basis for color and detail reproduction in photos and videos. Of course, getting from the filtered RGB values to a striking final photo requires a variety of complex algorithms and processing steps, three of which play a decisive role.
The first is the demosaic interpolation algorithm.
Each pixel in an RGB Bayer array collects only one color channel. The other two must be calculated by an interpolation algorithm from neighboring pixels of the other colors, so that every pixel ends up with complete color information.
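The interpolation idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: an RGGB layout, an image of at least 2x2 pixels, and plain averaging of same-color neighbors in a 3x3 window. The function names are hypothetical, and a production ISP uses far more elaborate, edge-aware interpolation.

```python
def bayer_color(y, x):
    """Channel recorded at (y, x), assuming an RGGB Bayer layout."""
    if y % 2 == 0:
        return "R" if x % 2 == 0 else "G"
    return "G" if x % 2 == 0 else "B"


def demosaic(raw):
    """Fill in the two missing channels of every pixel by averaging the
    same-color samples in its 3x3 neighborhood (image must be >= 2x2)."""
    h, w = len(raw), len(raw[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            sums = {"R": 0.0, "G": 0.0, "B": 0.0}
            counts = {"R": 0, "G": 0, "B": 0}
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w:
                        c = bayer_color(ny, nx)
                        sums[c] += raw[ny][nx]
                        counts[c] += 1
            own = bayer_color(y, x)
            # Keep the measured sample as-is; interpolate the other two.
            pixel = {c: raw[y][x] if c == own else sums[c] / counts[c]
                     for c in "RGB"}
            row.append((pixel["R"], pixel["G"], pixel["B"]))
        out.append(row)
    return out
```

For a flat-colored scene, where every R site reads 100, every G site 50, and every B site 20, each output pixel comes back as (100, 50, 20), which is the "complete" color the paragraph above describes.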
The second is automatic white balance (AWB).
Because of color temperature, white does not always appear pure white: it looks yellowish at low color temperatures and bluish at high ones. If it is not calibrated back into balance, colors will be thoroughly distorted; after all, white is the reference for the three primary colors and every other color.
White balance therefore scales the channels so that the R:G:B ratio of a white object is the standard 1:1:1 at any color temperature, yielding an accurate white. There are many white balance algorithms; the most common are the gray world algorithm, the perfect reflector algorithm, and the dynamic threshold algorithm.
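As a minimal sketch of the gray world algorithm named above: it assumes the average of the whole scene should be neutral gray, and scales red and blue so that their means match the green mean. Normalizing to green is a common convention rather than anything specific to Huawei's ISP, and the function names are hypothetical.

```python
def gray_world_gains(pixels):
    """pixels: list of (r, g, b) tuples.
    Returns (gain_r, gain_g, gain_b) that equalize the channel means,
    using the green mean as the reference."""
    n = len(pixels)
    mean_r = sum(p[0] for p in pixels) / n
    mean_g = sum(p[1] for p in pixels) / n
    mean_b = sum(p[2] for p in pixels) / n
    return mean_g / mean_r, 1.0, mean_g / mean_b


def apply_gains(pixels, gains):
    """Multiply every pixel's channels by the per-channel gains."""
    return [tuple(c * g for c, g in zip(p, gains)) for p in pixels]


# A warm (reddish) cast: red reads twice as high as green, blue half as high.
scene = [(200, 100, 50), (100, 50, 25)]
gains = gray_world_gains(scene)        # (0.5, 1.0, 2.0)
balanced = apply_gains(scene, gains)   # every pixel now has R = G = B
```

After correction the gray patches in this toy scene have the 1:1:1 ratio described above. The assumption fails on scenes dominated by one color, which is one reason other algorithms exist.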
The third is the color correction matrix (CCM).
A camera’s image sensor is a mechanical device while the human eye is biological, and the two have very different spectral sensitivity curves; in other words, their RGB response curves do not match.
White balance can only correct white. The accuracy of other colors has to be calibrated by the CCM, which can also be used to adjust color style, that is, to produce the various “filters.”
The principles and calibration process of the CCM are quite complicated, so I won’t expand on them here; the specific algorithms fall broadly into two categories, the model method and the empirical method.
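Although calibrating a CCM is involved, applying one is simple: each output channel is a weighted sum of the three input channels. The sketch below uses made-up coefficients purely for illustration (they are not from any real sensor); each row sums to 1 so that neutral grays left by white balance stay neutral.

```python
# Hypothetical 3x3 color correction matrix; each row sums to 1.0.
CCM = [
    [1.50, -0.30, -0.20],
    [-0.25, 1.60, -0.35],
    [-0.10, -0.40, 1.50],
]


def apply_ccm(rgb, ccm=CCM):
    """Multiply a white-balanced (r, g, b) triple by a 3x3 matrix
    and clip each result to the 8-bit range [0, 255]."""
    out = []
    for row in ccm:
        value = row[0] * rgb[0] + row[1] * rgb[1] + row[2] * rgb[2]
        out.append(min(255.0, max(0.0, value)))
    return tuple(out)
```

With this matrix a gray pixel such as (100, 100, 100) is left essentially unchanged, while a reddish pixel such as (200, 50, 50) becomes more saturated. Swapping in a different matrix is also how a different color "style" would be obtained.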
In the end, whether it is the interpolation algorithm, AWB, or CCM, a powerful ISP is needed to do each of them well, so that the final colors are closer to nature or more pleasing to the eye.
Kirin 990 Series: ISP and NPU First Join Forces, Making the Most of RYYB
Over the years, Huawei has done its best to improve smartphone imaging and in recent years has consistently been at the forefront of the industry. The credit goes not only to the cameras but also to innovation at the ISP level.
In 2015, Huawei completed its first self-developed ISP, which was used in the Kirin 950. Each generation since has been transformed, and the ISP has gradually become the foundation of Huawei phones’ world-leading photography.
By the Kirin 990 series, a new ISP 5.0 was built in, with a 15% increase in throughput, a 15% increase in energy efficiency, and 30% and 20% improvements in photo and video noise reduction respectively. It brought BM3D, a DSLR-class noise reduction technology, to mobile devices for the first time, along with the world’s first dual-domain joint video noise reduction technology.
There is also Huawei’s self-developed Da Vinci architecture NPU (Neural Processing Unit), whose distinctive layout of two big cores plus one tiny core balances high performance with high energy efficiency. With it, the ISP and NPU began to work together to explore a new kind of AI photography.
The Huawei P30 series introduced the RYYB CFA super-sensitive image sensor, in which yellow (Y) replaces the green (G) of the traditional RGGB format. With its wider spectral response it collects more photons, resulting in a 30-40% increase in overall light intake, a better signal-to-noise ratio in dark scenes, and better night photography.
However, RYYB is new and not easy to tame. Traditional ISP interpolation, AWB, and CCM algorithms struggle with the rich color information in the yellow (Y) pixels, and accurate color restoration is very difficult.
For this reason, Huawei on the one hand adopted a balanced design combining RYYB and RGGB sensors in the subsequent Mate 30 series, and on the other hand introduced AI neural-network-based interpolation, AWB, and CCM algorithms and integrated them into the Kirin 990 ISP processing pipeline, adding computational photography to the traditional ISP flow. After training on a large amount of RYYB sensor RAW data, the network can effectively learn the complex mapping between object detail and color components.
It can be said that simply changing the image sensor, without simultaneous innovation in hardware and algorithms, would not only fail to improve the phone’s imaging but would also throw it into chaos.
It is thanks to the powerful NPU that the Kirin 990 steadily improved its support for the new and complex RYYB CFA and unlocked its potential for color processing. Real-time video processing took an especially big step forward, with better low-light detail and color reproduction in 4K video.
Kirin 9000 ISP+NPU fusion, beyond the limits of the human eye
In the latest-generation Kirin 9000, Huawei has gone a step further and implemented the world’s first ISP+NPU fusion architecture. Instead of simply piling on more ISPs as its competitors do, the fusion architecture organically integrates the ISP processing pipeline with the NPU’s matrix computing, which not only makes photography easier but also enables pixel-level processing of real-time video.
For still photos such a fusion architecture is more than sufficient, so we will not dwell on them here and will focus on video processing.
After all, a photo is a single frame, while video is a continuous sequence of still frames. Processing 24 FPS video is equivalent to processing 24 photos every second, and real-time pixel-level processing of video is an unprecedented test for both hardware design and software algorithms.
In traditional ISP video stream processing, limited ISP performance, hardware modules isolated from one another, insufficient processing bandwidth, and other factors mean the ISP can only work through the stream frame by frame, with everything waiting in a queue.
Adding an NPU speeds things up, but processing is still frame by frame and still requires queuing. For example, while the ISP is processing the first frame, the NPU must wait for it to finish before it can take over.
The Kirin 9000 changes all this. It integrates the latest ISP 6.0, which supports quad-pipeline parallelism and raises throughput by 50%, video noise reduction by 48%, and 3A (autofocus, auto exposure, auto white balance) processing capability by 100%; the biggest highlight, though, is the first ISP+NPU fusion architecture.
The ISP+NPU fusion connects data and information. A direct hardware connection integrates the formerly independent NPU computation into the ISP processing pipeline, and together with SmartCache 2.0, a large-capacity, high-bandwidth intelligent cache, it keeps the input and output data flowing continuously. There are no pauses or waits anywhere in the process; data is buffered seamlessly and processed in real time, and efficiency improves by more than one level.
At the same time, the ISP+NPU fusion architecture replaces the traditional frame-by-frame queue by cutting each frame into smaller units. The basic unit of processing is no longer a whole frame but a small slice, which links up the work within each frame and across frames and speeds up data transfer and processing.
In this way, the ISP and NPU together achieve intelligent processing of video streams based on intra-frame blocks.
As shown above, assume that each video frame is split into four slices. The ISP can quickly process the first two and hand them to the NPU for further processing, at which point the ISP can move on to the remaining two slices of the frame and then the first two slices of the next frame, and so on, greatly reducing waiting time and improving processing efficiency.
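The benefit of slicing can be illustrated with a toy two-stage pipeline model. Everything here is an assumption made for illustration: the 16 ms per-frame stage times are invented, work divides evenly across slices, each stage handles one unit at a time, and slices are handed over one at a time rather than in pairs. It is a sketch of the scheduling idea, not a model of the actual Kirin 9000 hardware.

```python
def pipeline_finish_times(num_units, isp_time, npu_time):
    """Two-stage pipeline: every unit goes through the ISP, then the NPU.
    Each stage processes one unit at a time, in order.
    Returns the time at which the NPU finishes each unit."""
    isp_free = 0.0
    npu_free = 0.0
    finish = []
    for _ in range(num_units):
        isp_done = isp_free + isp_time
        isp_free = isp_done
        # The NPU can start only when the unit is ready and the NPU is idle.
        npu_start = max(isp_done, npu_free)
        npu_free = npu_start + npu_time
        finish.append(npu_free)
    return finish


def compare(frames=3, slices=4, isp_ms=16.0, npu_ms=16.0):
    """Return (first-frame latency, whole-frame handoff),
              (first-frame latency, slice handoff),
              (total time, whole-frame handoff),
              (total time, slice handoff), all in milliseconds."""
    per_frame = pipeline_finish_times(frames, isp_ms, npu_ms)
    per_slice = pipeline_finish_times(frames * slices,
                                      isp_ms / slices, npu_ms / slices)
    return per_frame[0], per_slice[slices - 1], per_frame[-1], per_slice[-1]


print(compare())  # (32.0, 20.0, 64.0, 52.0)
```

Under these assumptions the first frame is ready after 20 ms instead of 32 ms, because the NPU starts on the first slice while the ISP is still working on the rest of the frame.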
Of course, how many slices each frame is split into, and how many slices the NPU and ISP handle at a time, are highly flexible. The split can be chosen intelligently according to the data volume and processing difficulty of each frame, and several different algorithms can be applied within the same frame, which allows richer processing effects without compromising pipeline efficiency.
Naturally, this kind of joint processing demands highly efficient cooperation between the ISP and NPU, since the slightest mistake affects the smoothness of the whole flow. With Huawei’s rich experience in ISP development and the powerful AI computing power of its own Da Vinci Architecture 2.0 NPU, everything fits together seamlessly.
Data shows that the Kirin 9000 can complete the computation within 33 ms or even less when processing 4K video. Especially in complex environments that generate massive amounts of data, such as night scenes, it can bring the very high processing efficiency of the ISP+NPU fusion architecture into full play and achieve accurate color reproduction with rich detail.
If this theoretical analysis is a little hard to follow, let’s look at a practical example to get a feel for what the ISP+NPU fusion architecture can do.
As in the motion picture above, the object in the circle is a mini windmill with four long, thin, dark-toned blades, shot in a dark environment.
On the Kirin 9000 platform, however, thanks to the intelligent processing of the ISP+NPU fusion architecture, the windmill blades in the captured video are exceptionally sharp, with clean color, natural transitions, and clear separation from the background, and their motion is perfectly smooth. Even the differing blur at the tips and roots of the blades, caused by their different speeds of movement, is rendered in full; the human eye itself might not capture such realistic and informative detail.
Kirin 9000 ISP+NPU fusion architecture: more work, better power consumption control
So, with the ISP+NPU fusion architecture doing more work, won’t power consumption go up? Here lies another benefit of the fusion architecture: computing performance goes up while power consumption stays well under control, so more work gets done more efficiently.
As we all know, shooting video on a phone consumes a lot of power, far more than still photos, but through several measures the Kirin 9000 has achieved the seemingly magical result of “a horse that runs fast without eating much.”
First, input data is intelligently sliced according to the scenario, which significantly reduces the memory needed for the network’s intermediate layers. Second, slice-level data exchange keeps algorithm latency in check and, combined with SmartCache, effectively controls power consumption in video scenarios.
The difficulty behind the Fusion Architecture is beyond your imagination
You may say that none of this sounds difficult, but innovation at the chip level has never been simple, and the difficulty and technical challenges of ISP+NPU fusion are hard to imagine.
For this reason, while designing the hardware fusion architecture, Huawei also put a great deal of thought and innovation into software algorithms to unlock the hardware’s potential, such as pixel-level AI algorithms for the ISP pipeline, paired with an AI chip with strong computing power, to deliver a complete, energy-efficient on-device solution that integrates software and hardware.
At the same time, the data processed in today’s imaging scenarios is massive and extremely complex, which requires the end-to-end solution to be highly robust. To process large volumes of data efficiently, the model structure must also satisfy certain computational constraints, and techniques such as neural architecture search and hybrid quantization must be used so that the model structure and the acceleration hardware work together efficiently.
In particular, when capturing video in 4K ultra-HD resolution, the amount of image data in the pipeline is several orders of magnitude greater than before. Processing a single 4K frame already places high demands on AI, let alone the real-time multi-frame processing that video requires.
For example, when capturing 4K video at 30fps, the entire ISP pipeline must finish computing each frame within 33 ms, which leaves even less time for the AI algorithm; the processing has to be done in a flash. If computing efficiency and performance fall short, video processing efficiency drops sharply and the user experiences severe stutter, which is unacceptable under any circumstances.
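The 33 ms figure follows directly from the frame rate, and a little arithmetic shows how tight the budget is per pixel. The sketch below assumes 4K means UHD at 3840x2160 pixels, which the text does not state explicitly; the function names are illustrative.

```python
def frame_budget_ms(fps):
    """Time available to process one frame, in milliseconds."""
    return 1000.0 / fps


def pixels_per_second(width, height, fps):
    """Number of pixels the pipeline must handle every second."""
    return width * height * fps


budget = frame_budget_ms(30)                 # about 33.3 ms per frame
rate = pixels_per_second(3840, 2160, 30)     # 248,832,000 pixels per second
ns_per_pixel = budget * 1e6 / (3840 * 2160)  # roughly 4 ns per pixel
```

Under that assumption the whole pipeline, AI algorithm included, has on average about 4 nanoseconds for each pixel, which is why pixel-level processing of real-time 4K video depends on massively parallel hardware.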
Besides, any hardware or algorithm design must take power consumption into account and keep it within a controllable range; otherwise the phone will heat up noticeably. We have all felt a phone get hot when taking photos and videos continuously in everyday use, and that is before such a complex computation is added.
Therefore, to achieve the best image quality and user experience, three things are indispensable: breakthrough algorithm results, fast and efficient computing performance, and well-controlled computing power consumption. The Kirin 9000 ISP+NPU fusion architecture delivers on all three almost perfectly, which is how such striking images finally reach our eyes.
Kirin is strong all the way: the future is promising
In general, Huawei phones have stayed at the forefront of the world in imaging capability over the past few years, holding the top spot again and again. This comes not only from the sophisticated camera system but also from the full support of the Kirin chip, its ISP, and its NPU.
It is thanks to this continuous innovation that we can see ever more clearly through a small phone, record the wonders of the whole world, and keep each unforgettable moment.
Across the whole industry, Apple and Huawei are the only two top giants that have achieved coordinated upgrades from chip to device. While Apple revels in its closed ecosystem, Huawei presents an open world.
Right now, Kirin’s development has run into unprecedented constraints, and a future that once looked limitlessly bright is overshadowed. After years of steady, hard-fought progress, I believe Huawei Kirin has the strength to face any difficulty or obstacle. Looking forward to Kirin’s next stop!