Towering over the field at 10nm: a detailed analysis of the Qualcomm Snapdragon 835


The Snapdragon 820 is a milestone in Qualcomm's SoC lineup. Both its absolute performance and its energy efficiency improved significantly over the Snapdragon 808/810, and it was also a commercial success. More importantly, it embodies Qualcomm's vision and strategy for mobile computing platforms, namely heterogeneous computing:

  1. The greatly enhanced Hexagon 680 DSP (digital signal processor), which has been covered in its own separate introduction, supports Hexagon Vector Extensions (HVX) and takes over the compute load in imaging applications. Tasks such as virtual reality, augmented reality, image processing, video processing, and computer vision, which used to cost more power on the CPU/GPU, can be handed to the DSP for more efficient processing;

  2. It has Kryo, Qualcomm's first self-designed 64-bit CPU core, which focuses on improving floating-point performance;

  3. The upgraded Adreno GPU has stronger ALUs (arithmetic logic units), which not only improve the graphics experience but also enable better artificial intelligence, machine learning (object recognition), imaging optimization in photos and videos, and AR/VR experiences.


Structure of the Snapdragon 835

Building on this concept, the Snapdragon 835 is now positioned as a Qualcomm mobile platform. This SoC has more than 3 billion transistors (for comparison, Apple's A10, launched in September 2016, has 3.3 billion) and is the first product built on Samsung's 10nm process, which reduces the package area by 35% compared with the Snapdragon 820. The new CPU architecture and the X16 LTE baseband are the most important changes; the baseband supports download speeds of up to 1Gbps (Cat. 16). Other parts of the SoC have received correspondingly minor upgrades.

 

Specification comparison of three Snapdragon generations

Since the Snapdragon 800 generation, Qualcomm has made a habit of inviting media to its headquarters in San Diego, California, for feature demonstrations and limited testing. The testing is done on Qualcomm's own prototypes (MDP, Mobile Development Platform), which are used for software and hardware testing. These are fully functional phones or tablets, though somewhat larger than retail phones. The Snapdragon 810 came in a tablet test device, while the 820 test device was a phone with a giant 6.2-inch screen. For the 835 generation the prototype shrank again, to a phone with a 5.5-inch 2K-resolution screen, 6GB of memory, and a 2850mAh battery.

It is a good sign that the prototype keeps getting smaller: the smaller the device, the less surface area is available to absorb and dissipate heat, which suggests that the power consumption of the Snapdragon 835 has been reduced further. Of course, this needs to be confirmed by accurate power testing. Because testing time was limited, we focused on CPU, GPU, and memory performance. Note that we tested a prototype running non-final system software, so the results are for reference only; mass-production devices are likely to differ somewhat.

CPU part

Kryo on the 820 is Qualcomm's first fully self-designed 64-bit CPU. This unique architecture delivers strong floating-point IPC (editor's note: IPC stands for instructions per clock cycle, the number of instructions that can be processed per cycle; roughly speaking, performance = IPC × frequency), but its integer IPC is not as good as that of ARM's earlier A57 architecture, and its energy efficiency is also a little lower. For the 835, rather than using a modified version of the 820's quad-core Kryo architecture, Qualcomm has taken a completely different path.
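The rule of thumb that performance is IPC times frequency can be sketched in a few lines of Python. The two cores below are hypothetical and are not measurements from this article; they only illustrate that a wider core at a lower clock and a narrower core at a higher clock can land at the same first-order throughput.

```python
def relative_performance(ipc: float, freq_ghz: float) -> float:
    """First-order estimate: instructions per cycle times billions of cycles per second."""
    return ipc * freq_ghz


# Hypothetical cores for illustration only (not data from the article).
wide_slow = relative_performance(ipc=2.0, freq_ghz=2.0)
narrow_fast = relative_performance(ipc=1.6, freq_ghz=2.5)

print(f"wide/slow:   {wide_slow:.2f} billion instructions per second")
print(f"narrow/fast: {narrow_fast:.2f} billion instructions per second")
```

This model ignores memory stalls, thermal throttling, and workload mix, which is why the article measures integer, floating-point, and memory behavior separately.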

The new Kryo 280 architecture shares nothing with the original Kryo architecture except its name. The Snapdragon 835 uses an eight-core big.LITTLE design, with four big cores responsible for peak performance and four small cores that pursue energy efficiency. What makes this architecture special is that it is the first product built on ARM's new "Built on ARM Cortex Technology" (BoC) license. The BoC license lets a vendor adjust the architecture to its requirements, in particular the instruction fetch block and the issue queues, but some parts of the architecture are outside the scope of the license. Neither the decoder nor the execution pipelines can be touched, because changing them would be too large an undertaking. Qualcomm did not disclose which ARM architecture it modified, but it did say that both the big and the small CPU cores are semi-custom designs and that the memory controller is Qualcomm's own design.

The Kryo 280 CPU in the Snapdragon 835 significantly improves integer IPC (the number of instructions that can be processed per cycle) compared with the 820/821. That makes sense, since integer computing was never the Kryo core's strength. Although most tests show significant gains, some, such as JPEG, Canny, and Camera, have regressed, which is similar to what we saw from the A73 in the Kirin 960. The results of these integer tests and the L1/L2 cache behavior closely match the distinctive behavior of the A73, so Kryo 280 is probably a modified version of ARM's latest A73 architecture.

 

Geekbench 4 single-thread integer subtest scores

For the specific scores, see our earlier article, "200,000 is not a dream: Snapdragon 835 debuts in Asia". Here we briefly compare the Geekbench 4 integer results of the Snapdragon 835 and the Kirin 960. Their subtest scores are too similar to be explained by clock frequency and normal test variance. Only a few subtests differ, by between -5% and 9%. This shows once again that the results of a semi-custom architecture with minor modifications under the BoC license are predictable.

 

Dividing each score by the clock frequency (table above) gives the score per MHz, which is roughly the performance at equal frequency and makes it easier to compare IPC across architectures. The semi-custom Kryo 280 is not very different from the A73 in the Kirin 960. It is only 6% ahead of the A72, 14% ahead of the A57, and 22% ahead of the previous-generation Snapdragon 820/821 (the Kryo core performs poorly in LLVM and HTML5 DOM, which drags down the 820/821 score). The Snapdragon 835's subtest results do not sweep the other flagships. As with the A73 in the Kirin 960, its subtest results are a mix of gains and losses compared with the previous generation.
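The per-MHz normalization described above is simple arithmetic. The sketch below uses placeholder scores and clocks, not values from the article's table, and the helper names `score_per_mhz` and `percent_ahead` are made up for this illustration.

```python
def score_per_mhz(score: float, freq_mhz: float) -> float:
    """Normalize a benchmark score by clock speed to approximate per-clock performance."""
    return score / freq_mhz


def percent_ahead(a: float, b: float) -> float:
    """How far a is ahead of b, in percent (negative means a is behind)."""
    return (a / b - 1.0) * 100.0


# Placeholder numbers for illustration only (not data from the article).
chip_a = score_per_mhz(score=2000.0, freq_mhz=2500.0)  # higher raw score, higher clock
chip_b = score_per_mhz(score=1800.0, freq_mhz=2000.0)  # lower raw score, lower clock

print(f"chip A: {chip_a:.2f} points/MHz")
print(f"chip B: {chip_b:.2f} points/MHz")
print(f"chip B is {percent_ahead(chip_b, chip_a):.1f}% ahead per MHz")
```

In this made-up case the chip with the lower raw score comes out ahead once frequency is factored out, which is the kind of architectural difference the per-MHz comparison is meant to expose.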


Geekbench 4 single-thread floating-point subtest scores

The Kryo 280 in the Snapdragon 835 regresses markedly in the floating-point tests, which were the 820's original strength, and even loses to the A72. The results are again very similar to those of the A73 in the Kirin 960, down to the individual subtest scores.

The areas where the A73 falls slightly behind the A72 are the same areas where the Kryo 280 falls behind, which is somewhat unexpected, because their NEON execution units are almost unchanged from the A72. If one looks for differences, the A73 has advantages in front-end latency, instruction fetch, and the memory system, but they are not large. The A73's disadvantage in instruction decoding does cost some performance, but it does not have a big overall impact. Compared with the A72, both the A73 in the Kirin 960 and the Kryo 280 in the Snapdragon 835 suffer from lower L2 cache read/write bandwidth and lower L1 cache write bandwidth, which also affects performance.

 

The floating-point IPC of the Snapdragon 835 is 23% lower than that of the 820/821. It is not clear whether this is a compromise or a change in design thinking. When Qualcomm began designing the Kryo core two years earlier, it had to predict future needs that had not yet arrived. Judging from its current strategy, Qualcomm appears to believe that more compute will shift to the GPU and DSP in the future, which improves overall energy efficiency, while "giving up" floating-point performance saves valuable die area and power.

 

Geekbench 4 Single Thread Memory Test Results

Kryo 280, A73, A72, and A57 all have two address generation units (AGUs). In the A72/A57, however, loads and stores are handled by separate AGUs, while each AGU in the Kryo 280 and A73 can perform both loads and stores. On the Kirin 960, this change resulted in lower memory latency and higher memory bandwidth than on the 950.
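A toy model shows why letting both AGUs handle either kind of access helps when the access mix is lopsided. It assumes each AGU issues at most one address per cycle and ignores caches, dependencies, and everything else in a real memory pipeline, so it illustrates the idea rather than describing how any of these cores actually behave.

```python
import math


def cycles_dedicated(loads: int, stores: int) -> int:
    """One load-only AGU and one store-only AGU, one address each per cycle."""
    return max(loads, stores)


def cycles_unified(loads: int, stores: int) -> int:
    """Two AGUs that can each issue either a load or a store every cycle."""
    return math.ceil((loads + stores) / 2)


# Hypothetical access mixes: load-heavy versus perfectly balanced.
for loads, stores in [(90, 10), (50, 50)]:
    print(
        f"{loads} loads / {stores} stores: "
        f"dedicated={cycles_dedicated(loads, stores)} cycles, "
        f"unified={cycles_unified(loads, stores)} cycles"
    )
```

With a balanced mix the two arrangements tie. With a load-heavy mix the dedicated store AGU sits mostly idle while the unified pair keeps both units busy.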

In memory latency and bandwidth, the 835 is even stronger than the 960. Even after the frequency difference is factored out, the gain is still 11%, and the improvement over the 820/821 is huge. However, the bandwidth gain between the two Kryo generations is smaller than the gain Kirin saw when moving from the A72 to the A73, because both AGUs in the previous-generation Kryo could already perform loads and stores, although its latency was higher in some scenarios.

System performance

So far, our impression of the Kryo 280 in the Snapdragon 835 is that it is a modified version of the A53 and A73 architectures. Both its integer and its floating-point IPC are very similar to the Kirin 960's. System-level tests such as PCMark use Android APIs to exercise the CPU, GPU, memory, and NAND storage under realistic workloads. Besides the CPU's IPC and memory latency, however, the results are also affected by the OEM's system software tuning, which controls task priorities and the DVFS (dynamic voltage and frequency scaling) policy to balance performance against battery life.


It is safe to say that, as with other SoCs, Snapdragon 835 scores will vary widely from device to device. The Snapdragon 835 prototype set a new PCMark record: its overall score was slightly higher than that of the Mate 9 with the Kirin 960 and 23% higher than that of the fastest 821 device.

 


In the web browsing test, the Snapdragon 835 prototype performs well but is only 10% ahead of the Mate 9. In this test, which mainly stresses integer performance, it is 34% ahead of the 820/821. It is worth noting that the 820/821, already weak in integer work, ranks behind every A72 and A73 device and cannot even beat the Snapdragon 650.

 

PCMark's writing test is frequency sensitive and depends on the burst performance of the big cores. It simulates opening PDF files and encrypting files (both integer workloads), and it also includes memory tests and even flash read/write tests. The results therefore depend on many variables, and even 820/821 devices differ widely: the LeEco Le Pro3, for example, scores 40% higher than the Galaxy S7 edge. The difference between the Snapdragon 835 prototype and the Mate 9, however, is very small. The Snapdragon 835 still makes significant progress over previous generations: it is 24% faster than the fastest of them, the Le Pro3 (821), 80% faster than the Nexus 6P (810), and 162% faster than the Lenovo ZUK Z1 (Snapdragon 801/8974AC).


 

The PCMark data manipulation test also mainly measures integer performance. It times the processing of a large number of files of different types and records the real-time frame rate while interacting with dynamic charts. The Snapdragon 835 prototype and the Mate 9 are again very close, but this time they pull further ahead of the other devices: the Snapdragon 835 prototype is 28% ahead of the Pixel XL and 111% ahead of the LG G5. As in the writing test, the gap between 820 devices is very large, which shows once again that OEM tuning has a great impact on the user experience.

 

In the video editing test, OpenGL ES 2.0 fragment shaders provide the video effects. This is a very light workload. On most phones the GPU and big cores sit almost idle and only small cores such as the A53 are used, which is why the performance gap is so small.

 

The photo editing test applies a range of image effects and filters, which requires the CPU and GPU to work together. Thanks to the strong ALU (arithmetic logic unit) performance of the Adreno GPU, the Snapdragon 835 prototype and the 820/821 devices perform well. The 835's Adreno 540 GPU is 33% ahead of the Mali-G71 (currently ARM's strongest GPU architecture) in the Mate 9.

 

The iPhone performs very well in the JavaScript tests, but this cannot be used to compare Apple's A-series chips with Android platform chips, because the browsers differ: much of the result is due to the JavaScript engine of the Safari browser. With the Android devices also running the latest version of Chrome, the Snapdragon 835 prototype does well. In Kraken it is no faster than the 820/821 devices. In JetStream it is level with the Mate 9 and 15% to 37% ahead of the 820/821 devices. Its WebXPRT 2015 result was unexpectedly good: 24% ahead of the Mate 9 and 67% ahead of the S7 edge with the 820.

To see how much the software matters, we reran the tests in Qualcomm's internal test browser (optimized specifically for Snapdragon SoCs). The Kraken result improved to 2305ms and JetStream rose by 24% to 87 points (although in both tests it still lags far behind the iPhone), while WebXPRT 2015 jumped 82% to 280 points (finally beating the iPhone and leaving it behind).

GPU part

The Adreno 540 in the Snapdragon 835 and the Adreno 530 in the 820 share the same architecture, but minor optimizations have been made to remove earlier bottlenecks, and the ALUs and registers have been tuned. Improved depth rejection reduces the compute load per pixel, which raises performance and lowers energy consumption.

Qualcomm claims a 25% increase in 3D rendering performance over the 820's Adreno 530. Although there is no official explanation, much of the improvement evidently comes from the 10nm process, which allows the GPU frequency to rise to 710MHz, 14% higher than on the 820.
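As a rough check on how much of the claimed gain the clock alone explains, the 25% and 14% figures can be separated. The sketch assumes that GPU performance scales as per-clock throughput times frequency, so the two gains multiply; that is a simplification of this note, not something Qualcomm has stated.

```python
def per_clock_gain(total_gain: float, freq_gain: float) -> float:
    """Gain left over once the frequency increase is divided out (gains multiply)."""
    return (1.0 + total_gain) / (1.0 + freq_gain) - 1.0


# Figures quoted in the article: 25% claimed overall, 14% higher GPU clock.
residual = per_clock_gain(total_gain=0.25, freq_gain=0.14)
print(f"Gain not explained by frequency: {residual:.1%}")  # prints 9.6%
```

Under that assumption, a little under 10% is left for the architectural tweaks, with the higher clock accounting for the larger share.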

 

The T-Rex test in GFXBench is an older test based on the OpenGL ES 2.0 API. Unlike the newer tests, its results do not track shader performance closely, which is why flagships can hit the 60fps V-Sync limit in the onscreen test. However, the iPhone 7 Plus and Mate 9, which reached 60fps earlier, both have 1080p screens, whereas this Snapdragon 835 prototype is the first 2K-screen device to reach 60fps.

 

In the offscreen test, the Snapdragon 835 test device surpasses the iPhone 7 Plus and Mate 9 and is 25% faster than the Pixel XL (820), close to the increase Qualcomm claims. It is almost twice as fast as the Adreno 430 in the Nexus 6P and 4.5 times as fast as the Adreno 330 of the 801 in the Lenovo ZUK Z1.

 

GFXBench's Car Chase scene uses a modern rendering pipeline built on OpenGL ES 3.1 and the Android Extension Pack (AEP). Like many recent games, it mainly stresses ALU performance. In this test the Snapdragon 835 again gains 25% over the previous generation. More surprisingly, the Adreno 540 is 55% ahead of the Mali-G71MP8 in the Mate 9, which is ARM's latest GPU on the Bifrost architecture and which sustained a very high 960-1037MHz clock during the test.

 

In 3DMark's Sling Shot Extreme scene, Android devices use OpenGL ES 3.1 and iOS devices use the Metal graphics API. This test stresses the GPU and memory system at the same time, and the offscreen resolution is 1440p rather than the 1080p used in other tests. The Snapdragon 835's overall score rises by 30%, which is quite good considering that only 8% separates the Apple A10, Exynos 8890, Kirin 960, and Snapdragon 820/821. In the graphics score, the Snapdragon 835 prototype is 10% ahead of the iPhone 7 Plus and 24% ahead of the 820 and 8890 versions of the S7.

In 3DMark Sling Shot's first graphics test, the Adreno 540, whose architecture has changed little, does not show the great leap in geometry processing that the Adreno 530 did before it. Even so, the Adreno 540 is about 11% ahead of ARM's Mali GPUs, which have always done well in geometry workloads.

In the second graphics test, which focuses on shader performance, the Adreno 540 pulls far ahead: it is 34% faster than the Adreno 530 in the S7 and 50% faster than the Mali-G71 in the Mate 9. Qualcomm's changes to the ALUs and registers have a very clear effect in this test. The physics test runs on the CPU and is directly affected by the random-access performance of the SoC's memory controller. Although CPU performance is similar, the Snapdragon 835 prototype is 14% ahead of the Mate 9, probably because the 835's memory controller is better than the Kirin 960's in latency and bandwidth.


 

The Basemark ES 3.1 test uses OpenGL ES 3.1 on Android and the Metal graphics API on iOS devices, but, unlike the GFXBench 4.0 Car Chase scene, it does not use tessellation. Until Vulkan support arrives at the end of this year, Android devices are held back by OpenGL and cannot hold their own against Apple's Metal graphics API. The API gap puts the iPhone 7 Plus 73% ahead of the Snapdragon 835 prototype.

In the Basemark ES 3.1 workload, ARM's Mali GPUs have the upper hand. In the offscreen test, the Mali-T880MP12 in the Exynos 8890 is 15% faster than the Adreno 530 in the 820, and the Mali-G71MP8 in the Kirin 960 is 25% faster than the Adreno 530 in the Snapdragon 820/821. In this test the Snapdragon 835 prototype is 40% ahead of the Pixel XL, a bigger gain than the 25% seen in other tests.

 

In these game simulation tests the Adreno 540 shows better ALU performance, so we were curious how it would do in GFXBench's synthetic ALU test. Surprisingly, the architecture upgrade brings no improvement here. The 835's gains of 14% over the 820 and 8% over the 821 correspond exactly to the increase in GPU frequency, indicating that the bottleneck in this scenario lies elsewhere. Even so, the result is 32% ahead of the Mate 9.

Power consumption, camera, and iris recognition demonstrations

To establish in the public's mind that Snapdragon is a platform, not just a CPU or a baseband, Qualcomm ran many demonstrations in the labs at its headquarters. At the CES and GDC shows, by contrast, Qualcomm seemed to have little to show: it basically plugged phones into VR/AR devices or used 835-based VR/AR prototypes directly.


In the same VR test, the Snapdragon 835's power consumption was 23% lower than the 820's. Of course, this was Qualcomm's own test setup, and the advantage in real-world use may not be as large.


 

Qualcomm's camera test lab is full of impressive and expensive equipment. Showing it publicly tells the world that Qualcomm's ISP and software improvements are grounded in a large body of test results. Beyond controlling light intensity and color temperature, even the "shaker" used to test the ISP's electronic image stabilization (EIS) is purpose-built: it can be set to different shake patterns and frequencies so that engineers can test the EIS system.

 

An interesting demonstration was iris recognition on the 835 engineering device. It can partially replace fingerprint recognition and is useful when your hands are not free. However, the feature is not yet perfect on the test device: it worked normally for Qualcomm's product managers, but not for the author.


The laboratory also has a demonstration of computer vision recognition, which is no longer a new technology. But like its competitors in the same field, Qualcomm has also benefited from recent breakthroughs in machine learning.

Summary

A mobile SoC includes a CPU, a GPU, a high-performance DSP (for compute), a low-power DSP (coprocessor), a baseband DSP (signal processing), an ISP (photo processing), fixed-function blocks (video and audio), and more. All of these parts affect the user experience, but many of them cannot be quantified.

 

The CPU and GPU are the core performance blocks and have the greatest impact on battery life, so they are the top priority in our testing. The preliminary results indicate that the Snapdragon 835's CPU is a 4+4 big.LITTLE design, made up of four modified A73 cores and four modified A53 cores. Its integer and floating-point IPC is very close to that of the A73 in the Kirin 960. Compared with the Kryo architecture of the 820/821, integer performance is up and floating-point performance is down, but the trade-off pays off on balance, and overall performance is better than that of the 820/821.

 

Qualcomm continues to push VR/AR, not only on phones but also in head-mounted devices. VR needs very strong GPU performance to deliver high resolution and low latency. The 835's Adreno 540 GPU delivers a 25% performance improvement from minor architectural optimizations and a higher clock, which is certainly good news for VR devices.


Again, these results come from engineering devices and cannot represent final mass-production results. Even so, it is almost certain that the 835 will be significantly faster than the 820. However, the biggest impact on the user experience may come not from the somewhat higher performance or the new features but from the power and battery-life improvements brought by the 10nm process.

Charles Fang