When people talk about ranking mobile phone cameras, DxOMark and its "photo ladder" are usually the first things that come to mind. But can a phone's photographic ability really be captured by a single score? How reliable are the rankings and scores that DxOMark and the many other review institutions publish?
Even professional evaluation institutions have blind spots
Founded in 2003, DxOMark is a website that measures and ranks the imaging quality of cameras and lenses against industrial standards. Today it is one of the best-known and most professional camera testing institutions in the world.
It has a custom-built test room lined with matte materials, with the temperature held at 21-25°C and the humidity between 30% and 70%. The on-site light sources are even color-temperature-calibrated with spectral equipment, so the entire test environment is strictly controlled.
They have their own test procedures and calculation formulas, which process the RAW files produced by cameras and yield fairly objective, valuable data. Combined with the openly published test descriptions, the resulting data can describe a camera's performance indicators much like a spec sheet. The prerequisite, of course, is that readers understand the test environment and have some technical grounding.
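To make the "samples into data" idea concrete, here is a minimal sketch of the kind of metric such a pipeline might compute: the signal-to-noise ratio of a uniform patch in a linear RAW frame. The function, the synthetic data, and the simplifications (no black-level subtraction, no Bayer-channel separation) are all our own assumptions for illustration; DxOMark's actual formulas are not public.

```python
import numpy as np

def patch_snr_db(raw: np.ndarray, y: int, x: int, size: int = 64) -> float:
    """SNR (in dB) of a nominally uniform patch cut from a linear RAW
    frame: mean signal over its standard deviation. Real lab pipelines
    also subtract the black level and separate the Bayer channels,
    which this toy version omits."""
    patch = raw[y:y + size, x:x + size].astype(np.float64)
    return 20.0 * np.log10(patch.mean() / patch.std())

# Synthetic 12-bit "RAW" patch with Poisson-like shot noise,
# standing in for a shot of a gray card under controlled light.
rng = np.random.default_rng(0)
fake_raw = rng.poisson(lam=800.0, size=(256, 256)).astype(np.uint16)
print(f"patch SNR ~ {patch_snr_db(fake_raw, 0, 0):.1f} dB")
```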
It doesn't matter if you can't follow all of that; just know that their camera testing is very professional and meticulous. However, in the past few years, as smartphone photo quality has soared, DxOMark has also begun testing and rating phone cameras, and that is where the unsolved mysteries and legends begin:
Why are the photo scores of Sony's phones always incredibly high?
Why can Sony's Z5 outscore Samsung's S6 edge and the iPhone 6s Plus?
Why do the S6 edge+ and the Note5 get different scores?
What is going on? Let's get to the bottom of it.
Judging by a simple score is not feasible
DxOMark's rating table for each phone, shown above, is divided into seven items: exposure and contrast, color, focus, texture and detail, noise control, artifacts, and flash. "Artifacts" covers image-quality degradation caused by the sensor, lens, electronic noise, and algorithms, including purple fringing, jaggies, noise, sharpening halos (white edges caused by over-sharpening), moiré, and similar flaws.
When several phones are compared, the chart above is used to visualize how the phones' strengths and weaknesses differ item by item.
However, domestic media usually quote only DxOMark's total score and never explain the sub-scores behind it, which is how myths get started; for example, that Sony's flagship always shoots among the top few of its generation.
Beyond the media quoting scores out of context, the seven sub-scores, which should carry more reference value, are themselves full of doubts. First, the precise definitions and standards of the sub-items are unclear; second, so are the weights used to combine them. On top of that, the total score also folds in video shooting, which is rarely what people compare.
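To see how much the undisclosed weights matter, consider a toy calculation with two hypothetical phones. All sub-scores and weights below are invented purely for illustration, since DxOMark publishes neither; shifting weight toward texture and noise is enough to flip the ranking.

```python
# Hypothetical sub-scores (0-100) for two made-up phones.
items   = ["exposure", "color", "focus", "texture", "noise", "artifacts", "flash"]
phone_a = [85, 80, 90, 70, 75, 80, 85]  # strong focus, weak texture
phone_b = [80, 85, 70, 90, 85, 75, 70]  # strong texture, weak focus

def total(scores, weights):
    """Weighted average of the sub-scores."""
    return sum(s * w for s, w in zip(scores, weights)) / sum(weights)

equal         = [1, 1, 1, 1, 1, 1, 1]
texture_heavy = [1, 1, 1, 3, 2, 1, 1]  # emphasize texture and noise

print(f"equal weights:  A={total(phone_a, equal):.1f}  B={total(phone_b, equal):.1f}")          # A wins
print(f"texture-heavy:  A={total(phone_a, texture_heavy):.1f}  B={total(phone_b, texture_heavy):.1f}")  # B wins
```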
Here we pull up the rating results of Samsung's flagships from recent years. Every generation clearly makes progress, but a closer look reveals many strange things (marked in red): the S6 edge+, Note5, and S6 edge share exactly the same camera hardware, so why do their focus scores differ noticeably? The first two are nearly identical down to the software, so why do their video stabilization and texture-detail scores differ? And why is the focus score of the S5, which has phase-detection autofocus, lower than that of the S4?
Comparing across brands and models: why do modern models like the G4, 1020, and Z5 trail far behind older models in the artifacts score? Why are the focus and flash scores of Sony's Z5 so high?
The Nexus 6 and Droid Turbo 2 are same-generation Motorola devices: the former uses an optically stabilized Sony IMX214, the latter a 21-megapixel IMX220, and the Moto X Style uses the Sony IMX230 sensor; the BlackBerry Priv is rumored to use a cut-down IMX230, and the Nexus 6P uses the 12.3-megapixel Sony IMX377.
Score comparison for devices at the top of the photo-rating and popularity lists
There are many doubtful points among these results, and the ones listed here are only a selection.
In addition, the sample photos released with some tests have broken many DxOMark believers' hearts: the G4's lens in the figure above clearly had not been wiped clean, causing visible flare.
Even setting these doubts aside, the ratings mean little to ordinary users, because the rating system is not organized by scenario: night-scene and low-light performance, the real bottleneck of phone photography, cannot be read directly from the score.
Meanwhile, resolving power (corresponding to the texture and detail scores), which phone enthusiasts value most, seems to carry little weight and little precision: how can the S6 edge, with its inferior sharpness, tie with the Note5? How does the G4, the current daylight-resolution champion, merely match machines like the Note5? And the top score in this item goes to the Nexus 6P.
What's the problem?
The key to objective evaluation is to eliminate human and subjective factors as far as possible. DxOMark's professionalism in camera and lens testing is beyond doubt: it shows in the strictly controlled test environment, in the methods that convert samples directly into data, and in the resulting near-elimination of human subjectivity. In the mobile phone field, however, that professionalism becomes questionable.
An excerpt of formulas from DxOMark's camera testing
The core problem is that DxOMark's entire solution is built on processing RAW files, yet many phones cannot output RAW photos at all, so the camera-style "samples into data" method does not apply. Every phone sample is the joint product of the lens, CMOS sensor, ISP, and algorithms; on top of that, HDR, night-scene noise reduction, and other algorithms perform multi-frame synthesis on the "original frame".
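Here is a toy example of why multi-frame synthesis breaks the RAW-based approach: once a burst of frames is merged (simple averaging below, standing in for real HDR or night-mode pipelines), the measured noise no longer reflects any single sensor readout, so the output cannot be cleanly attributed to the sensor alone.

```python
import numpy as np

# Toy multi-frame noise reduction: average N noisy captures of the
# same scene. The synthesized result corresponds to no single RAW
# frame, which is the attribution problem described above.
rng = np.random.default_rng(42)
scene = np.full((128, 128), 100.0)                # "true" scene luminance

frames = [scene + rng.normal(0, 10, scene.shape)  # each capture adds noise
          for _ in range(8)]
merged = np.mean(frames, axis=0)                  # simple burst average

print(f"single-frame noise: {frames[0].std():.2f}")
print(f"merged noise:       {merged.std():.2f}")  # ~1/sqrt(8) of the above
```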
In photography, a specific numerical score is a sensitive minefield: too many factors go into a photo. Sample photos and EXIF data are objective, but these so-called objective scores are the most subjective of all. DxOMark has not published its standards and reference documents (or perhaps this author's English was not good enough to find them?), nor the original samples, so the results cannot be verified. In short, DxOMark's phone camera test remains a "black box" that the outside world can neither reproduce nor inspect, and that by itself compromises its credibility.
Of course, even setting these problems aside, DxOMark is still more professional than most mainstream media in China.
Domestic evaluation institutions have their own share of problems
Well-known peers have also stepped into the "scoring system" pit. Their tests touch almost every item readers care about, but which items matter more? What exactly are the weights? These remain open questions, and a weighting everyone agrees on is unlikely to exist, because everyone wants different things from a photo: one person craves pixel-peeping resolving power, while another only needs pleasing color and contrast, with resolution merely good enough for sharing with friends.
Moreover, the weighting problem breeds plenty of fallacies, such as the scoring system in the Xiaomi 3 versus MX3 war of words, which was ridiculed as shown above.
Recently another peer (above) accidentally stepped into the same hole: the Note5, which everyone agrees takes excellent photos, lost to the 6s Plus and even to Sony. Not even Sony fans would believe that.
Some review outlets go further and shoot their comparisons without even unifying the light, scene, shooting position, or focus point; worse, some do not bother to wipe the lens clean.
Blind tests and mass voting also have obvious limitations
Comparison of total scores in blind tests
Vote rankings for individual items in blind tests
Since professional institutions can have problems, would blind testing be more credible? Unfortunately, not really. Blind-test voting is conducted through web pages, but many voters lack the patience or the bandwidth to click through to the full-size images, and some blind tests do not even provide the originals.
As a result, blind tests can only compare conspicuous attributes such as metering, contrast, and color saturation. The resolving power and noise control that enthusiasts value cannot be judged, and neither can white balance (no standard reference shot is provided, so the true white balance cannot be confirmed).
The final conclusion is metaphysical, simple, hasty, and almost a truism: view photo tests and comparisons dialectically, because they hide many traps that even the testers themselves never notice. Short of directly comparing original photos taken in a well-controlled shooting environment, no comparison is absolutely convincing. As for judging a phone's photography by a single score or ranking: treat it as entertainment, and everyone is happy.
Some images in this article are from the internet.