Just about all benchmark applications can be manipulated in one way or another. We have a long history of both hardware manufacturers and software developers finding ways to "fake" scores to make their devices appear faster than they actually are. Smartbench 2011 is certainly not bullet-proof either. Since the initial release, I have fixed numerous bugs that caused the application to report misleading scores.
Recently, a blogger from Android Community managed to break Smartbench 2011 and produced scores that are simply not realistic. So far, Android Community and Engadget have published articles pointing out this weakness in Smartbench 2011. (See the articles from Android Community and Engadget)
So what's going on here? Can any of the scores reported by Smartbench 2011 be trusted?
What causes it?
Smartbench uses a rather simple method to determine the duration of each test: it starts a timer at the beginning of the test and stops it at the end.
With the current code, when one of the test suites fails, the app itself does not crash, but the test in question ends abruptly and the run proceeds to the next one. This, of course, produces faster test runs and therefore inflated scores.
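The failure mode above can be sketched in a few lines of Java. This is an illustrative reconstruction of the timing pattern described, not Smartbench's actual code; all names here are hypothetical. The key point is that a swallowed exception stops the workload early while the timer still runs to completion, so the measured duration collapses toward zero:

```java
// Hypothetical sketch of the start/stop timing pattern described above.
public class TimingSketch {

    // Runs a workload and returns elapsed milliseconds, even if the
    // workload aborts partway through.
    static long timeRun(Runnable workload) {
        long start = System.nanoTime();
        try {
            workload.run();
        } catch (RuntimeException e) {
            // Failure is swallowed: the test "ends" early, but the
            // timer still stops here, yielding a very short duration.
        }
        long end = System.nanoTime();
        return (end - start) / 1_000_000;
    }

    static void busyWork(int iterations) {
        double acc = 0;
        for (int i = 0; i < iterations; i++) acc += Math.sqrt(i);
        // Reference the accumulator so the loop is not optimized away.
        if (acc < 0) System.out.println(acc);
    }

    public static void main(String[] args) {
        long full = timeRun(() -> busyWork(50_000_000));
        long aborted = timeRun(() -> { throw new RuntimeException("suite failed"); });
        System.out.println("full run:    " + full + " ms");
        System.out.println("aborted run: " + aborted + " ms"); // near zero -> inflated score
    }
}
```

Because a shorter duration translates into a higher benchmark score, the aborted run looks "faster" than any real device could be.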
What are we going to do about it?
A new version of Smartbench 2011 (v1.2.1) is being released to address this bug.
Here are the additional checks I have implemented in this version:
- If any of the test suites fails, the entire test run is considered invalid. The result will be neither displayed nor submitted.
- If any of the test suites produces scores that are unrealistically high, those results are ignored as well. We have plenty of sample scores now, so most of the invalid scores can be filtered out safely.
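The two rules above can be sketched as a simple validation pass over the suite results. This is a minimal illustration under assumed names (the `SuiteResult` shape and the score cap are inventions for this sketch, not Smartbench's real data model):

```java
import java.util.List;

// Illustrative sketch of the v1.2.1 validation rules described above.
public class ResultValidator {

    // Hypothetical per-suite plausibility cap; the real filter would be
    // derived from the large pool of collected sample scores.
    static final double MAX_PLAUSIBLE_SCORE = 10_000.0;

    static class SuiteResult {
        final boolean completed;
        final double score;
        SuiteResult(boolean completed, double score) {
            this.completed = completed;
            this.score = score;
        }
    }

    // A run is valid only if every suite completed and no score is
    // unrealistically high; invalid runs are neither shown nor submitted.
    static boolean isRunValid(List<SuiteResult> results) {
        for (SuiteResult r : results) {
            if (!r.completed) return false;                  // rule 1: a failed suite voids the run
            if (r.score > MAX_PLAUSIBLE_SCORE) return false; // rule 2: implausible scores are rejected
        }
        return true;
    }
}
```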
Are the current results reported by Smartbench still valid?
Several users on the XDA forums, along with blog commenters, have suggested that the results stored on the Smartbench DB server may be invalid because of this bug. Is there any truth to this?
We already had a server-side script that checks for invalid results (i.e. scores that are extremely high) and excludes them from the charts. This is why you don't see these extremely high scores either within Smartbench 2011 or on smartphonebenchmarks.com today. The invalid results are kept in the DB for debugging purposes only.
Also, we collect millions of test result submissions. One of the reasons for doing this is to average as many results as possible for a given configuration. With more results, the average values become more accurate.
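As a rough illustration of the server-side idea, here is a sketch that discards extreme outliers before averaging the submissions for one device configuration. The median-based cutoff is my own assumption for the example; the post does not describe the actual filtering rule:

```java
import java.util.Arrays;

// Hypothetical outlier-filter-then-average pass, as a sketch only.
public class ScoreAggregator {

    // Keeps scores within 'factor' times the median, then averages them.
    static double filteredAverage(double[] scores, double factor) {
        double[] sorted = scores.clone();
        Arrays.sort(sorted);
        double median = sorted[sorted.length / 2];
        double sum = 0;
        int kept = 0;
        for (double s : scores) {
            if (s <= median * factor) { // discard implausibly high submissions
                sum += s;
                kept++;
            }
        }
        return kept == 0 ? 0 : sum / kept;
    }
}
```

With many submissions per configuration, a handful of broken runs barely moves the filtered average, which is why the charted numbers stay trustworthy.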
Smartbench 2011 is written by a human (that's me!). Like any other software that exists today, Smartbench contains bugs, and I expect users to discover them, especially as the user base grows. These reported bugs will be fixed!
I believe it is important to listen to users and implement fixes when issues are called out. Users tend to come up with the best bug reports, suggestions and advice. It is no different for Smartbench 2011.
Please do try the new version (v1.2.1) and let me know if you can still manage to produce unrealistic scores.
June 22, 2011
It appears that simms from AndroidCommunity/XDA can still reproduce this issue in some cases. We are apparently dealing with two separate issues rather than one. I will soon issue another release that will hopefully fix the second issue.