BHQC: What Are Benchmarks, and Do They Matter?
The Brutally Honest Question Corner is a continuing series on disputed or hot-button topics in the mobile industry.
In reviews of new mobile phones and tablets, we often see the term “benchmark.” Benchmarking is used throughout the computing world, but in mobile technology, it usually takes the form of an app (native or web-based) that tests your device’s performance. These tests produce one or more numeric scores which rate the device’s ability to perform certain tasks. The principal advantage of benchmarking is that it takes subjective impressions out of the equation; the tests are standardized so, in theory, all devices are evaluated on a level playing field.
The downside is a symptom of the competitive environment of mobile computing and the almost religious zeal of platform fans: too often, benchmark scores are used as weapons in the “my device is better” wars. The scores, when taken in a vacuum, can easily be used to extol the virtues of one platform while condemning another. Such scores also pop up pretty often in the posts of those with a need to justify their recent technology purchase.
Look how awesome MY device is! Oh wait.
I want to talk about the mythical “average person” for a second. Benchmarking is crucially important to software designers and developers; it helps them evaluate their products under different conditions and on different platforms and devices. But to a normal consumer, how relevant and reliable are these metrics when considering a purchase?
First, let’s introduce the ever-present grain of salt. While it’s wonderful to have the option of running a standardized test, that terminology suggests that the results should be reproducible; I should be able to run the same test twice in quick succession on the same device, and achieve the same score. Sadly, that’s not the case:
These tests were taken about ten minutes apart.
Granted, those are minor variations, but they’re not insignificant; they illustrate an underlying inconsistency in a testing method that’s supposed to be standardized. Lest you think I own a faulty device, the folks at Engadget were also able to demonstrate fluctuations in their test results on the same unit, with back-to-back benchmark sessions. These tests aren’t infallible.
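If you want to put a number on that run-to-run wobble, here is a minimal sketch in Python. It assumes you have jotted down the scores from a few back-to-back runs yourself; the function names and the sample scores are made up for illustration and do not come from any benchmark app. The idea is simply to compare the gap between two devices against how much each device varies on its own.

```python
import statistics


def summarize_runs(scores):
    """Return the mean, sample standard deviation, and relative spread (%) of repeated runs."""
    if len(scores) < 2:
        raise ValueError("need at least two runs to estimate variation")
    mean = statistics.mean(scores)
    stdev = statistics.stdev(scores)
    return {"mean": mean, "stdev": stdev, "spread_pct": 100 * stdev / mean}


def gap_exceeds_noise(scores_a, scores_b, k=2.0):
    """Crude rule of thumb: is the gap between the two means larger than
    k times the bigger of the two standard deviations?
    The threshold k=2.0 is an arbitrary assumption, not a formal statistical test."""
    a = summarize_runs(scores_a)
    b = summarize_runs(scores_b)
    noise = max(a["stdev"], b["stdev"])
    return abs(a["mean"] - b["mean"]) > k * noise


# Hypothetical scores, invented purely for illustration
device_a = [2450, 2510, 2480]
device_b = [2500, 2470, 2530]

print(summarize_runs(device_a))
print(gap_exceeds_noise(device_a, device_b))
```

With numbers like these, the difference between the two averages is smaller than the variation within each device, so declaring a "winner" from a single run would be reading tea leaves.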
Second, it’s possible for OEMs to “game the system” by calibrating their hardware/software to excel at one popular metric, while sacrificing performance in other areas. Depending on who you ask, this practice is either common or almost universal. What sometimes results is a device that shines on paper, while real-world use is laggy, sluggish, or unstable.
Okay, so we have an evaluation system that’s vulnerable to gaming, which produces somewhat inconsistent results. That’s not enough of a condemnation to call benchmarking irrelevant, by any means; it’s not. As with many things technological, the trouble and confusion are caused by the users.
This poor chump can’t get a break.
There’s a war on in the world of mobile users, between those who value specs and those who look only at an ethereal concept called “the user experience.” To make a crude and extreme generalization, the former group respects things like numbers and quantifiable figures, throwing around terms like “processor cycles” and “Linpack” and “quad-core,” while the latter group says condescending things like “I don’t care about chips, spec-head; I want my device to just work.”
Can you tell I don’t write dialogue?
Anyway, hey: ever see the movie Crimson Tide? It’s awesome; you should. At the end (SPOILER ALERT), Jason Robards is giving Gene Hackman and Denzel Washington a good old U.S. Navy tongue-lashing, in which he says, “Now you may have been proven right, Mister, but insofar as the letter of the law is concerned, you were both right. And you were also both wrong. This is the dilemma that will occupy this panel long after you leave this room.”
“A good old-fashioned clipboard is the only ‘tablet’ you’ll ever need, Commander.”
That’s the deal with the spec-vs-experience people; each camp’s argument has merit, but standing alone, neither one is defensible. Top-shelf specifications and raw performance are awesome, but they mean absolutely nothing if the software hasn’t been optimized and tuned for a great user experience. Consider the HP TouchPad running webOS; the tablet featured very capable hardware, but performance on the release version of the software was atrocious. By contrast, look at the first generation of WP7 devices; these aged, single-core phones with pedestrian-to-dull specs are still running their OSes more smoothly than a lot of more powerful Android devices.
On the flip side, the people who take the opposite extreme of “specs don’t matter” are also wrong. It’s illogical to suggest that great performance doesn’t take great components, at least to an extent. Devices that feature what some call a “great user experience” (high responsiveness, minimal lag, solid reliability) frequently have the specs to back it up. It’s just that those specs are downplayed in importance because, frankly, they don’t matter very much if the phone or tablet you bought is doing what you want it to do, and doing it well.
Really, that’s what it all comes down to: buying the right device for your needs. If that means you need a certain level of performance because you’re running time-sensitive or resource-intensive applications, then benchmarks are absolutely going to be a crucial part of your review process, but not the only part. If you’re shopping mainly for a device that’s going to give you great overall responsiveness and reliability, benchmarks aren’t going to matter as much as hands-on user reviews, but you should still take them into account.
So, is benchmarking important? Yes. Is it the end-all be-all of device comparisons? Not even close. It’s one part of a larger whole. The world isn’t black-and-white, as so many platform champions and commentators would have you believe. When evaluating new devices, gather as much information as possible through whatever means available, including benchmarks, spec comparisons, and user reviews. Learn all that is learnable before pulling the trigger. It’s all about the gray area, friends.
All that said, go download Quadrant and watch your device play some FPS games by itself. It’s equal parts amazing and relaxing. And just for fun, you know you want to see how you stack up. Admit it.
Gah, I was wrong! It’s creepy! Turn it off! Turn it off!!
The Engadget test mentioned above can be found here.