Moving to Real World Benchmarks in SSD Reviews

Many of our readers embrace our "real world" approach to hardware reviews. We have not published an SSD review in almost two years while we have been looking to revamp our SSD evaluation program. Today we wanted to give you some insight into how we learned to stop worrying and love the real-world SSD benchmark.

Chasing the Benchmark Dragon

Love makes us do things that are sometimes a little crazy. Various forms of love have driven me to indulge such madness as leaving a lucrative corporate job to join a startup, boarding a transcontinental flight booked only a day earlier with a cheap engagement ring in my pocket, or buying a convertible after driving past the dealership and seeing it gleaming in the sun on a gorgeous spring day. Recently, my love of technology led me to spend 36 consecutive hours testing SSDs.

At the conclusion of those tests, love of doing things the right way led me to throw out the playbook and decide to start over.

As [H]ard|OCP is rebooting SSD coverage after being dormant for a couple of years, and this is my first attempt at rigorously benchmarking drives outside of my own narrowly defined needs, we’re working from a blank slate. As luck would have it, my first big review was slated to be the Intel SSD 750, the first consumer SSD with NVMe, and therefore one of the biggest pieces of consumer storage news in a long while. I didn’t say it was good luck, however, as NVMe represents a very new and different way of doing things from what’s come before, and I had to test a number of competing drives to provide a meaningful basis for comparison. All of this together forced me to reevaluate some assumptions about measuring SSD performance.

Intel's new SSD 750 in PCIe add-in card format

Perhaps the biggest revelation from my testing marathon was the extreme degree to which the rated IOPS and throughput numbers from different SSD manufacturers are incomparable. With a literal stack of high-end SSDs competing in the same market niche on the bench, I could see how the parameters needed to get optimal results out of one drive were completely different from those for another. And that's before addressing the question of whether these figures have any practical relevance. Put mildly, trying to get the peak/rated performance out of each of the drives with a consistent set of credible tests proved challenging. And then it clicked: not only should I not change my testing to meet someone else's marketing objectives, but even if those synthetic values were easily attainable, they wouldn't really matter to me.
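
To make that concrete, here is a rough sketch, not our actual test methodology, of why a single "IOPS" figure means nothing without the parameters behind it: the same drive will report wildly different numbers depending on the block size and how many requests are in flight. The file name and parameter values below are hypothetical placeholders, and plain buffered reads through the OS page cache make this a toy illustration rather than a valid benchmark (real tools such as fio or Iometer handle direct I/O, alignment, and true queue depths).

```python
# Crude illustration: the "IOPS" you measure depends entirely on the knobs you pick.
# Buffered reads only; treat the output as relative, not as a real benchmark result.
import os
import random
import time
from concurrent.futures import ThreadPoolExecutor

TEST_FILE = "testfile.bin"   # hypothetical: a large pre-created file on the drive under test
FILE_SIZE = os.path.getsize(TEST_FILE)

def random_reads(block_size, num_ops):
    """Issue num_ops random reads of block_size bytes against the test file."""
    fd = os.open(TEST_FILE, os.O_RDONLY)
    try:
        for _ in range(num_ops):
            offset = random.randrange(0, FILE_SIZE - block_size)
            os.pread(fd, block_size, offset)
    finally:
        os.close(fd)

def measure_iops(block_size, workers, ops_per_worker=2000):
    """Run the reads across a number of workers and return operations per second."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(workers):
            pool.submit(random_reads, block_size, ops_per_worker)
    elapsed = time.perf_counter() - start
    return (workers * ops_per_worker) / elapsed

# The same drive, four different "results" -- none of them wrong, none of them comparable.
for bs, qd in [(4096, 1), (4096, 32), (131072, 1), (131072, 32)]:
    print(f"{bs // 1024}KB blocks, {qd} workers: {measure_iops(bs, qd):,.0f} IOPS")
```

Pick the combination that flatters your controller and you have a spec-sheet number; pick a different one and you have a different spec-sheet number. That is essentially what the marketing figures from competing vendors amount to.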

Philosophically, benchmarking for benchmarking's sake solves little. Faster drives for faster drives' sake solve equally little. We must look at everything through the emotionless lens of meeting user objectives, which means focusing on real-world outcomes rather than synthetic figures that lack context and relevance. The questions that matter are not centered on IOPS or MB/s. They're centered on value proposition, productivity, market impact, and a better user experience.

The Real World vs. the Test Lab

Photographer and writer Ken Rockwell famously coined the term "measurbator" to describe photographers who are far more interested in technical bragging rights than in the quality of the work they produce. When we talk about SSDs, we’re approaching the point where most comparisons are measurbation, and useful comparisons become difficult. There’s divergence in synthetic measures like IOPS or transfer rate, but will choosing Brand A over Brand B actually help you get more done? Reading our own forum is fascinating: so many threads start with a usage-oriented question, "I want my games to load faster, how much overkill should I apply?" and end with a generic but appropriate response, "Pretty much any current SSD will do what you need."

The fact is that for the vast majority of consumers, pretty much any current SSD will do what they need. A few edge cases provide exceptions: individuals running what are actually server workloads, people editing massive 4K video files, or applications that just aren’t written well. But even the cheaper SSDs you’ll find in 2015 will be good enough for most people, and an immense improvement over a mechanical hard disk.

I’m all for buying what’s shiny and new, because I like shiny and new things. However, I’m reminded of a recent Facebook post by a very bright friend who’s an expert in machine learning: "Some days you gotta ask yourself: are we here to play with computers, or to get stuff done?"

Manufacturers are happy to plaster IOPS or sequential throughput numbers on marketing materials for SSDs. This isn’t like comparing cars based on rated MPG: you know you’re not likely to hit the advertised figure, but at least every car on the market is rated using a standardized test cycle. With SSDs, there are wildly divergent methodologies behind the advertised performance figures, and they are typically not rooted in real-world use cases.

Synthetic benchmarks done by third parties are of similarly questionable utility. We need to understand what’s happening at the application level to say anything educated about whether a drive is a good fit for its intended users. But before that, we should understand what we're even attempting to quantify.
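
As a sketch of what "application level" means in practice, the snippet below times a task a user would actually notice rather than reporting a synthetic rate. The directory path and the stand-in "level load" workload are hypothetical, and there is no cache control here, so this is a framing example rather than a finished methodology; the point is simply that the number the user cares about is elapsed seconds, not peak IOPS.

```python
# Minimal sketch of application-level thinking: time the task, not the transfer rate.
import os
import time

ASSET_DIR = "/games/example/levels/act1"   # hypothetical asset directory on the drive under test

def load_level(asset_dir):
    """Read every file under the directory, as a crude proxy for a game level load."""
    total_bytes = 0
    for root, _dirs, files in os.walk(asset_dir):
        for name in files:
            with open(os.path.join(root, name), "rb") as f:
                total_bytes += len(f.read())
    return total_bytes

start = time.perf_counter()
loaded = load_level(ASSET_DIR)
elapsed = time.perf_counter() - start

# The figure that matters to the user is the seconds, not the MB/s.
print(f"Loaded {loaded / 2**20:.0f} MB in {elapsed:.2f} s")
```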