Today's Hard|Forum Post

Benchmarking Wrong

A HardOCP editorial that takes a look at the underlying issues surrounding 3DMark03 and the reasons you or any company should not take it too seriously.

Introduction

The synthetic benchmarking landscape has been in complete turmoil for the past few weeks. Three players in the video card industry have either confessed to or been accused of cheating in synthetic benchmarks. We are going to forgo the entire "Optimization Vs. Cheat" discussion to focus on what is truly the matter here. At the heart of this fiasco is FutureMark. They may be a small company of under 30 people, but they are heavy hitters when it comes to huge computer OEMs spending big money. FutureMark builds the 3DMark series of benchmarks, which started out as an enthusiast plaything and has morphed into what many consider the number one tool in the industry for evaluating 3D graphics hardware. One of FutureMark's main sources of income is charging companies for membership in their BETA Program. Some companies pay hundreds of thousands of dollars to be members of this "program". In exchange for that cash, they get to have input on the benchmarks that FutureMark develops.

Dell Steps In

Dell is one of the heavy hitters that uses FutureMark's tools to make hardware evaluations, and they are also a "BETA Member". Here is what Dell had to say recently about the latest benchmark from FutureMark, 3DMark03, as quoted on the FutureMark site.

Dell uses many tools to evaluate system and graphics subsystem performance. We believe 3DMark03 is a solid synthetic graphics benchmark that covers a wide range of usage models and complements application-specific testing. Synthetic benchmarks like 3DMark03 help to differentiate graphics subsystem performance characteristics of both high end and lower end cards by utilizing sets of tests with varying degrees of graphics complexity. Entry level cards will be able to run at least one simple test to be used in comparisons for those interested in basic 3D functionality. Additionally, those interested in leading edge technology will be able to make graphics hardware comparisons with a range of tests using the new APIs, shaders, rendering techniques, etc. Dell believes 3DMark03 is a versatile tool that allows a fair comparison of today's wide range of 3D graphics solutions.

Now, while I do not wish to focus on Dell, their statement does illustrate the power FutureMark holds and the dollars that ride on its benchmark. Do you think Dell would even comment on this issue if huge sums of money were not at stake? Do you think that a company that has based millions of dollars' worth of purchases on FutureMark's tools is going to say, "You know what, the tool we use for evaluating 3D performance is terribly flawed"? No, of course they are not.

BETA Really Means "MONEY"

If you review the "Strategic BETA Members" list linked above, you will notice that two companies that are huge forces in the marketplace are not listed. Quite simply, the companies missing from the list have chosen not to pay hundreds of thousands of dollars to FutureMark. When one or more of the market leaders in the industry are no longer paying FutureMark to participate in the 3DMark BETA Program, there is simply no possible way you can look at the tool as an unbiased utility for evaluation. If you pay, you get to see the benchmark beforehand and give input on how you think it should work. If you don't pay FutureMark, you don't get to see the benchmark before it is released and you do not have the opportunity to direct how the testing works.

Our Thoughts

My thoughts on paid synthetic benchmarks are this: either everyone pays or no one plays. Of course, it can also be easily argued that a benchmark company should not accept one dime from the companies whose products it is evaluating. Ask yourself this. Are the companies that are paying FutureMark's BETA Program fees getting anything in return, or are they paying so they can have their logo on that page? I think logic dictates that unless you are getting something for the huge sums of money you are spending, you do not spend it.

If you think that all of this hoopla in the last couple of weeks is a technology discussion, you are very much incorrect. It is all about the almighty dollar, although it has been shrouded in the context of something technically meaningful. This is all about money and whose pocket it is going to be in. FutureMark's failed business models over the years have turned 3DMark03 into what it is today. In my opinion, 3DMark03 is a paid-for tool. 3DMark03 scoring based on their game demos is not meaningful for depicting real world gaming performance, since those specific tests do not represent real world game engines or technologies sold at retail. Anyone who asks you to rely on those scores either does not see the big picture or is trying to cover their own interests, and that includes Dell.

Even based on FutureMark's own statements, none of the scores previous to their latest 3DMark03 Build 330 are to be relied on.

Can 3DMark03 be used as a reliable benchmark for DirectX 9 generation graphics cards?

Yes, with the new 3DMark03 build 330, it can.

Logically, that statement would seem to suggest to me that all scores produced before build 330 are not reliable. We of course think that, with or without the latest build, 3DMark03 is not trustworthy. Do we wait for FutureMark to declare in writing that their next build is really the one that will give you reliable results? I would have to suggest that anyone using 3DMark03 scoring tests is irresponsible in one form or another.

Don't let the fanboy rhetoric cloud your thoughts and judgment about this issue. Stay focused on the big picture. We need better benchmarks based on the games and game engines that we will be using on our own desktops down the road. 3DMark03 should not be thought of as representing how your hardware is going to perform in the actual games we will be playing later this year. Here is a HardOCP editorial from February that outlines our thoughts and points out some of the other dangers associated with a synthetic benchmark becoming too powerful. Here also is a PDF that was prepared by HardOCP for game developers that explains our wants and needs in future benchmarks. Please feel free to share the document or use the information contained within as your own. The community working together is what is needed.

I ask that you approach this with an open mind, step back, and see that this ugly situation, which has been spread out over weeks, stretches far beyond brand loyalty. We need better tools from the game developer community. With those tools we can ensure that we all have a better gaming experience, and that is, after all, the focus of the matter. The old cliché, "There are lies, damn lies, and benchmarks," does not have to be true, but you can be assured that if we sit around and let the dollars of huge companies decide our benchmarking fate, it will come to pass.

I humbly ask that anyone evaluating computer hardware stop publishing the game demo scores taken from 3DMark03. The media is greatly responsible for giving FutureMark its power in the industry, and it is my thought that if we work collectively, we can shift that power to a more responsible set of real world utilities that I suggest will become a reality this year.

In a related FutureMark fiasco today, this press release was sent to us by Tero Sarkkinen of FutureMark.