Updating AnandTech’s 2013 Mobile Benchmark Suite (RFC)
by Jarred Walton on January 29, 2013 9:45 PM EST - Posted in
- Laptops
- Gaming
- Ultrabook
- Benchmarks
- Notebooks
If it seems like just last year that we updated our mobile benchmark suite, that’s because it was. We’re going to be keeping some elements of the testing, but with the release of Windows 8 we’re looking to adjust other areas. This is also a request for input (RFC = Request for Comments if you didn’t know) from our readers on benchmarks they would like us to run—or not run—specifically with regard to laptops and notebooks.
We used most of the following tests with the Acer S7 review, but we’re still early enough in the game that we can change things up if needed. We can’t promise we’ll use every requested benchmark, in part because there’s only so much time you can spend benchmarking before you’re basically generating similar data points with different applications, and in part because ease of benchmarking and repeatability are major factors. That said, if you have any specific recommendations or requests, we’ll definitely look at them.
General Performance Benchmarks
We’re going to be keeping most of the same general performance benchmarks as last year. PCMark 7, despite some question as to how useful the results really are, is at least a general performance suite that’s easy to run. (As a side note, SYSmark 2012 basically requires a fresh OS install to run properly, plus wiping and reinstalling the OS after running, which makes it prohibitively time consuming for laptop testing where every unit comes with varying degrees of customization to the OS that may or may not allow SYSmark to run.) We’re dropping PCMark Vantage this year, mostly because it’s redundant; if Futuremark comes out with a new version of PCMark, we’ll likely add that as well.
At least for the near term, we’re also including results for TouchXPRT from Principled Technologies; this is a “light” benchmark suite designed more for tablets than laptops (at least in our opinion), but it does provide a few other results separate from a monolithic suite like PCMark 7. We’ll also include results from WebXPRT for the time being, though again it seems more tablet-centric. We don’t really have any other good general performance suites, so we’ll return once again to the ubiquitous Cinebench 11.5 and x264 HD. We’re updating to x264 HD 5.x, however, which does change the encoding somewhat, and if a version of x264 comes out with updated encoding support (e.g. for CUDA, OpenCL, and/or Quick Sync) we’ll likely switch to that when appropriate. We’re still looking for a good OpenCL benchmark or two; WinZip sort of qualifies, but unfortunately we’ve found in testing that 7-Zip tends to beat it on file size, compression time, or both, depending on the settings and files we use.
On the graphics side of the equation, there doesn’t seem to be a need to benchmark every single laptop on our gaming suite—how many times do we need to see how an Ultrabook with the same CPU and iGPU runs (or doesn’t run) games?—so we’ll continue using 3DMark as a “rough estimate” of graphics performance. As with PCMark, we’re dropping the Vantage version, but we’ll continue to use 3DMark06 and 3DMark 11, and we’ll add the new version “when it’s done”. We’re considering the inclusion of another 3D benchmark, CatZilla (aka AllBenchmark 1.0 Beta19), at the “Cat” and “Tiger” settings, but we’d like to hear feedback on whether it makes sense or not.
Finally, we’ll continue to provide analysis of display quality, and this is something we really hope to see improve in 2013. Apple has thrown down the gauntlet with their pre-calibrated MacBook, iPhone, iPad, and iMac offerings; if anyone comes out with a laptop that charges Apple prices but can’t actually match Apple in areas like the display, touchpad, and overall quality, you can bet we’ll call them on the carpet. Either be better than Apple and charge the same, match Apple and charge less, or charge a lot less and don’t try to compete with Apple (though that last option is a dead-end race to the bottom, so let’s try to at least have a few laptops that eschew this path).
Battery Benchmarks
As detailed in the Acer S7 review, we’re now ramping up the “difficulty” of our battery life testing. The short story is that we feel anything less than our previous Internet surfing test is too light to truly represent how people use their laptops, so we’re making that our Light test. For the Medium test, we’ll be increasing the frequency of page loads on our Internet test (from every 60 seconds down to every 12 seconds) and adding in playback of MP3 files. The Heavy test is designed not as a “worst-case battery life” test but rather as a “reasonable but still strenuous” use case for battery power. It uses the same Internet test as the Medium test but adds looped playback of a 12Mbps 1080p H.264 video along with a constant FTP download from a local server running at ~8Mbps (FileZilla Server with two simultaneous downloads and a 500KBps cap per download, pulling a list of large movie files).
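For readers wondering what the Internet portion of these rundown tests boils down to, here’s a minimal sketch of the general idea: a browser reloading a set of locally hosted pages on a fixed schedule until the battery gives out. The Python/Selenium tooling, page names, and server address below are purely illustrative, not our actual harness.

```python
# Illustrative only: a bare-bones battery rundown loop in the spirit of the
# Medium test. The page URLs, the 12-second interval, and the use of Selenium
# are assumptions for this sketch, not AnandTech's production scripts.
import time
from selenium import webdriver

PAGES = [
    "http://testserver.local/news.html",     # hypothetical local test pages
    "http://testserver.local/forums.html",
    "http://testserver.local/reviews.html",
]
INTERVAL = 12.0  # seconds between page loads (60.0 for the Light test)

def run_rundown() -> None:
    """Cycle through the page list until the laptop shuts off."""
    driver = webdriver.Firefox()
    i = 0
    while True:
        start = time.monotonic()
        driver.get(PAGES[i % len(PAGES)])
        i += 1
        # Sleep out the remainder of the interval so loads stay on schedule.
        time.sleep(max(0.0, INTERVAL - (time.monotonic() - start)))

if __name__ == "__main__":
    run_rundown()
```

The MP3 playback (and, for the Heavy test, the looped H.264 video and throttled FTP downloads) runs alongside a loop like this rather than inside it.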
Other aspects of our battery testing also warrant clarification. For one, we continue to disable certain “advanced” features like Intel’s Display Power Saving Technology (which can adjust contrast, brightness, color depth, and other items in order to reduce power use). The idea seems nice, but it basically sacrifices image quality for battery life, and since other graphics solutions are not using these “tricks”, we’re leaving it disabled. We also disable refresh rate switching, for similar reasons—testing 40Hz on some laptops and 60Hz on others isn’t really apples-to-apples. Finally, we’re also moving from 100 nits to 200 nits of brightness for all the battery life testing, and the WiFi and audio will remain active (volume at 30% with headphones connected).
Gaming Benchmarks
In truth, this is the one area where there is the most room for debate. Keep in mind that when testing notebooks, we’re not solely focused on GPU performance most of the time (even with gaming notebooks); the gaming tests are only a subset of all the benchmarks we run. We’ll try to overlap with our desktop GPU testing where possible, but we’ll continue to use 1366x768 ~Medium as our Value setting, 1600x900 ~High as our Mainstream setting, and 1920x1080 ~Max as our Enthusiast setting. Beyond the settings, however, is the question of which games to include.
Ideally, we’d like to have popular games that also tend to be strenuous on the graphics (and possibly the CPU as well). A game or benchmark that is extremely demanding of your graphics hardware but that few people actually play isn’t relevant, and likewise a game that’s extremely popular but that doesn’t require much from your hardware (e.g. Minecraft) is only useful for testing low-end GPUs. We would also like to include representatives of all the major genres—first person shooter/action, role-playing, strategy, and simulation—with the end goal of having ten or fewer titles (and for laptops eight seems like a good number). Ease of benchmarking is also a factor; we can run FRAPS on any game, but a game with a built-in benchmark is both easier to test and produces more reliable/repeatable results. Frankly, at this point we don’t have all that many titles that we’re really set on including, but here’s the short list.
Elder Scrolls: Skyrim: We’ve been using this title since it came out, and while it may not be the most demanding game out there, it is popular and it’s also more demanding (and scalable) than most other RPGs that come to mind. For example, Mass Effect 3 generally has lower-quality (and DX9-only) graphics and doesn’t require as much from your hardware, and The Witcher 2 has three settings: High, Very High, and Extreme (not really, but it doesn’t scale well to lower-performance hardware). Skyrim tends to hit both the CPU and GPU quite hard, and even with the high-resolution texture pack it can still end up CPU limited on some mobile chips. Regardless of our concerns, however, we can’t think of a good RPG replacement, so our intention is to keep Skyrim for another year.
Far Cry 3: This is an AMD-promoted title, which basically means they committed some resources to helping with the game’s development and/or advertising. In theory, that means it should run better on AMD hardware, but as we’ve seen in the past that’s not always the case. This is a first-person shooter that has received good reviews, and it’s the latest entry in a popular franchise with a reputation for punishing GPUs, making it a good choice. It doesn’t have a built-in benchmark, so we’ll use FRAPS on this one.
Sleeping Dogs: This is another AMD-promoted title. This is a sandbox shooter/action game with a built-in benchmark, making it a good choice. Yes, right now that's two for AMD and none for NVIDIA, but that will likely change with the final list.
Sadly, that’s all we’re willing to commit to at this point, as all of the other games under consideration have concerns. MMORPGs tend to be a bit too variable, depending on server load and other aspects, so we’re leaving out games like Guild Wars 2, Rift, etc. For simulation/racing games, DiRT Showdown feels like a step back from DiRT 3 and even DiRT 2; the graphics are more demanding, yes, but the game just isn’t that fun (IMO and according to most reviews). That means we’re still in search of a good racing game; Need for Speed: Most Wanted is a possibility, but we’re open to other suggestions.
Other titles we’re considering but not committed to include Assassin’s Creed III, Hitman: Absolution, and DmC: Devil May Cry; if you have any strong feelings for or against the use of those titles, let us know. Crysis 3 will hopefully make the grade this time, as long as there’s no funny business at launch or with the updates (e.g. no DX11 initially, and then when it was added the tessellation was so extreme that it heavily favored NVIDIA hardware, even though much of the tessellation was being done on flat surfaces). Finally, we’re also looking for a viable strategy game; Civ5 and Total War: Shogun 2 could make a return, or there are games like Orcs Must Die! 2 and XCOM: Enemy Unknown, but we’re not sure if either meets the “popular and strenuous” criteria, so we may just hold off until StarCraft II: Heart of the Swarm comes out (and since that game’s on “Blizzard time”, it could be 2014 before it’s done, though tentatively it’s looking like March; hopefully it will be able to use more than 1.5 CPU cores this time).
Closing Remarks
As stated at the beginning, this is a request for comments and input as much as a list of our plans for the coming year. If you have any strong feelings one way or the other on these benchmarks, now is the time to be heard. We’d love to be able to accommodate every request, but obviously there are time constraints that must be met, so tests that are widely used and relevant are going to be more important than esoteric tests that only a few select people use. We also have multiple laptop reviewers (Dustin, Jarred, and occasionally Vivek and Anand), so the easier it is to come up with a repeatable benchmark scenario the better. Remember: these tests are for laptops and notebooks, so while it would be nice to do something like a compilation benchmark, those can often take many hours just to get the right files installed on a system, which is why we’ve shied away from such tests so far. But if you can convince us of the utility of a benchmark, we’ll be happy to give it a shot.
47 Comments
Dman23 - Tuesday, January 29, 2013 - link
You should also be using GLBenchmark when testing GPU performance on mobile products! The Kishonti guys do truly create great cross-platform analysis tools for Windows, Mac, and Android products. Please include these.
Dman23 - Tuesday, January 29, 2013 - link
Also, I should add that you should use their CLBenchmark Suite for comparing the computational performance of different platforms using OpenCL.
Dman23 - Tuesday, January 29, 2013 - link
Also, if you are trying to make your mobile benchmark suite platform-agnostic, you should seriously consider a game that is available on multiple platforms (i.e. Windows and Mac) for a more apples-to-apples comparison of gaming performance, such as The Witcher 2, Borderlands 2, or Batman: Arkham City.
JarredWalton - Wednesday, January 30, 2013 - link
We're not trying for platform-agnostic. Mostly, gaming tests on other platforms would only tell us how well optimized the game is for other platforms, which is more of an OS/gaming question than a laptop question.
Dman23 - Wednesday, January 30, 2013 - link
Why wouldn't you want to be platform-agnostic? What is the point of doing Windows-only gaming tests if the idea is to test all mobile platforms including Android and OS X??
Also, based upon your "gaming tests on other platforms only tell us how well optimized the game is", why have gaming tests at all?? If you truly believe that statement, then basically you're saying the gaming tests in general are rigged and they "only tell us how well optimized that game is" for that platform.
Come on now, just because hardcore gaming is predominantly on the Windows platform doesn't mean you should exclude other mobile platforms such as OS X or Android/Chrome. Stick to AnandTech's tradition of providing a very comprehensive, unbiased review process and either provide a more apples-to-apples comparison of gaming performance using cross-platform games or don't include it at all if you believe that gaming benchmarks are rigged to begin with.
JarredWalton - Wednesday, January 30, 2013 - link
Be reasonable. First, most of the games we test are only available on Windows. Of the games that we test that are available on OS X, past indications are that OS X performance is far worse than Windows performance on the same games. So, either OS X is poorly optimized for gaming or the games are poorly optimized for OS X. Until that changes, why continue to try and compare apples and oranges? I figure people already know the limitations of OS X in regards to gaming (and if OS X doesn't support games well, Linux is even lower down the hierarchy), and unless/until that changes there's no sense beating a dead horse and wasting the time of our reviewers.
This is not to say that we won't add GLBenchmark or CLBenchmark, but I'll leave gaming tests on OS X to Anand and Vivek when/if they run such tests.
Dman23 - Wednesday, January 30, 2013 - link
First of all, who says they don't support gaming on OS X? If that were the case, then they wouldn't even bother with the implementation of a whole online platform, Game Center, in the first place, let alone integrate it into the OS itself!! And like I said before, they have a whole bunch of high-end games like Borderlands 2 and The Witcher 2 that work great on Apple's current mobile platforms, not to mention a whole myriad of other good games that they sell on the App Store or you can purchase via Steam.
Look, I get it. You're probably a Windows user who thinks that in order to measure high-end gaming performance or be a high-end mobile gamer, you have to use Windows. This is completely untrue. That may have been true 10 years ago but not today. All these mobile products from Apple, Dell, HP, Lenovo, Acer, etc. use similar graphics chipsets and processors from Intel, AMD, and NVIDIA. This is why it is important to provide cross-platform gaming performance when reviewing each product, as a measure of what strengths and weaknesses each of them has.
If you're going to have a Gaming Performance section in your reviews for mobile products such as laptops/tablets, you should include games that are available on all mobile platforms! This includes Chrome, Mac OS X, and Linux. (Hell, even games that are only on Windows and Mac OS X would be a good first step.)
If you're not willing to implement that because you think it is a "waste of time" to compare ALL major mobile platforms, and instead just have a section SOLELY on Windows gaming, then you should (like I've said before) either think long and hard about having a section on gaming performance as a way to diagnose the strength of a mobile laptop/tablet, or update the title of this post to "AnandTech's Mobile WINDOWS Benchmark Suite", because you sure aren't trying to provide an unbiased benchmarking suite that compares and contrasts other platforms and products.
Btw, your argument that gaming performance is poor on Mac OS X, as a basis for not doing cross-platform gaming tests, is really short-sighted. The whole point of reviewing a mobile product and the platform that it's based on is to understand the strengths and weaknesses of that product/platform. And who knows, by exposing the weaknesses of said platform when it comes to gaming in either Chrome or Mac OS X, it might provide an incentive for the company to improve upon those weaknesses. Isn't that really the whole point of reviewing products at a site like AnandTech, so that down the line the company can improve upon the weaknesses that AnandTech has exposed??
I guess I'd like to end with this: by not providing these cross-platform comparisons, you are doing a disservice to the readers of AnandTech and, in my view, watering down what makes AnandTech great... in that it covers ALL major platforms/products and provides comprehensive benchmarking suites across all of them. (Much to the chagrin of fanboys who only want to hear about their "favorite" platform, while dismissing/degrading other platforms and products.)
Jarred, I hope you take these suggestions to heart, because I would hate to see AnandTech become more of a platform-centric site (i.e. HotHardware, etc.) instead of a platform-agnostic site, which is what makes this site so unique and great!
JarredWalton - Wednesday, January 30, 2013 - link
You're talking to the guy who does all of the Windows laptop benchmarking. The point is to benchmark games that are relevant and new, so if we only choose games that are available on all platforms we inherently limit ourselves. I've already ruled out Borderlands 2 because it's not particularly demanding on hardware, not because it's a game that's available on other platforms.
Ultimately, I'm going to create a list of games for our laptop reviews, and if the games are available on OS X and Anand wants to run them, great. I'm not going to ask our other editors (e.g. Dustin) to do extra work benchmarking games that aren't meaningful to most users just to make for cross-platform comparisons.
There are really only a couple non-Windows laptops to consider, all of them from Apple, so what you're talking about is having MacBook reviews run as many gaming tests as possible. That's fine and I hope they will do that when it comes time for another MacBook review.
Or to put it another way, I wouldn't expect Anand to not run Final Cut Pro or iMovie tests (or anything else he wants to test) just because those programs aren't available on Windows. The fact is, the vast majority of games are still Windows-only affairs. Steam is changing that (slowly), and we might even see Steambox push stuff onto Linux.
Right now, this is about testing the majority of laptops. Overlap with tablets will occur in some areas, and likewise with MacBooks, but it's impractical (i.e. a waste) to try to only test cross-platform offerings. I could have put "Windows" in the title, but I figured that should have been immediately apparent from the context of the article.
Notmyusualid - Saturday, February 2, 2013 - link
Agreed. We game on Windows. Please continue testing games mostly on Windows.
But I must say, I did enjoy the Anand article that compared Linux/Windows gaming performance.
But regardless, I don't need the headache of using a *Nix operating system any more than I'm required to. And I can just imagine how buggy gaming under Linux would be...oh, and with Crossfire too? I'd cut my wrists.
aryonoco - Tuesday, January 29, 2013 - link
Since most tablet and smartphone benchmarks are JS benchmarks, I think it would be good to include at least one JS benchmark here as well, to put things into perspective.
I nominate Mozilla's Kraken for this: it's complex enough to be meaningful on a laptop, but it should still be very easy to run and very repeatable.