Benchmarks IBM DB2 8.1.3: Intel versus AMD



The first question that most people will ask is, of course, how the best AMD Opteron compares to the newest Intel Xeon "Nocona" CPU. Below is a quick table to refresh your memory and to enable you to compare price/performance:

Intel Xeon CPUs | Core | L2 cache | L3 cache | x86-64 bit | In Test | Price
3.60 GHz w/ 1M cache 800 MHz FSB (90 nm) | Nocona = "Prescott server" | 1 MB | No | Yes | Yes | $851
3.40 GHz w/ 1M cache 800 MHz FSB (90 nm) | Nocona = "Prescott server" | 1 MB | No | Yes | No | $690
3.20D GHz w/ 1M cache 800 MHz FSB (90 nm) | Nocona = "Prescott server" | 1 MB | No | Yes | No | $455
3 GHz w/ 1M cache 800 MHz FSB (90 nm) | Nocona = "Prescott server" | 1 MB | No | Yes | No | $316
3.20C GHz w/ 2M cache 533 MHz FSB (0.13 µm) | Gallatin = "P4 EE Server" | 0.5 MB | 2 MB | No | Yes | $1,043
3.20 GHz w/ 1M cache 533 MHz FSB (0.13 µm) | Gallatin = "P4 EE Server" | 0.5 MB | 1 MB | No | No | $690
3.06A GHz w/ 1M cache 533 MHz FSB (0.13 µm) | Gallatin = "P4 EE Server" | 0.5 MB | 1 MB | No | Yes | $455
3.06 GHz w/ 512k cache 533 MHz FSB (0.13 µm) | Prestonia = "Northwood Server" | 0.5 MB | No | No | Yes | $316

AMD Opteron CPUs | Core | L2 cache | L3 cache | x86-64 bit | In Test | Price
Model 250 (2.4 GHz) | Sledgehammer | 1 MB | No | Yes | Yes | $851
Model 248 (2.2 GHz) | Sledgehammer | 1 MB | No | Yes | Yes | $690
Model 246 (2.0 GHz) | Sledgehammer | 1 MB | No | Yes | No | $455
Model 244 (1.8 GHz) | Sledgehammer | 1 MB | No | Yes | No | $316

We were also very curious about the Xeon Nocona, as it brings higher clock speeds, a bigger L2-cache, no L3-cache and a pipeline 11 stages longer than the previous Xeon "Prestonia" and Xeon "Gallatin", which maxed out at 3.2 GHz. The first two features should boost performance considerably, while the last two are disadvantages.

We should emphasize that, as we tested with SUSE SLES 8 (kernel 2.4.21), the Xeon Nocona was disadvantaged, since we could not test it in 64-bit mode. We assure you that we will update this report with results from a 2.6 kernel. For now, we decided to give you a full report on SLES 8 and kernel 2.4. (All numbers are expressed in queries per second.)

Concurrency | Xeon 3.6 GHz | Dual Xeon 3.2 L3 (2MB) | Dual Xeon 3.2 | Dual Xeon 3.06 L3 (1MB) | Dual Xeon 3.06 | Opteron 250 DDR400 32 bit | Dual Opteron 250 DDR400 64 bit | Dual Opteron 248 DDR400 64 bit
1 | 55 | 46 | 44 | 43 | 42 | 57 | 61 | 57
2 | 87 | 74 | 61 | 72 | 61 | 105 | 118 | 107
5 | 128 | 104 | 100 | 98 | 98 | 123 | 137 | 129
10 | 136 | 112 | 107 | 105 | 102 | 129 | 145 | 132
20 | 136 | 113 | 106 | 106 | 104 | 131 | 147 | 132
35 | 138 | 113 | 106 | 104 | 99 | 133 | 150 | 129
50 | 138 | 110 | 106 | 102 | 100 | 130 | 145 | 128

None of the tests at a concurrency below 5 is reliable enough to support a firm conclusion, especially for the Xeon. The margin of error is somewhat higher there, but that is not all.

As the Dual Xeon with Hyperthreading exposes 4 logical CPUs, it is possible that, at a concurrency of 2, only one physical CPU is doing all the work. Looking at the numbers and at the Linux tool top, we feel pretty sure that this is exactly what happens most of the time. Compare row "5" with row "2", and row "2" with row "1", to see what we mean. Note that the results of rows 10 to 50 do not vary a lot, so we base our conclusions on these numbers. In the table below, you can see an overview of how the different CPUs compare in percentages.
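The row-to-row comparison suggested above can be sketched in a few lines of Python. This is only an illustration: the numbers are copied from the results table, while the names `QPS` and `speedup` are ours and not part of the test setup.

```python
# Queries per second copied from the results table (concurrency 1, 2 and 5).
# Only two columns are shown; the dictionary layout is an illustrative choice.
QPS = {
    "Xeon 3.6 GHz": {1: 55, 2: 87, 5: 128},
    "Dual Opteron 250 DDR400 64 bit": {1: 61, 2: 118, 5: 137},
}


def speedup(results, low, high):
    """Throughput at concurrency `high` relative to concurrency `low`."""
    return results[high] / results[low]


for name, results in QPS.items():
    one_to_two = speedup(results, 1, 2)
    two_to_five = speedup(results, 2, 5)
    print(f"{name}: 1->2 x{one_to_two:.2f}, 2->5 x{two_to_five:.2f}")
```

If both physical CPUs were busy at a concurrency of 2, the step from 1 to 2 should come close to doubling throughput, with little left to gain afterwards. A system that gains modestly from 1 to 2 and still gains a lot from 2 to 5 fits the picture of two threads sharing one physical CPU.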

Concurrency | 3.6 vs 3.2 | 2 MB L3-cache vs none | 1 MB L3-cache vs none | Xeon 3.2 vs 3.06 | Xeon 3.2 vs 3.06 (both with L3) | Xeon 3.6 vs Opteron 250 | Opteron 64 bit vs 32 bit
1 | 20% | 3% | 1% | 7% | 7% | -4% | 6%
2 | 17% | 22% | 18% | 3% | 3% | -17% | 12%
5 | 24% | 4% | 1% | 5% | 5% | 5% | 12%
10 | 21% | 5% | 3% | 6% | 6% | 6% | 13%
20 | 21% | 6% | 2% | 6% | 6% | 3% | 12%
35 | 22% | 7% | 5% | 8% | 8% | 3% | 12%
50 | 26% | 4% | 2% | 8% | 8% | 7% | 12%
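For readers who want to redo the arithmetic, here is a minimal Python sketch of how such percentages can be derived from the queries-per-second table. It uses the concurrency 20 row only; the dictionary keys and the helper `percent_gain` are our own naming, and we assume the Xeon 3.6 is compared with the 3.2 GHz part with 2 MB L3 and with the 32 bit Opteron 250, as the figures suggest. Because the published table holds whole numbers, recomputed values can differ from the percentages above by about a point.

```python
# Queries per second at a concurrency of 20, copied from the results table.
QPS_AT_20 = {
    "xeon_3.6": 136,
    "xeon_3.2_l3_2mb": 113,
    "xeon_3.2": 106,
    "xeon_3.06_l3_1mb": 106,
    "xeon_3.06": 104,
    "opteron_250_32bit": 131,
    "opteron_250_64bit": 147,
}


def percent_gain(new, old):
    """Relative throughput advantage of `new` over `old`, in percent."""
    return (new / old - 1.0) * 100.0


# (label, faster configuration, baseline configuration)
COMPARISONS = [
    ("3.6 vs 3.2", "xeon_3.6", "xeon_3.2_l3_2mb"),
    ("2 MB L3-cache vs none", "xeon_3.2_l3_2mb", "xeon_3.2"),
    ("1 MB L3-cache vs none", "xeon_3.06_l3_1mb", "xeon_3.06"),
    ("Xeon 3.2 vs 3.06 (both with L3)", "xeon_3.2_l3_2mb", "xeon_3.06_l3_1mb"),
    ("Xeon 3.6 vs Opteron 250", "xeon_3.6", "opteron_250_32bit"),
    ("Opteron 64 bit vs 32 bit", "opteron_250_64bit", "opteron_250_32bit"),
]

for label, new, old in COMPARISONS:
    gain = percent_gain(QPS_AT_20[new], QPS_AT_20[old])
    print(f"{label}: {gain:+.1f}%")
```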

If we had published a similar report back in August, the Opteron would have enjoyed a landslide victory. Luckily for Intel, Nocona is very competitive and is about 5% faster than the Opteron 250.

The gigantic - for x86 - L3-cache cannot help the Xeon much. We measured only a 2% to 5% performance boost from the 1 MB L3-cache (at 3.06 GHz), and a 4% to 7% performance boost from the 2 MB L3-cache (at 3.2 GHz). The L3-cache seems to boost performance about as much as a 5% to 6% clock speed increase would - nothing to write home about. So a Xeon "Gallatin" 3.2 GHz with 2 MB L3-cache performs more or less like a Xeon "Gallatin" 3.4 GHz, if such a beast should exist.

A comparison between the 3.2 GHz and 3.06 GHz Xeons shows that CPU clock scaling - given equal cache sizes - is almost perfect, a testimony to how CPU intensive this benchmark is. Clearly, the generalisation "databases are all about I/O" is not accurate for a number of database applications. Read-heavy databases seem to be "all about the CPU".

Using a 64 bit database (DB2 8.1.3) on a 64 bit operating system delivers about 12% to 13% better performance. Since we didn't use more than 2 GB, the most likely explanation is that the software can make use of 16 registers instead of 8. We also tested with a database twice as large and 4 GB of RAM, and the results were very similar.

The performance of the Nocona Xeon compared to the older Xeons is also remarkable. The database doesn't mind the longer pipeline and the absence of the L3-cache. On the contrary, Nocona performs better than its clock speed indicates, leaving the older 3.2 GHz Xeon (with 2 MB L3 cache!) behind by 21% to 22%, while it has only a 13% clock speed advantage over the latter. To be honest, we expected Nocona, with its huge branch misprediction penalty, a result of its extremely long pipeline, to scale much worse.
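The clock scaling remarks above boil down to comparing a clock speed gain with a throughput gain. Here is a rough Python sketch of that arithmetic, using the concurrency 20 row of the results table; the function name `scaling_efficiency` is hypothetical, and a single row is of course subject to the margin of error mentioned earlier.

```python
def scaling_efficiency(clock_new, clock_old, qps_new, qps_old):
    """Return (clock gain, throughput gain, throughput gain / clock gain).

    A ratio near 1.0 means throughput rises in step with clock speed;
    a ratio above 1.0 means something besides clock speed is helping.
    """
    clock_gain = clock_new / clock_old - 1.0
    perf_gain = qps_new / qps_old - 1.0
    return clock_gain, perf_gain, perf_gain / clock_gain


# (label, new GHz, old GHz, new qps, old qps), all taken at a concurrency of 20.
CASES = [
    # Both Gallatin parts have an L3-cache (2 MB versus 1 MB).
    ("Xeon 3.2 (2 MB L3) vs Xeon 3.06 (1 MB L3)", 3.2, 3.06, 113, 106),
    # Nocona versus the fastest older Xeon in the test.
    ("Xeon 3.6 Nocona vs Xeon 3.2 (2 MB L3)", 3.6, 3.2, 136, 113),
]

for label, ghz_new, ghz_old, qps_new, qps_old in CASES:
    clock, perf, ratio = scaling_efficiency(ghz_new, ghz_old, qps_new, qps_old)
    print(f"{label}: clock {clock:+.1%}, throughput {perf:+.1%}, ratio {ratio:.2f}")
```

In both cases the throughput gain exceeds the clock gain, which fits the observation that this benchmark is CPU bound and that Nocona does better than its clock speed alone would suggest.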


46 Comments

  • smn198 - Thursday, December 2, 2004 - link

    Would love to see how MS SQL performs in similar tests.
  • mrVW - Thursday, December 2, 2004 - link

    This test seems foolish to me. A 1GB database? All of that fits in ram.

    A database server is all about being the most reliable form of STORAGE, not some worthless repeat queries that you should cache anyway.

    Transactions, logging... I mean how realistic is it to have a 1GB of database on a system with 4GB of RAM and expensive DB2 software.

    A real e-commerce site like MWave, NewEgg, Crucial could have 20GB per year! Names, addresses, order detail, customer support history, etc.

    Once you get over a certain size, a database is all about disk (putting logging on one disk independent of the data, etc.). The indexes do the main searching work.

    This whole test seems geared to be CPU focused, but only a hardware hacker would apply software in such a crazy way.

  • mrdudesir - Thursday, December 2, 2004 - link

    man i would love to have one of those systems. Great job on the review you guys, its good to know that there are places where you can still get great independent analysis.
  • Zac42 - Thursday, December 2, 2004 - link

    mmmmmmm Quad Opterons......
  • Snoop - Thursday, December 2, 2004 - link

    Great read
  • ksherman - Thursday, December 2, 2004 - link

    is that pic from the 'lab'? (the one on pg 1)
