Thursday, May 25, 2006

More on Intel's Woodcrest performance claim fraud


Examining Intel's Woodcrest performance claims on TPC-C, Floating point, Integer, Java, Web, HPC and application

Today, former Enron execs were found guilty on charges of fraud, false statements and conspiracy. Let's revisit Intel's Woodcrest performance claims. I pointed out that Intel's changing of the Opteron TPC-C benchmark description from 32 bit x86 to 64 bit x64 was a fraud.

Some of the readers said that Intel simply picked up the highest reported TPC-C results for two way servers, Woodcrest and Opteron, regardless of the operating system used. Let's test this assumption on other benchmarks. Let's look at floating point performance.

For SpecFP_rate_2000, the highest reported score for a 2P 2.6GHZ Opteron 285 was 85 under Solaris 10. Guess what? Intel ignored this result and instead used a lower Opteron result for Linux, with a score of 72.9. The 3GHZ Woodcrest scored 83 under Linux, so the 3GHZ Woodcrest (Linux) was about 3% slower than the 2.6GHZ Opteron (Solaris). Also, notice that Intel chose the SPECfp_rate_base2000 scores for comparison. SPECfp_rate_base2000 is for conservative optimization of the benchmark, so it is normally lower than the SPECfp_rate_2000 score. For some strange reason, the DELL 2950 Woodcrest server's optimized SPECfp_rate_2000 score was the same as its conservative SPECfp_rate_base2000 score, which may indicate that there were some issues with how the benchmark was done. Anyway, Intel was shopping for the lowest Opteron scores. This clearly shows that Intel knew different configurations lead to different results. Had Intel chosen the highest score regardless of OS, the 2.6GHZ Opteron would outperform the 3GHZ Woodcrest in SpecFP_rate_2000.
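How much the choice of Opteron baseline moves the result can be checked with a little arithmetic. The sketch below is only an illustration: the helper name `percent_diff` is made up here, and the inputs are the three scores quoted above (83 for the 3GHZ Woodcrest under Linux, 72.9 and 85 for the 2.6GHZ Opteron 285 under Linux and Solaris 10).

```python
def percent_diff(score, baseline):
    """Return how far `score` sits above (+) or below (-) `baseline`, in percent."""
    return (score / baseline - 1.0) * 100.0


# Scores as quoted in the text above (hypothetical variable names).
woodcrest_linux = 83.0
opteron_linux = 72.9      # the result Intel used
opteron_solaris = 85.0    # the highest reported result

vs_low = percent_diff(woodcrest_linux, opteron_linux)
vs_high = percent_diff(woodcrest_linux, opteron_solaris)

print(f"Woodcrest vs lower Opteron score:  {vs_low:+.1f}%")
print(f"Woodcrest vs higher Opteron score: {vs_high:+.1f}%")
```

Against the Linux result the Woodcrest comes out roughly 14% ahead; against the Solaris result it comes out between 2% and 3% behind. Same chip, same benchmark, opposite conclusions, depending only on which Opteron entry is picked.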

According to this report, the 3GHZ Woodcrest (Xeon 5160) will be the topmost chip, and the next will be the 2.66GHZ Xeon 5150. Intel's topmost desktop chip, the Conroe XE, will be at 2.93GHZ. This indicates that a 3GHZ Woodcrest will be a cherry-picked chip. According to this page, the 2.8GHZ Opteron x90 has been in the wild for quite a while now.

For SpecInt_rate_2000, Intel again changed the OS description of the systems. The Woodcrest benchmark was done in 64 bit. The Opteron benchmark was done in 32 bit. This time, Intel changed the Dell PowerEdge 2950's benchmark description from "Microsoft Windows Server 2003 Enterprise x64 Edition" to just "Microsoft Windows Server 2003", making it look the same as the Opteron test. This is just like Intel's Woodcrest TPC-C performance claim fraud.

Let's look at another example: Intel's page on Java performance. Intel used an unpublished Woodcrest test result on a Fujitsu Siemens PRIMERGY server running Windows Server x64 with the BEA JRockit 5.0 P26.4.0 JVM. But for Opteron, Intel decided to use the score from a Tyan S2895 server with two 2.6GHZ Opterons and a SATA drive; the score was only 54490. However, from www.spec.org, we can find a Fujitsu Siemens PRIMERGY server with two 2.4GHZ Opteron 280s (running Linux, JRockit 5.0 P26.0.0) scoring 61155. Again, Intel was shopping for the lowest Opteron scores.

Let's look at yet another example: Intel's page on web performance. An IBM 3GHZ Woodcrest server got a SpecWeb2005 score of 9182. Mysteriously, there are no Opteron scores on this Intel page. However, going to www.spec.org, we quickly found this 2.4GHZ Opteron 280 server achieving a score of 8394. The 3GHZ Woodcrest has a 25% clockspeed advantage but only a 9% performance lead over the 2.4GHZ Opteron.

Let's look at one more example: Intel's page on application performance. For the SunGard ACR test, Intel sent two servers to a company called Principled Technologies. One was an Intel built Opteron server and one was a Woodcrest server. Not surprisingly, the Woodcrest won the benchmark. The details of the benchmark are in this PDF file. The motherboard Intel chose for the Opteron was a UNIWIDE SS232_128_03 model using the Nvidia NF4 chipset. One has to ask why Intel built the Opteron server themselves instead of using a proven server such as SUN's X4200 or the HP DL385. We know server performance does vary from system to system. Not only did Intel build and configure the Opteron server, it also provided the Intel compiled test application "SunGard ACR Intel Demo 2.5". It is unclear how Intel optimized this test application, but a previous report (later removed) said that SunGard ACR is significantly faster on Xeon when compiled with the Intel C++ compiler.

The more we examine Intel's presentations, the more problems we find. Looking at Intel's HPC performance page, pay attention to the fluid dynamics results (Fluent). Intel used a 3GHZ Woodcrest (2530.44) against an IBM 2.2GHZ LS20 Opteron blade (2014.34), with the Woodcrest having a 36.4% clockspeed advantage and a 26% performance lead. However, if you go to the Fluent full results page, you can see there are quite a few Opteron results better than the 2.2GHZ IBM LS20 Opteron blade. In fact, there is a 2.6GHZ IBM LS20 Opteron blade scoring 2404.72. Using this result for the 2.6GHZ Opteron, the 3GHZ Woodcrest would have only a 5% performance advantage, despite a 15% clockspeed advantage. Actually, both results show the Woodcrest being roughly 10% slower than the Opteron clock for clock, in agreement with our previous analysis. One can imagine Intel tabulated the Fluent benchmark results and decided to use AMD's entry level 2.2GHZ Opteron 275 for comparison against the topmost 3GHZ Woodcrest (Xeon 5160). On the same HPC performance page, for "Finite Element Analysis for Crash Simulation", Intel also picked a low score for Opteron, despite the existence of better Opteron results (see user comments).
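The clock-for-clock comparison on the Fluent numbers can be reproduced by dividing each rating by its clock speed. This is a rough sketch, not a rigorous model: the helper `per_clock_ratio` is a name invented for this illustration, and it assumes the Fluent rating scales linearly with clock frequency, which real systems only approximate. The ratings and clock speeds are the ones quoted above.

```python
def per_clock_ratio(score_a, ghz_a, score_b, ghz_b):
    """Ratio of A's rating-per-GHz to B's rating-per-GHz.

    1.0 means both chips do the same work per clock; below 1.0 means
    chip A does less per clock than chip B. Assumes linear clock scaling.
    """
    return (score_a / ghz_a) / (score_b / ghz_b)


# (Fluent rating, clock in GHz), as quoted in the text above.
woodcrest = (2530.44, 3.0)
opteron_22 = (2014.34, 2.2)   # the blade Intel chose
opteron_26 = (2404.72, 2.6)   # the faster blade on the full results page

ratio_22 = per_clock_ratio(*woodcrest, *opteron_22)
ratio_26 = per_clock_ratio(*woodcrest, *opteron_26)

print(f"Woodcrest per clock vs 2.2GHZ Opteron: {ratio_22:.3f}")
print(f"Woodcrest per clock vs 2.6GHZ Opteron: {ratio_26:.3f}")
```

Both ratios land a little above 0.9, so under the linear scaling assumption the Woodcrest delivers somewhat less Fluent rating per GHZ than either Opteron blade, whichever one is used as the baseline.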

So, why did Intel change the Opteron TPC-C description from x86 (32 bit) to x64 (64 bit)? Why did Intel consistently choose the lower Opteron scores for comparison?

The answer is obvious: to create a false impression that the Intel CPU is much better.

Fraud: Any act, expression, omission, or concealment calculated to deceive another to his or her disadvantage. (Merriam-Webster’s Dictionary of Law, 1996).

Intel's behaviour satisfies the above legal definition 100%.

16 Comments:

Anonymous Anonymous said...

I think using the TPC-C benchmark for comparison is a stupid or fraud thing because each benchmark test is absolutely unique in terms of hardware (numbers and types of disks, SCSI controllers, etc.), software, topology and even clients workload. What purposes TPC-C is for?

11:41 PM, May 25, 2006  
Anonymous Anonymous said...

this is really serious. you should email intel / amd executives about this. if you want, you can actually sue them :P.

4:35 AM, May 26, 2006  
Anonymous Anonymous said...

the reason intel uses the lowest opteron scores is because they need to show woodcrest is "30% better" than k8, or they fail to deliver their promises.
i was thinking in AMD's case, and how AMD would likely to use the same trick. but on another thought, Ruiz's style is to obtain victory technologically, not through "shopping for the lowest scores"

7:21 AM, May 26, 2006  
Anonymous Anonymous said...

It's not big news that intel plays nasty with benchmarks results. I'd rather read a third-party review with FULL system specs AND running on a 64-bit OS WITH 64-bit apps. Intel lacks in 64-bit performance against a comparable AMD system.

9:12 AM, May 26, 2006  
Anonymous Anonymous said...

Another "error" @ Intel's "startyourengines" page:

Check:
finite element analysis @ http://www.intel.com/performance/server/xeon/hpcapp.htm

Woodcrest 3GHz: 2.52 x faster than a single core Xeon 3.6 GHz
Opteron 2.2 GHz: 1.98 x faster than a single core Xeon 3.6 GHz

For opteron they used: http://topcrunch.org/benchmark_details.sfe?query=2&id=463

Wallclock time: 51388 seconds

But there is faster result: 49460 seconds. Here: http://topcrunch.org/benchmark_details.sfe?query=2&id=411

So the ratio should have been 2.06 not 1.98

BTW: A 3GHz Opteron would have scored 2.81 (corrected result) versus 2.52 for a 3GHz Woodcrest.

1:03 PM, May 26, 2006  
Blogger Sharikou, Ph. D. said...

BTW: A 3GHz Opteron would have scored 2.81 (corrected result) versus 2.52 for a 3GHz Woodcrest.


Indeed. Look at this result for 4 3GHZ single core Opteron nodes, the wall clock time was 34928. The ratio at 3GHZ is 2.06 * 49460/34928 = 2.91 . This gives a rough estimate.

7:32 PM, May 26, 2006  
Anonymous Anonymous said...

You must be the most annoying AMD-fan I've ever read anything by.

If you think sites are duped and fake the benchmarks then check the new world record in 3D Mark 2001, 2005 and Aquamark, what CPU was it taken with? Merom@3.4 GHz, who is the little brother of Conroe.

I don't believe the numbers where "Conroe is 15% slower than AMD64" and so on when you clearly (well, not you, but everyone that isn't a AMD-fanboy/Intel-blind nerd) see that it even bashes AMD's most expensive (overclocked) FX.

I can say that I got an AMD myself, but I can count and I will not be bought. That's the reason we won't have same opinions.

7:57 AM, May 27, 2006  
Blogger "Mad Mod" Mike said...

To the poster right below Shari:

FYI: Woodcrest/Conroe/Merom are the SAME processor, their only differences are FSB speeds and their names...that's it!

The fact is that most records held by Conroe are ones that either fit into its 4MB cache or are 3DMark scores run at 640x480 (that's the CPU test). Since you probably don't know: at a resolution of 800x600 or below (sometimes 1024x768), performance increases with more cache (by a lot).

Conroe is a bad architecture. It is not good because of its 4-issue wide core or any new instructions; Conroe performs decently purely because of its 4MB cache and DDR2. If you've seen multithreaded benchmarks of Woodcrest, you'd see that once both cores share the cache and each gets 2MB, the advantage dwindles down to nothing.

NGMA is not special. It is a feeble attempt by Intel to throw cache at a crappy architecture and impress people by lying, as usual. Once AMD releases K8L and throws in enough cache, it will be architecture vs. architecture, and you'll see AMD come out ahead by 20% or more in performance-per-watt.

Everybody needs to boycott Intel CPU's and maybe Intel will try and develop a respectful architecture.

11:38 AM, May 27, 2006  
Anonymous Anonymous said...

'x86' and 'x64' are not the way Microsoft distinguishes between 32-bit and 64-bit OS's on its website. 'x64' is 64-bit for sure, but there is no 'x86'. Most likely, there has been a name-change somewhere. Don't forget there are Itanium versions as well and 'x86' has probably been used for both 32-bit and 64-bit non-Itanium versions by HP. In other words, the HP-submission is likely to be also 64-bit, given the configuration.

4:30 AM, May 30, 2006  
Blogger Sharikou, Ph. D. said...

In other words, the HP-submission is likely to be also 64-bit, given the configuration.


No. The HP test was 32 bit. Read the original article which exposed the Intel fraud.

8:10 AM, May 30, 2006  
Anonymous Anonymous said...

I read it. Did you understand my comment?

7:53 AM, May 31, 2006  
Anonymous Anonymous said...

I've sent the following note to the editor of theinquirer.net for the erroneous article http://www.theinquirer.net/?article=32155


In your article "Intel Woodcrest whops Opteron, report", you are confusing software timings with hardware timings. It's an apples and oranges comparison. BEA's JRockit software was run on the yet-unreleased Intel chip, and Sun's JVM software was run on the AMD Opteron. This report cannot be used to quantify relative CPU speed as the same software was not run on the two machines. Also, the timings for the benchmark were not posted for Sun's JVM software on the same Woodcrest machine. The JRockit software may very well be faster than Sun's JVM software on the same machine. It is not unusual for software algorithm improvements to yield huge gains in performance when tuned to a specific piece of hardware. Your article would be more accurately titled "JRockit Lands Top SPECjbb2005 Score".

9:28 PM, June 03, 2006  
Anonymous Anonymous said...

It's pretty stupid to even consider comparing AMD's "score" in Solaris to Intel's score in Linux. The operating system and peripherals have a significant impact on real-world benches and you can't compare different OSes. ONLY THE OS YOU WILL BE USING MATTERS. If one is better in Windows but not Linux, and I'm using Linux, then who cares about Windows?

TPC is *more* valid than other tests, since its real world. Real world is all that matters, since that's what people buy CPUs for. AMD has been using benchmarks to say they're better for years now, and now the AMD-clowners are whining that the benches aren't fair. What *should* be done is have amd build their best system for a test, and let intel build theirs, and then compare the numbers. But since benchmarkers are always biased towards one or another, the benches are always misleading.

8:50 AM, June 11, 2006  
Blogger Sharikou, Ph. D. said...

It's pretty stupid to even consider comparing AMD's "score" in Solaris to Intel's score in Linux.

I agree. Performance results are only comparable when run under similar environments. This is exactly why Intel is a fraud. In numerous cases, Intel tried to alter the system description to make the tests look similar, but they are not. For instance, in the TPC-C comparison, the Opteron result was done in 32 bit, while the Woodcrest was 64 bit and used far more expensive storage. However, Intel changed the Opteron description to 64 bit also, creating a false impression that both tests were 64 bit. In spec_int_rate, Woodcrest was done in 64 bit, and Opteron was 32 bit. Again, Intel tried to hide that difference. We know Opteron runs faster in 64 bit mode, because the number of registers is doubled under 64 bit. For specjbb2005, Intel used a performance enhanced JVM version P26.4, while the Opteron was running R26.0. According to a guy from BEA, each P version is a 5-10% performance increase... I can go on with the list. But Intel's fraud is proven. AMD has challenged Intel to a dual core duel. If Intel is up to the challenge, just answer the duel! Intel is a fraud. Period.

9:05 AM, June 11, 2006  
Anonymous Anonymous said...

Did you remember the i740 case at the past?

About these who aren't:

- benchmark results was incredible (even far more than Matrox GC)

- the REAL results - a poor think!

- AND WHAT BOOM WAS, WHAT :)

I think now is almost the same.

10:32 AM, June 21, 2006  
Blogger Clayton Slade said...

This is interesting:
http://www.intel.com/performance/server/xeon/app.htm

They don't show opteron performance on the first test. I assume the opteron must have been faster.

I have not looked deeper into this one but it certainly looks like another candidate.

2:28 PM, September 15, 2006  
