Saturday, April 08, 2006

Conroe performance claim being busted

FUD is like ghost movies: you don't get scared by seeing a ghost, you get scared by not seeing one -- Sharikou





Recall that Intel's Mooly Eden said Conroe would be 20% faster than AMD's future chips, without even knowing AMD's plans? During the Spring 2006 IDF, Intel set up a Conroe box and an Athlon 64 box, then directed benchmarkers such as Anand to push buttons*, but peeking into the Windows Device Manager of the alleged Conroe wasn't allowed.

During the IDF, I emailed various Intel execs, AMD execs and Anand. I pointed out that such a pre-arranged black-box Intel setup against AMD was unfair, and challenged Intel to lend the Conroe box to Anand for a real drill. However, Intel did not dare to answer such a simple challenge based on the rules of fair competition. The INQ sharply criticised this kind of guerrilla benchmarketing.

In fact, Anand had no way to verify Intel's IDF Conroe setup; the Conroe configuration parameters were provided by Intel. Anand noted that "it looked like Intel had done the unimaginable" with regard to the situation. Nonetheless, Anand assured readers that "there was nothing fishy going on with the benchmarks or the install" based on his trust in Intel's honesty -- which was seriously lacking in past records. Thus we had an interesting situation: Anand relied on Intel's reputation to validate the Conroe setup while Intel relied on Anand's reputation to validate the Conroe scores -- a loop of trust was formed to convince the world + dog.

Now, for the very first time, someone actually got hold of a Conroe chip in their own lab and did some tests. It was a 2.4GHZ Conroe (Link: CPU-Z) against an Athlon 64 overclocked to 2.8GHZ. The overclocked Athlon 64 had a 2.8/2.4 -1 = 16.7% clockspeed advantage.

The following results were obtained by running 32 bit ScienceMark binaries optimized for Intel Pentium:

Molecular Dynamics
Athlon64: 1872.68
Conroe : 2133.38 -- 14% faster

Primordia (Energy calculations for 1 atom)
Athlon64: 1506.83 -- 10% faster
Conroe: 1365.85

Cryptography
Athlon64: 1345.05 -- 26.2% faster
Conroe: 1065.59

STREAM
Athlon64: 1512.55 -- 21.7% faster
Conroe: 1242.94

The above results were for an Athlon overclocked to 2.8GHZ and a Conroe at 2.4GHZ, with the Athlon having a 16.7% clockspeed advantage. For a direct comparison at the same clockspeed, we normalize the Conroe scores to account for the frequency difference. Assuming the best-case scenario in which Conroe scores scale linearly with clock speed, we multiply the Conroe scores by a factor of 2.8/2.4. Thus, with a 2.8GHZ Conroe, we would have

Molecular Dynamics
Athlon 64 2.8GHZ: 1872.68
Conroe 2.8GHZ : 2133.38 * 2.8/2.4 = 2489 -- 32.9% faster

Primordia (Atom)
Athlon64 2.8GHZ: 1506.83
Conroe 2.8GHZ: 1365.85 * 2.8/2.4 = 1593.49 -- 5.7% faster

Cryptography
Athlon64 2.8GHZ: 1345.05 -- 8.2% faster
Conroe 2.8GHZ: 1065.59 * 2.8/2.4 = 1243.19

STREAM *
Athlon64 2.8GHZ: 1512.55 -- 4.3% faster
Conroe 2.8GHZ: 1242.94 * 2.8/2.4 = 1450
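
The normalization above is easy to reproduce. Here is a minimal Python sketch, assuming (as the text does) that Conroe scores scale linearly with clock speed. The function names are illustrative only, and the last digit may differ slightly from the figures above depending on rounding.

```python
# Raw ScienceMark scores quoted above (higher is better):
# (Athlon 64 @ 2.8GHZ, Conroe @ 2.4GHZ)
ATHLON_GHZ = 2.8
CONROE_GHZ = 2.4

scores = {
    "Molecular Dynamics": (1872.68, 2133.38),
    "Primordia": (1506.83, 1365.85),
    "Cryptography": (1345.05, 1065.59),
    "STREAM": (1512.55, 1242.94),
}


def normalize(conroe_score, from_ghz=CONROE_GHZ, to_ghz=ATHLON_GHZ):
    """Scale a score linearly with clock speed (best-case assumption)."""
    return conroe_score * to_ghz / from_ghz


def lead_percent(a, b):
    """Percentage by which the higher of two scores beats the lower."""
    hi, lo = max(a, b), min(a, b)
    return (hi / lo - 1.0) * 100.0


def compare(table):
    """Return (test, scaled Conroe score, winner, lead in %) per test."""
    rows = []
    for name, (athlon, conroe) in table.items():
        scaled = normalize(conroe)
        winner = "Conroe" if scaled > athlon else "Athlon 64"
        lead = lead_percent(athlon, scaled)
        rows.append((name, round(scaled, 2), winner, round(lead, 1)))
    return rows


for name, scaled, winner, lead in compare(scores):
    print(f"{name}: Conroe @ 2.8GHZ = {scaled}, {winner} {lead}% faster")
```

Changing `to_ghz` shows how the comparison would look at any other common clock, under the same linear-scaling assumption.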


ScienceMark is strictly a CPU/memory test: it doesn't involve video or disk I/O, it is basically a raw speed test. ScienceMark is freely available from http://www.sciencemark.org/ for both Windows XP and Windows XP x64.

However, the above results showed a violent CPU performance fluctuation for Conroe, from it being 32% faster to being 8% slower. How can this be explained?

The cause of the Conroe performance fluctuations can't be the types of computation involved. We notice that MolDyn is a floating point computation while Cipher is an integer computation. However, both MolDyn and Primordia are floating point calculations on quantum mechanical properties of matter, yet Conroe's Primordia performance is only 5.7% faster than Athlon 64, a drop of about 27 percentage points from its MolDyn lead.

As we look deeper into ScienceMark, we notice that in the default MolDyn benchmark setting there are only 4 cells with a simple cubic lattice; no more than 32 molecules are involved. The program is basically tracking the momenta and positions of a handful of molecules and computing scattering effects. About 2MB to 4MB of memory is needed. The Primordia calculation for a single Ag (silver) atom with 47 electrons needs just a bit more memory than MolDyn. However, both the Cipher and STREAM tests involve a lot more than 4MB.

The reason why Conroe did so well in the MolDyn test is simple: Conroe has a huge 4MB of unified cache. For single threaded tests that can fit in 4MB*, Conroe can just run out of the cache at very high speed. Since cache misses drastically reduce performance, applications that run entirely out of cache exhibit unrealistic performance numbers.

However, once you go over the 4MB limit, Conroe is slower than Athlon 64 at the same clock. Both the Cryptography and STREAM tests use a lot more memory than Conroe's 4MB cache can hold, and Conroe immediately falls below Athlon 64 on the performance curve.


I can bet on this: if one increases the number of cells in the MolDyn test to 9, thus increasing the working set beyond 4MB, Conroe will perform worse than Athlon 64 at the same clockspeed.

There is another set of results on Conroe and Athlon 64, showing Athlon 64 beating Conroe on WinRAR file compression at the same frequency.

Most games are also cache sensitive: increase Athlon 64's cache by 512KB and you see up to an 8% increase in FPS.

I have added a comparison between Clovertown (double Conroe) and Athlon 64 2800+.

The conclusion is: clock for clock, Athlon 64 will beat Conroe in real application environments that require a working set larger than 4MB, or in other words, larger than Conroe's 4MB cache. This means that in any real multi-tasking or server environment the Core architecture will be an underdog. Even worse, for Intel's shared cache architecture, cache thrashing is a distinct possibility under heavy loads.

Most modern applications need a lot more than 4MB. IE needs at least 50MB when viewing a normal web page (with Flash, JS, DHTML, AJAX..); photo editing apps need around 40MB; Firefox takes 23MB when I use it to view yahoo.com; DivX grabs 23MB even before I open a video...

Frankly, I am really disappointed by Intel's decisions. This gimmick of using a 4MB cache to get unreasonably good scores on the most simplistic tests is cheap from a design point of view but expensive to manufacture. Mooly Eden kept talking about the 4 Meg cache in the technology analyst meeting, and promised to add even more cache; however, the 4MB cache is definitely eating a lot of die area and Intel's limited capacity. It is almost like using Netburst's ridiculous hyperpipeline to pump up GHZ at the expense of power consumption and real performance. I wouldn't accuse Intel of benchmark fraud, but people need to know the 4MB limitation of the Conroe.

So far, Athlon 64 has been tested in 32 bit mode with executables optimized for the Pentium. Athlon 64 gets a 10-40% performance improvement running in 64 bit mode; a benchmark under Windows x64 or Windows Vista should show the real strength of the AMD64 architecture.

As a test drive, I downloaded the 64 bit version of ScienceMark and ran it on my Athlon 64 2800+ (Socket 754, 130nm, 512K L2, at 1.799GHZ stock frequency, with 1GB PC3200 DDR) under Windows XP x64. For the 64 bit MolDyn test, I got a score of 1479.12 ScienceMarks, almost 50% faster than the 32 bit result on the same old PC. I suspect that on a Socket 939 Rev E6 platform with SSE3 support, the 64 bit result will be even better. A reader submitted the 64 bit result for a 2GHZ Athlon 64; you can view the result here.

AMD should work with benchmark creators to ensure that application benchmarks have a working set larger than the cache size of Conroe -- 4MB.

AMD's Rev F socket AM2 will be available for system builders on May 15, 2006. At 65nm, using Stress Memorization Technology co-developed with IBM, AMD will be able to increase clockspeed to 4GHZ. AMD is also working on Z-RAM, a SOI based technology that may increase cache density by 500%.

*For those who question the authenticity of this Conroe benchmark, the person who posted the result had shown at least some CPU-Z screen captures indicating the various properties of the Conroe CPU. Anand wasn't even allowed to look at the Windows device manager; all he did was push some buttons as directed by Intel IDF staff. All the system specs of the Conroe system were provided by Intel. Anand had no verification of the setup. Also, unlike Anand, who receives a lot of ad money from Intel, the person who posted the Conroe results had nothing to gain financially either way. Clearly, this test has more credibility than Anand's. Anand's failure to mention that he was merely a button pusher, and his obvious pumping style, put his credibility very much in doubt.

*Intel touted its 1 cycle SSE execution, but the STREAM results weren't impressive. Henri Richard mentioned Conroe is more like K8.

*To verify this, you can download ScienceMark, then run the MolDyn, Primordia, Cipher and STREAM benchmarks on your own PC. You will find that the default MolDyn test uses very little memory, Primordia uses a bit more, but Cipher and STREAM use a lot more than 4MB. To check this, launch the ScienceMark program, then open the dialog box for running the MolDyn benchmark. At this point the simulation hasn't started and two threads have been created for the task; using a process viewer program, you can note that the memory used by the task so far is about 7MB. Then click the Run Simulation button. You will notice that another thread is created to run the simulation, and the memory used by the whole task now stays below 11MB most of the time, meaning the benchmark thread uses less than 4MB and thus can fit in the 4MB cache of a Conroe.

111 Comments:

Anonymous Anonymous said...

conroe is dual core 32 bit right?

also doesnt combined cache between 2 cores lead to thrashing?

9:32 PM, April 08, 2006  
Blogger Sharikou, Ph. D. said...

So far, there is zero sign from Intel or elsewhere that Conroe can run 64 bit Windows. Cache thrashing is definitely a possibility for heavy multi-tasking loads. Though it may not happen on desktops, I expect Woodcrest will suffer from cache thrashing in applications such as databases.

10:13 PM, April 08, 2006  
Anonymous Anonymous said...

There is absolutely no information on the system used. How can you even post about these benches? At least we knew something of the machines that anand ran; there is NO information on these systems. I say BS.

11:22 PM, April 08, 2006  
Blogger Sharikou, Ph. D. said...

See my comment added at the end of the article. This test is far more credible than Anand's report. Anand didn't have any chance to look inside the IDF Conroe box, he wasn't even allowed to look at the Control Panel of the IDF Conroe box. If Intel put Pentium D overclocked to 5GHZ there, Anand had no way to tell. Such restrictions were reported by others, but Anand hid this critical detail from his readers. This Victor Wang has provided CPU-Z screen captures, which our Anand never got a chance to see himself.

12:01 AM, April 09, 2006  
Anonymous Anonymous said...

Sharikou,
Did you even read the blog you are linking to? Conroe got amazing results. The test you are talking about was with optimizations made for a P4, as Conroe optimizations are unavailable. I'm excited by these results, as you should be. It's a great thing for all of us when Conroe arrives.

1:08 AM, April 09, 2006  
Blogger Sharikou, Ph. D. said...

The conclusion here matches my previous conclusion that Woodcrest will be slower than Opteron 280. Clock for clock, Conroe will be slower than Athlon 64 for applications that can't fit into Conroe's 4MB cache. Please note, the Athlon 64 was running code optimized for Intel Pentium; basically, Athlon 64 was fighting with one hand tied behind its back. Once we have Windows Vista optimized for AMD64, we should see Conroe cluster fragged.

1:19 AM, April 09, 2006  
Anonymous Anonymous said...

http://www.xtremesystems.org/forums/showthread.php?t=95021
this is the original post site of conroe's benchmark, done by Victor Wang.
data show that conroe performed super Pi calculation @ 21 secs.

1:54 AM, April 09, 2006  
Anonymous Anonymous said...

found this website through google. no idea about their credibility though but they sort of seem to know where intel is headed.

http://www.xbitlabs.com/news/cpu/display/20040316084519.html

4:05 AM, April 09, 2006  
Anonymous Anonymous said...

If the science tests are done on a very small portion of data, why do you question the 4MB cache being used as an advantage? If it's small, it should fit in the 1MB A64 cache.
For me the question is: how far can Intel and AMD extend the speed of their CPUs?
If A64 can reach 3Ghz at 90nm, I think that 3.4-3.6 is possible at 65nm.
Intel had problems with Netburst 3.46 at 90nm, and cannot reach 4Ghz at 65nm.
If we assume that Conroe has a 10% advantage over A64, the 3Ghz AMD CPU will beat any Conroe clocked at 2.4GHz.
The P3 design was not able to reach 1.13GHz, so these problems may come back.

I'm waiting for dual core X2 3.0/3.2Ghz X2@65nm.

4:51 AM, April 09, 2006  
Anonymous Anonymous said...

That is a good point. If the data set is small enough (< 1MB), cache size should not matter to either chip. Having said that, those benchmarks do not mean much in real life applications. One way to look at this: Conroe is simply another HYPER version of Pentium 4 (with even larger cache size to boost performance).

My feeling is : INTEL is getting desperate to impress people now.. That is just another gimmick, I am sure there are more..

6:29 AM, April 09, 2006  
Anonymous Anonymous said...

How much does AMD pay you to come up with the garbage you do????

7:44 AM, April 09, 2006  
Blogger Sharikou, Ph. D. said...

First, anything done at IDF was a setup by Intel cowards.

On memory usage, if the working set is smaller than 1MB, then Athlon 64 should also perform extremely fast. However, since 1MB is the normal cache size, most benchmark tests are designed to have a working set greater than 1MB. For MolDyn, the working set is between 2MB and 4MB, which won't fit in Athlon 64's cache but fits in Conroe's 4MB cache. Once we go beyond 4MB, as we saw from the Cipher and STREAM tests, Conroe falls behind immediately.

If I worked for AMD, I would crush the Intel cheaters right now: I would ask the benchmark writers to modify their benchmarks to use a working set greater than 4MB, I would pay good money to get a Conroe chip from folks like Victor Wang, then I would give the Conroe to Tomshardware for tests but with one condition -- excluding those benchmarks that fit in the 4MB cache of Conroe.

8:49 AM, April 09, 2006  
Anonymous Anonymous said...

Thats cool sharky but how do you explain all the other results by victor where an AMD 4600+ OC to 2700mhz is pwn by a conroe running at 2400mhz is that the 4mb L2 cache as well.....

pi_1m=39.6sec(conroe 2.4G=21.25sec)
pcmark05 cputest=5510(conroe 2.4G=6101)
3dmark03 cputest=1073(conroe 2.4G=1413)
3Dmark05 cputest=5582(conroe 2.4G=9015)
scienmark2.0=1454(conroe 2.4G=1550)
pi_fast=46.03/58.44(conroe 2.4G=32.55/40.41)

And why are you

"Frankly, I am really disappointed by Intel. This gimmick of using 4MB cache to get unreasonably good scores on most simplistic tests is so cheap!"

When the test you are pointing to was not done by intel but by Victor wang who is not affiliated with intel at all???

Its like me doing an intel test with an amd optimized software and saying i am ashamed at amd for pulling this scam.

9:12 AM, April 09, 2006  
Blogger Sharikou, Ph. D. said...

0) Those benchmark suites are many individual sub tests in a weighted average.

1) ScienceMark is a pure CPU/memory test, unlike other benchmarks, which involve a lot of other stuff. For instance, in 3D tests, you have the issue of video cards and their drivers, etc, etc.

2)I can easily get the code for ScienceMark and see what kind of memory usage pattern its sub tests have.

3) I chose the best ScienceMark result posted by Victor Wang on the Conroe.

My analysis pinpointed the origin of some of the extreme numbers of the Conroe: the 4MB cache. If you take a Super PI 1M test and conclude Conroe is 2x the speed of Athlon64, you are doped. For the same reason, if you take the default ScienceMark MolDyn test and conclude Conroe is 30% faster, you are also kidding yourself. Both these tests fit in Conroe's 4MB cache, and that's why they look too good. Once you go to more memory intensive ones, Conroe falls like a rock.

That's why a final ScienceMark score is not good enough. You have to dig a bit deeper to analyse the results.

9:29 AM, April 09, 2006  
Blogger netrama said...

It is easy to fool those benchmarks with slightly bigger caches ..why do all those benchmarks use daxpy's and ddot's..
I am sure Athlon will blast the hell out of conroe ..when it comes to I/O and bandwidth related apps like those used in real life !!

12:20 PM, April 09, 2006  
Blogger Sharikou, Ph. D. said...

Memory is a bottleneck in system performance. Consider the Conroe, with a ~ 1333GHZ FSB, the bandwidth is about 10GB/s. To move 1KB of memory into the CPU, it takes 1.0e+3/(10* 1e+9) = 100 ns. For a 2GHZ CPU, this is 200 clock cycles. Then you add a latency of 200 clock cycles for Conroe. That's 400 cycles of idling time. Moreover, if there is other bus activity going on, then the memory request hits a collision and must wait. You can see that an application that can sit inside the cache enjoys a huge advantage.
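
The arithmetic in this comment can be checked with a short Python sketch. The 10GB/s bandwidth, the 2GHZ clock and the 200-cycle latency are the comment's own round figures, not measurements, and the function name is hypothetical.

```python
def transfer_cycles(bytes_moved, bus_bytes_per_sec, cpu_hz):
    """CPU cycles that elapse while bytes_moved cross the bus at peak rate."""
    seconds = bytes_moved / bus_bytes_per_sec
    return seconds * cpu_hz


BUS_BANDWIDTH = 10e9         # ~10GB/s, the round figure used above
CPU_CLOCK = 2e9              # 2GHZ CPU
EXTRA_LATENCY_CYCLES = 200   # assumed memory latency from the comment

move = transfer_cycles(1e3, BUS_BANDWIDTH, CPU_CLOCK)  # 1KB taken as 1000 bytes
total = move + EXTRA_LATENCY_CYCLES

print(f"transfer: {move:.0f} cycles, with latency: {total:.0f} cycles")
```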

1:22 PM, April 09, 2006  
Anonymous Anonymous said...

I think that Conroe is better chip than Athlon64.
Does it have 4MB cache? So what? 1333Mhz FSB - for sure not 1333GHZ ;).
For sure AMD has a lot of advantages over Intel in the future, so AMD chip will be my next:

- HT 3.0 gives 22Gb/s with DDR3 controller
- Licensed ZRAM will give a 2-4-8MB of cache L3
- K8L will bring more computing power
- 64-bit instruction set is better optimized than Intel implementation
- 65nm production
- Video, gravity etc. HT co-processors and other fun stuff will help AMD bringing more power and flexibility to the platform.

CONROE is needed! It must be fast, because AMD will be forced to give us more of this nice things sooner than later...
So in two months we will get miserable 2% of performance difference for DDR2 memory in AM2.
Sounds like monopoly..don't you think?

2:54 PM, April 09, 2006  
Blogger Sharikou, Ph. D. said...

My mistake, of course it wasn't 1333GHZ. Actually, Conroe has a 1066MHZ FSB; only Woodcrest is going to have a 1333MHZ FSB. Neither Conroe nor Woodcrest has enough FSB bandwidth for dual channel DDR2 at 800MHZ, which requires 12.8GB/s. Basically, when the memory starts pumping 12.8GB/s, the 10GB/s FSB of Intel has to say "slow down, too fast".

AMD's HT3.0 is not related to DDR3 at all. AMD64 doesn't have a FSB. The memory channels are dedicated to the CPU. A Rev F AM2 thus has a dedicated 12.8GB/s memory interface. AMD has licensed various memory technologies from RAMBUS to improve its IMC.

As for AM2 performance, we will know all about it next month. INQ reported a rumor of 10+% improvement clock/clock. There was also a report that latency issues on DDR2 will be improved. As for Conroe, right now, the only marginally reliable info is Victor Wang's results. I hope the 3.07V voltage on the CPU-Z was a bug in CPU-Z.

4:28 PM, April 09, 2006  
Anonymous Anonymous said...

If we are going to criticize Intel for pre-benching Conroe, then we better stop guesstimating K8L performance. It's way too far in the future, whereas the Conroe is 3 months away. Also, there is a 100% likelihood that Intel is working on follow-ups to Conroe, so the K8L better bring it.

9:58 PM, April 09, 2006  
Anonymous Anonymous said...

Regarding my previous post about bandwidth and HT 3.0.
Of course - HT is not related to DDR3, but HT 2.0 gives us a 14.4 GB/s link.
To get a performance advantage over the 1333MHz bus and 4MB of cache, AMD must move to a lower-latency/faster memory type (like DDR3)
or set up an L3 cache. Today we have a situation where the 939 chip beats Intel in every field by combining AMD64 strengths, but with Conroe the situation will change until AMD gives us a better processor or more bandwidth.
I have read the Inquirer, but I have seen some AM2 benchmarks made during Cebit in Germany. The 10% improvement is a hoax.
There is no reason why 800MHz DDR2 CL4 memory should be faster than 400MHZ CL2.5 if we talk about the AMD memory controller.
Or maybe the secret is a software driver which will be delivered in June and will unlock some functions in the CPU.
Everything is possible...

10:15 PM, April 09, 2006  
Blogger Sharikou, Ph. D. said...

DDR2 has higher latency(delay), however, the higher bandwidth (from 6.4GB/s to 12.8GB/s) cuts the time of moving memory into cache by half.

Intel's 1333MHZ (64 bit) bus (10GB/s) is not fast enough to handle DDR2 800MHZ (128 bit). AM2 has a total bandwidth of 12.8GB/s (IMC) + 8GB/s (HT).
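
As a rough check on these figures, peak bandwidth is just the transfer rate times the bus width. A minimal Python sketch, assuming ideal peak rates with no protocol overhead (the function name is hypothetical):

```python
def peak_bandwidth_gbs(mega_transfers_per_sec, bus_width_bits):
    """Theoretical peak bandwidth in GB/s, with 1 GB = 1e9 bytes."""
    bytes_per_transfer = bus_width_bits / 8
    return mega_transfers_per_sec * 1e6 * bytes_per_transfer / 1e9


fsb_1333 = peak_bandwidth_gbs(1333, 64)        # the "10GB/s" FSB figure
fsb_1066 = peak_bandwidth_gbs(1066, 64)        # 1066MHZ FSB
ddr_400_dual = peak_bandwidth_gbs(400, 128)    # dual channel DDR 400
ddr2_800_dual = peak_bandwidth_gbs(800, 128)   # dual channel DDR2 800

for label, value in [
    ("FSB 1333", fsb_1333),
    ("FSB 1066", fsb_1066),
    ("DDR 400 dual channel", ddr_400_dual),
    ("DDR2 800 dual channel", ddr2_800_dual),
]:
    print(f"{label}: {value:.1f} GB/s")
```

These are ceilings only; sustained bandwidth on either platform would be lower.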

10:43 PM, April 09, 2006  
Blogger Sharikou, Ph. D. said...

I read some comments saying the 4MB cache is not a gimmick but can be a real boost for performance in some cases. In some sense, that's true, if you are running something like Super Pi 1M for fun. Even for large apps, more cache is better. But considering the complexity of today's applications, a 4MB cache will only bring a small increase in performance, definitely not something like 10-20%. Look at some of today's typical usage:

1) Web Browsers: today's web pages are very rich in content (gifs, jpegs, flash, CSS, JavaScript, DHTML, AJAX), each web page needs a lot of memory.

2) Office: big memory needs.

3) Photo editing: a 4 mega pixel digital photo needs 16MB in memory.

4) Server: huge throughput needs.

5)......

In general, it's hard to find an application environment where 4MB cache can give you a big boost.

11:07 PM, April 09, 2006  
Anonymous Anonymous said...

What version of Science Mark was used ?
If it's using FPU for math, it's highly possible that half of FPU resources on Conroe are idling (ie if the binary is compiled with iP4 as target)

11:30 PM, April 09, 2006  
Anonymous Anonymous said...

4MB is not a gimmick, it's a feature. We are playing a game of semantics. It's like saying an on-board memory controller is a gimmick. Let's stop kidding ourselves.

11:35 PM, April 09, 2006  
Blogger Sharikou, Ph. D. said...

An integrated memory controller is an architectural advantage: it boosts performance for all application loads and it improves system scalability. A 4MB cache only produces visible enhancements for small apps.

11:51 PM, April 09, 2006  
Anonymous Anonymous said...

are games small apps? I mean, the performance difference in FEAR is enormous. The Conroe simply blows AMD away.
How do you explain that?

5:19 AM, April 10, 2006  
Blogger Sharikou, Ph. D. said...

The gaming benchmarks from IDF have zero credibility. Gaming performance depends on a lot of factors such as video card, motherboard, video driver, OS settings, game binary, even hard drive speed...

7:58 AM, April 10, 2006  
Anonymous Anonymous said...

Anand has published an AM2 vs 939 comparison today. The bandwidth is higher, but the overall score is practically the same.
I think that AMD is feeling very confident about Conroe - it looks like they know something more than we do.
The X2 939 specification was complete in January 2005. This is a 15 month advantage over Intel/Conroe and I don't believe that they only achieved a DDR2 controller and Revision F Opterons during this time. Capacity and 65nm conversion are the most important now and Conroe has very low impact till the end of 2006 - Intel is just not ready to switch production to this chip. Conroe's impact can only cause eroding margins and lower AMD income. In 2007 K8L should solve the Conroe problem.

9:19 AM, April 10, 2006  
Anonymous Anonymous said...

cache does improve gaming performance in many games, as was seen in the intel 955 and 965 xtreme edition gaming benchmarks vs 800 series processors with less cache. (Still crap compared to amd though) Sure more cache has its uses, and undeniably gaming is one of them. I have no doubt that when one game can use all of that 4 megs of cache, it will fly. In many other real world apps though, the gains from cache will be minimal. It will be interesting to see intel claim the gaming benchmarks and AMD the real world apps, especially the 64 bit ones. Then their "Serious gamers need intel extreme edition" ads may actually have some merit.

9:20 AM, April 10, 2006  
Anonymous Anonymous said...

If an integrated memory controller is an architectural advantage, then 4MB is also an architectural advantage. Same thing.

9:42 AM, April 10, 2006  
Blogger Sharikou, Ph. D. said...

4MB cache is not an architectural advantage, you can change the cache size without changing the architecture.

10:10 AM, April 10, 2006  
Anonymous Anonymous said...

You only benefit from a larger cache if you are able to make your program fit into it. For memory-bound applications, where the size is too small anyhow, the cache size won't matter much at all. I'm working on real-time computer vision and for our particular applications the speed is directly dependent on the memory speed. Unlike the L1 cache, the L2 cache size hardly matters. However, as the cache size increases more and more applications are likely to fit into the cache. Some applications will make a sharp transition in speed as the cache size is increased, whereas the majority is unaffected. Thus one has to be very careful when interpreting benchmark results.

10:48 AM, April 10, 2006  
Anonymous Anonymous said...

Question: If you claim 50% marketshare for the end of 2006. What you think what will happen next year with FAB30?
Current capacity is only viable for 90nm and AMD doesn't talk about 65nm of FAB30 conversion.
FAB36 will be converted next year. AMD says that new fab is going to produce 20,000wspm in 2008.
So this is 30-50% more than Fab30 alone. But this is it - 21.4%*1.5 that's about 30% of the marketshare.
Without FAB30 converted also to 65nm AMD is going to live with one FAB...

At the beginning of 2008 they are going to have one 65nm facility and one good old 90nm one...

10:56 AM, April 10, 2006  
Anonymous Anonymous said...

There are rumors that AMD will convert FAB30 into 65nm.

10:59 AM, April 10, 2006  
Blogger netrama said...

2 things Intel has used as marketing tools are GHz and Cache size ..they gave up GHz race ..we all know it..
But any basic Comp Arch book explains the relation between cache size and law of diminishing returns ...there is an
upper limit to the cache size and performance. But then a few benchmarks might exclusively take advantage of a bigger cache like
this case with Conroe.

11:03 AM, April 10, 2006  
Anonymous Anonymous said...

I got some interesting sciencemark results on a 3ghz Opteron 146.
32 bit scores
http://img56.imageshack.us/img56/922/sciencemark325zs.jpg
64 bit scores (some of the benches don't work properly in SMx64 apparently)
http://img56.imageshack.us/img56/1935/sciencemark647va.jpg

If conroe performance scales then even a 2.8ghz conroe won't be anything compared to AMD even in apps like the Molecular Dynamics benchmark while running XPx64 or Vista in 64 bit.

11:21 AM, April 10, 2006  
Anonymous Anonymous said...

Sciencemark isn't very cache dependent, a 1.8 GHz Sempron with 128KB of L2 scores within 1% of an A64 with 512KB.

http://www.xbitlabs.com/articles/cpu/display/sempron-3000_7.html

Sciencemark 64-bit also shows big gains on the P4, I doubt it will be any different for Conroe.

How do you explain the 32M SuperPi scores? Its score would require a 3.3+ GHz A64 to match.

And nice of you to ignore BLAS, while mentioning STREAM which is more of a memory test. Conroe's BLAS scores destroy the A64, no surprise considering Conroe has more than double the FP power.

11:55 AM, April 10, 2006  
Anonymous Anonymous said...

what puzzles me is that intel released conroe way back in Feb, and intel decided to release after amd's M2 socket? from their marketing strategy, wouldn't it be logical to release the processor now so that intel can gain market shares lost to amd?
another point Sharikou points out is that conroe's performance was all over the place, from 32.9 faster to 8.2 slower than AMD64 @ 2.8 GHz. if conroe is a better processor in general, then its performance would be pretty consistent.
Michael Dell, intel's biggest customer, didn't even endorse intel's NGMA when conroe's test was out.
maybe conroe has a better architecture, but there is definitely something fishy about it.

12:32 PM, April 10, 2006  
Blogger Sharikou, Ph. D. said...

I think we all understand that if the cache is much smaller than the working set, then doubling the cache doesn't help. The Conroe results proved this point. Even though Conroe has a 4MB cache and Athlon 64 only has 1MB, Athlon 64 still beats Conroe in the Cipher and STREAM tests. However, if you can put the whole working set inside the cache, you see a huge boost.

If you try applications with working set at a size greater than 256KB of the Sempron but smaller than the 1MB of the Athlon 64, you will see a huge difference. However, if the working set is much greater than 1MB, then you see very small difference.

The important concept is the working set, which is always smaller than required total virtual memory. Even with Super PI 32M, the working set is still quite small.

The BLAS benchmark performs matrix multiplications, and the maximum size of the matrix is only 1536x1536. If Conroe did extremely well on this test, that will be additional proof that the 4MB cache makes all the difference. Try a matrix size of 8096x8096, and you will see Conroe slower.

STREAM is not just memory copying. It performs addition and multiplications on large vectors. It's a raw performance test.

1:57 PM, April 10, 2006  
Anonymous Anonymous said...

First, Conroe will have 64-bit extensions. I'm not sure why you question that or act like it won't benefit the same way AMD64 does when running 64-bit code (purely due to more registers, by the way, nothing magic about the architecture).

Second, your conspiracy theories and rabble rousing about Intel's "cowardly" benchmarks are laughable.

Third, Dothan matches/exceeds AMD64 chips on these same benchmarks, clock for clock. Yonah exceeds AMD64 chips on these same benchmarks.

So Intel already has the AMD64 architecture beat, you honestly think that Conroe is a step _back_? Humorous.

The simple fact is that it's destroying, at stock speeds, much higher clocked AMD CPU's, and it's doing it on a pre-release, barely supported motherboard.

It will be fun to come back here and gloat when the 965P motherboards hit the scene, and you're proven completely, utterly wrong.

2:25 PM, April 10, 2006  
Anonymous Anonymous said...

I did a test on amount of memory used in Super Pi. for 1M, the program uses nearly 10MB of memory.

2:30 PM, April 10, 2006  
Blogger Sharikou, Ph. D. said...

The numbers speak for themselves, the conclusion is very clear -- Conroe's 4MB cache is the main trick. Its 14 stage pipe may be less efficient than Pentium M's 10-12 stage pipe.

I certainly hope Intel can get AMD64 correctly implemented. From my MolDyn test under Windows x64, A64 gets a 50% performance boost under native AMD64. I am in the process of getting Windows Vista x64 installed and will do a MolDyn test under that. So far, Intel's EM64T runs slower in 64 bit mode.

2:37 PM, April 10, 2006  
Blogger Sharikou, Ph. D. said...

Super Pi 1M allocates 8MB memory. However, what matters is the size of the working set.

2:41 PM, April 10, 2006  
Anonymous Anonymous said...

my testing has shown a 150% performance increase in the Molecular Dynamics test under 64 bit mode. My screenshots are 8 or so posts up.

2:48 PM, April 10, 2006  
Blogger Sharikou, Ph. D. said...

Thanks for this test result under Windows x64. I notice you had the same problem as I did with the 64 bit version of ScienceMark: the Cipher tests failed to run, and the BLAS test doesn't use SSE. However, the performance boost for MolDyn is very visible.

2:56 PM, April 10, 2006  
Anonymous Anonymous said...

BLAS is specifically designed to not be cache dependent or matrix size dependent. Once BLAS implementations get optimized, you can expect Conroe to sustain close to 4 DP flops/cycle, leaving everything else in x86 land in its dust.

5:45 PM, April 10, 2006  
Blogger Sharikou, Ph. D. said...

Please note, other benchmark scores (such as PCMark) have already been posted in the comments, and were discussed. Please refer to earlier discussions, unless you have anything new to add.

According to Intel's IDF presentations, Conroe's main strength is its ability to analyse and optimize x86 code on the fly. Yet consider the STREAM test, which is basically one line of code: a(i) = b(i) + q*c(i). Conroe's four FP units should have no problem running the data in parallel. Even so, Athlon 64 beat Conroe in this test.
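
For readers who haven't seen STREAM, here is a minimal Python sketch of the triad kernel quoted above. The function names and the byte count (three arrays of 8-byte doubles) are illustrative assumptions, not STREAM's actual source, and Python's list overhead means the absolute number says nothing about any CPU. It only shows the shape of the loop and how bandwidth gets counted.

```python
import time

def triad(b, c, q):
    """STREAM-style triad: a[i] = b[i] + q * c[i]."""
    return [bi + q * ci for bi, ci in zip(b, c)]

def triad_bandwidth_mb_s(n, q=3.0, repeats=3):
    """Best-of-N timing of the triad, reported as MB/s of array traffic."""
    b = [1.0] * n
    c = [2.0] * n
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        triad(b, c, q)
        best = min(best, time.perf_counter() - start)
    bytes_moved = 3 * 8 * n          # assumed: two reads + one write of doubles
    return bytes_moved / max(best, 1e-9) / 1e6

if __name__ == "__main__":
    print(f"{triad_bandwidth_mb_s(1_000_000):.0f} MB/s (interpreter-bound figure)")
```

The kernel does almost no arithmetic per byte moved, so with arrays much larger than the cache the score tracks memory bandwidth rather than FP throughput.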

5:56 PM, April 10, 2006  
Anonymous Anonymous said...

Sniff... Sniff...

I smell a fanboi!

6:13 PM, April 10, 2006  
Anonymous Anonymous said...

This is my first (and last) post ever on your blog.

You accuse Anand of malicious bias and of selling out, even though when you look at things with a historical perspective Anand has always been a proponent of the best performance per dollar contender - which for three years has been AMD.

To quote their most recent article (AMD Socket-AM2 Performance Preview)
"...the fastest X2 chips still outperform the fastest Pentium D chips - but it looks like after three years K8 may finally get some competition for the performance crown." [Emphasis is mine]

Also, if you actually read Anand's article on Conroe, not only did Anand have physical access to the Conroe system (and its Control Panel and driver configuration), they also took the time to correct some of their own self-admitted errors in benchmarking the games.

You can also complain all you want about benchmark sets fitting under 4MB, and even try to make a case that games are dependent upon drivers and external factors (all of which were identical between Anand's two test systems), but if you look at the original and updated benchmark page you'll realize that Anand benchmarked not only games (Quake 4, FEAR, HL2, UT2k4), but Windows Media, Divx, and iTunes encoding times - all of which are real world applications.

7:49 PM, April 10, 2006  
Anonymous Anonymous said...

well at least that poster showed he was using the real "conroe" processor, rather than anandtech.
sharikou got a point here. that anandtech guy was never allowed to look into the device manager or control panel. how did he know he was using the real "conroe" processor? did he take a picture of that processor? did he have a CPU-Z screenshot?
sharikou may be an amd fanboy alright, at least he backs his claims up with something solid.

7:51 PM, April 10, 2006  
Blogger Sharikou, Ph. D. said...

All comments without substance or merely repetition of previously discussed material will be deleted to save space.

I wrote this regarding Anand's credibility. Anand is very suspicious in this Conroe business.

1) Anand didn't see the Conroe CPU, nor did he have the chance to peek into the device manager, nor was he allowed to run CPU-Z. In fact, Anand had no way to tell what kind of CPU was running the Intel box. Victor Wong had his hands on Conroe, he got CPU-Z screens to show.

2) Anand's pumping of Conroe is very suspicious. The title of his article was "Intel regains performance crown", which was false. Conroe is months away, and may be delayed, as such delays have become common with Intel's launches. There is nothing certain about it gaining the performance crown against AMD's future products.

3) Anand willfully hid the fact from his readers that he was merely a button pusher.

As reported by TechReport:
"We used test systems pre-configured by Intel before the show, and we had very limited time to conduct testing or inspect the systems. We were not allowed to look inside of the case of either PC, and the scope of the benchmarks we were allowed to run was defined by Intel. We weren't given the leeway to record our own custom timedemos for the games, and we didn't have enough time to run each test three times or even reboot between the tests....
our role really was confined largely to clicking a few icons and menu items to kick off a test and then writing down the results."

4) After the public suspicion of Intel, I emailed Pat Gelsinger and others at Intel requesting a fairer test environment. Intel invited Anand back for another test. This made Anand even more suspicious. Others didn't get such treatment. Remember, Intel set up the Conroe box and FX box for a very specific purpose. Intel knew their setup had zero credibility, so they desperately needed someone like Anand to smear some credibility on their "setup".

5) If Intel really has something, answer AMD's challenge for a public duel. Intel should follow the basic rules of ethics, even if it's in a time of massive market share and revenue loss.

9:10 PM, April 10, 2006  
Anonymous Anonymous said...

I'm convinced that Conroe does in fact offer some serious competition to AMD's desktop line. Should we be so surprised? Hardly! What is the point of going to 65nm, if it doesn't give you any competitive advantages? If Intel didn't go to 65nm ahead of AMD, Intel would have been out of the game completely. Note, they did compare a not yet released 65nm chip to an existing 90nm offering.

I have another comment on caches. In order to benefit from a larger cache you have to make your application fit into the larger cache, while failing to make it fit in the smaller alternative. If it already fits the smaller cache, there are no serious performance gains to be made. Furthermore, when we go from a single-threaded to a multi-threaded environment the problem becomes more complicated. Cache optimization in a multi-threaded (or multi-application) environment is very difficult, if possible at all. At the end of the day, memory speed is what really counts. This will be evident when more cores are added to the same chip. Computational muscles won't matter much at all. Chips will be measured in terms of bandwidth. Period.

3:25 AM, April 11, 2006  
Anonymous Anonymous said...

Remember SoftRam? That company made $210 million on a piece of software that claimed to double your RAM (of course it couldn't). At least Intel is putting some real cache here.

11:43 AM, April 11, 2006  
Anonymous Anonymous said...

AMD's X2 4600+ has 512KB of L2 cache per core. I wonder if the result would have been different if Vic had overclocked a 4800+ instead.

11:59 AM, April 11, 2006  
Anonymous Anonymous said...

You knew you were asking to be attacked by the zealots didn't you? -sigh-

Fact is we'll just have to wait to see how much of this is real and how much of this is the typical Intel spin.

12:30 PM, April 11, 2006  
Anonymous Anonymous said...

Great piece of research you've done here, I'm 100% convinced that the Intel Conroe benchmarks were fishy and shouldn't be trusted by anybody.

Keep up the good work! :)

1:32 PM, April 11, 2006  
Anonymous Anonymous said...

I agree, to some degree it is almost and I stress almost a waste of time to debate about things of this nature until the actual product is released. Sharikou is to some degree Pro-AMD but in saying that I am running 6 AMD rigs so I am as well.

By releasing benchmarks of this nature Intel is hurting itself more than anything. They will continue to not sell their current Netburst processors and their stock will hurt accordingly. It is much harder to launch products when your company is losing a massive amount of money. Even if the new Conroe beats the K8 in some tests, I would honestly hope that it would! If a company puts a chip under development (PIII or not) for as long as it has you would hope that it can at least beat the competitor in something. The benchmarks that were presented at IDF were obviously designed to make the Conroe look good. In regards to Anand, he jumps bandwagons a ton it seems. He likes to attach the "Performance Crown" to anything that he can as it would seem that he thinks people care about what he thinks. The whole "Antec PSUs do not work with Tyan boards" episode was a total knock on his rep IMO.

1:42 PM, April 11, 2006  
Anonymous Anonymous said...

zetro said...
By releasing benchmarks of this nature Intel is hurting itself more then anything..

I believe that is simply not true. Intel wouldn't have done this if it wasn't beneficial to them. The NetBurst architecture has been dead to the Enthusiast crowd for months - it's not like they're going to lose a lot of P4 sales because the people that read Anandtech and other techie sites all of a sudden stop buying P4s, since the enthusiast crowd makes up so small a portion of the overall sales pie in the first place. But I digress - the point is, this is beneficial to Intel because now even the AMD fanbois have to stop and think - "Well gee, should I buy AMD now because it's better than the P4, or wait a few months and pick up this newfangled Conroe?"

However, just because it's good PR (which is exactly what it is - good PR), and just because the tech sites bit at it (which they did) does NOT mean that the evidence presented is inherently fabricated, false, misleading, or that Intel is desperate in any way shape or form. Intel is the farthest thing from desperate - yes, they are losing a bit of ground to AMD, but let's face it - they have lots of ground to lose before it really hits them where it hurts, the bottom line. It's also not inconceivable that they could design a chip capable of outperforming AMD by 15%+ on the performance side - they definitely pump enough money into R&D, and say what you like about the company's competitive practices, they employ some of the finest engineers in the world and nobody there is stupid.

Anand said it best - "Honestly it doesn't make sense for Intel to rig anything here since we'll be able to test it ourselves in a handful of months." We'll all know what the true story is soon enough.

2:14 PM, April 11, 2006  
Blogger Sharikou, Ph. D. said...

Let's put this in perspective. On March 3, 2006, Intel warned that its 1Q06 revenue will suffer a $1 billion drop from a previously lowered guidance. Regardless of the eventual performance numbers, the early Conroe promise should hurt Intel's current sales. But Intel did the Conroe show anyway. So there must be good reasons for that. I think Intel thought the Conroe benchmark on Anand could benefit Intel in a number of ways:

1) Intel hoped to slow down AMD. Intel must have concluded that people already know Netburst is inferior, therefore it's now more urgent to halt AMD.

2) Intel is short of cash right now. It borrowed $1.6 billion last quarter in the form of convertible debt. After IDF, many financial analysts started to make some positive comments for Intel and negative comments for AMD. This will be very helpful for Intel's effort to borrow money -- Intel filed a mixed shelf registration on March 30.

3) Intel can now use Conroe as an excuse for bad results in the coming quarters.

3:49 PM, April 11, 2006  
Anonymous Anonymous said...

I think the entire thing was meant to regain market share. As said, everyone that was planning on buying an AMD will now wait to see if this Conroe is as good as they say. If billy bob is going to wait to buy this Conroe, then he is not spending the cash and gaining market share for AMD.

5:37 PM, April 11, 2006  
Anonymous Anonymous said...

as good as the conroe is, surely it won't win every single benchmark out there against the a64 x2. in purely memory-intensive benchmarks, the X2 will still have a distinct advantage due to its on-die memory controller. the conroe's large 4mb l2 cache somewhat lessens this advantage but not entirely so we still see the X2 win these specific benchmarks by about 5-8%.

6:34 PM, April 11, 2006  
Anonymous Anonymous said...

Does anyone have any information at all as to what K8L will be? I keep reading "K8L is coming" but no word on what the change will be from the current architecture and cpu.

10:21 PM, April 11, 2006  
Anonymous Anonymous said...

These are interesting conspiracy theories, but that's all they are.

Sure, Intel could have rigged the systems at IDF, but it's just as possible they didn't.

VictorWang has an early model Conroe, which might even be one of the first versions of Conroe.

If I'm wrong please correct me, but did every chip AMD makes work perfectly the first time? Was the Opteron such a completely perfect masterpiece that it didn't go through any problems that needed to be fixed?

And it's amazing, I didn't think it would happen like this, but Intel's releasing competition for AMD, and now a blog is the hangout for hardcore, never rollover, AMD fanboys.

And please, since all this evidence is supposedly rock hard, cite credible sources. No offence to VictorWang, but an enthusiast, while still a good guy, is not the most credible source. And saying that the AM2 is an early model too is BS, because the architecture has been out for 15 months (I think that is what was said, not that it matters, being perfect)

12:07 AM, April 12, 2006  
Anonymous Anonymous said...

"Even for large apps, more cache is better. But considering the complexity of today's applications, 4MB cache will only bring a small increase in performance"

Considering the complexity of today's apps? Yes they are very complex... So what exactly is your point here?

You also make the statement several times that the cache is only useful (in terms of performance) if you can fit the entire program's working set into it. What tosh. The cache only stores selected _pages_ of memory, not entire processes! You may have a PHD but it certainly isn't in Computer Science!

As numerous people have pointed out in the article's comments section, you are simply ignoring history. The P6 architecture from which Yonah, Merom and Conroe are derived consistently receives a performance boost whenever more cache is added. You just seem to ignore it though. You also totally ignore the fact that Intel already has a chip (Yonah) that beats the A64 in the exact benchmarks you say Conroe won't. So what, Intel is taking a step back with Conroe, are they?

This article is just satire. It'll be on the headlines of several tech news sites later today and it will have served its purpose. You will have given every AMD 'boy a couple months more lease of life.

4:07 AM, April 12, 2006  
Anonymous Anonymous said...

Sciencemark authors seem to think that the results show Conroe in a good light

http://www.xtremesystems.org/forums/showthread.php?t=95021&page=37

Comment?

4:23 AM, April 12, 2006  
Anonymous Anonymous said...

While the pipeline may be longer, you fail to address that it is one stage wider than both the Athlon design and Netburst

While the hub-style memory controller will hinder Conroe, AMD isn't doing themselves any favors by keeping the HTT below Intel's.

I've seen hard numbers from both and unless AMD can beat Conroe in bandwidth (in speed @ or above 266~333+) then they have little hope of regaining any ground for quite some time.

Overall @XS many people have shown the IPC of Conroe to be that of A64 + ~30%. Unless they can match that on 90nm (which they cannot) I don't see how the author can justify claiming Conroe to be inferior.

FYI CPUs don't scale on a straight line in terms of performance. If anything Conroe should scale exponentially as the bus speed would need to be increased

6:35 AM, April 12, 2006  
Blogger Sharikou, Ph. D. said...

"The cache only stores selected _pages_ of memory, not entire processes!"


No, the cache operates as cache lines, not pages.
However, when your working set is smaller than 4MB as in the case of MolDyn, the whole thing will be put inside Conroe's cache, because nothing else will be there to evict it.
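
As a rough illustration of the "fits in cache" argument, the sketch below counts cache lines instead of pages. The 4MB figure is the one discussed in this thread; the 64-byte line size and the function names are assumptions for illustration, and the model ignores associativity, conflict misses, and whatever else shares the cache.

```python
CACHE_BYTES = 4 * 1024 * 1024   # Conroe's L2 size as discussed above
LINE_BYTES = 64                 # assumed line size, not stated in the thread

def lines_needed(working_set_bytes, line_bytes=LINE_BYTES):
    """Cache lines required to hold the working set (ceiling division)."""
    return -(-working_set_bytes // line_bytes)

def fits_in_cache(working_set_bytes, cache_bytes=CACHE_BYTES, line_bytes=LINE_BYTES):
    """True if the whole working set can be resident at once in this idealized model."""
    return lines_needed(working_set_bytes, line_bytes) <= cache_bytes // line_bytes

if __name__ == "__main__":
    for mb in (1, 3, 4, 8):
        size = mb * 1024 * 1024
        print(f"{mb}MB working set: {lines_needed(size)} lines, fits={fits_in_cache(size)}")
```

In this idealized picture a working set under 4MB stays fully resident once loaded, while an 8MB one keeps evicting itself; real behaviour sits somewhere in between.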

You guys may want to root for the Intel underdog, but claims that Victor Wong's Conroe might be an older one and the IDF invisible Conroe might be a later one are without factual basis. For me, the IDF Conroe was a piece of unverifiable bench-marketing, as the INQ calls it. The IDF Conroe tests had zero credibility. Look, if Intel could give Anand a Dempsey server last year for an extensive test, if Intel allowed Anand to publish Conroe numbers, why can't Intel lend the Conroe to Anand for a serious drill??

If Intel wanted to say it will have the performance crown, don't be shy, don't hide, prove it! Intel may be losing big chunks of market and it has 100,000 people to feed, but ethics are no less important.

Unless Intel can prove the Conroe numbers at IDF, it should be viewed the same as the Skype 10-way call marketing ploy, where Intel claimed only Pentiums are good enough for Skype.

8:21 AM, April 12, 2006  
Anonymous Anonymous said...

intel has been deceiving the public for quite some time; from linux's compiler to skype's 10-way conference calls. after all these years of deceiving, you think intel suddenly wants to come clean, when they are losing heaps of market share to amd?

on a side note, conroe may perform better than k8L in an x86 environment, but x64 will be the real battleground for both of these processors. when windows vista comes out, intel's imitated x64 technology will be significantly inferior to A64's architecture.

intel is still trying to catch up on the x86 front.. when the war in x64 has already begun

9:45 AM, April 12, 2006  
Blogger Sharikou, Ph. D. said...

Let me address the Pentium-M and Conroe difference a bit. I expect Conroe to have the same or lower IPC than Pentium-M due to its 20-40% longer pipeline (Conroe has 14 stages, Pentium-M 10-12 stages). The lengthening of Conroe's pipe is needed for achieving higher frequency. But it will definitely hurt IPC.

With a 4-issue core and 14 stage pipeline, Conroe will have 56 instructions in flight at any moment, not much less than a Northwood Pentium 4.
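
The arithmetic behind these two figures can be checked in a few lines. This is only a sketch of the numbers quoted above (the function names are made up for illustration), and width times depth is an upper bound on instructions in the pipeline, not a measured value.

```python
def relative_lengthening(new_stages, old_stages):
    """Fractional increase in pipeline depth."""
    return new_stages / old_stages - 1.0

def max_in_flight(issue_width, stages):
    """Upper bound used in the comment above: issue width times pipeline depth."""
    return issue_width * stages

if __name__ == "__main__":
    for old in (12, 10):
        print(f"14 vs {old} stages: {relative_lengthening(14, old):.1%} longer")
    print("4-issue x 14 stages:", max_in_flight(4, 14), "instructions")
```

Against a 12-stage Pentium-M the lengthening works out closer to 17% than 20%; against a 10-stage one it is 40%.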

10:35 AM, April 12, 2006  
Anonymous Anonymous said...

Okay, instead of everyone SPECULATING on the performance of Conroe, let's just wait until it comes out to actually benchmark it. It's funny that everyone either thinks Conroe is crap, or is absolutely amazing. We won't know this answer until Conroe is out. So everyone stop whining.

We don't even know how the AM2 processors will do at this point either. Anandtech has another AM2 chip benchmarked, where performance is 1-2% better or 1-2% slower. So even if it has an onboard memory controller, we'll see how well it works with DDR2 RAM. But then again, these aren't final products yet! So anything can happen!

10:48 AM, April 12, 2006  
Anonymous Anonymous said...

With a 4-issue core and 14 stage pipeline, Conroe will have 56 instructions in flight at any moment, not much less than a Northwood Pentium 4.

And how is that a bad thing? I thought more work per clock was a good thing...

I think you're less credible than Intel is. And wouldn't it be a bad thing to not beat AMD? They're losing because AMD performs better. So the only thing they could do is make better performing chips. And if they lied to us, we'd find out when they release the chip, and then they'd be screwed because people would still buy AMD because it would perform better.

11:29 AM, April 12, 2006  
Blogger Sharikou, Ph. D. said...

Prescott has 93 instructions in flight, but it has lower performance than Northwood (which has 22 stages) at the same clockspeed. Once you have too many instructions going at the same time, you suffer more from mis-predictions.

I am just analysing independently verifiable data. Intel's IDF stuff was an unverifiable blackbox test.

11:40 AM, April 12, 2006  
Anonymous Anonymous said...

It's not the number of instructions that need to be flushed and redone that counts, it's the time it takes to recover from a branch misprediction. The width of the pipe is inconsequential to that. (and btw, a mispredicted branch isn't exactly common on today's OOE architectures)
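
One way to put numbers on this exchange is the textbook stall model below. It is a sketch with hypothetical workload figures (20% branches, 5% mispredicted), and it assumes the recovery penalty is roughly the pipeline depth in cycles, which is a simplification rather than a measured property of any of these chips.

```python
def effective_cpi(base_cpi, branch_fraction, mispredict_rate, penalty_cycles):
    """Base CPI plus average stall cycles per instruction from mispredicted branches."""
    return base_cpi + branch_fraction * mispredict_rate * penalty_cycles

if __name__ == "__main__":
    # Hypothetical workload: 20% of instructions are branches, 5% of those mispredict.
    for penalty in (12, 14, 31):
        cpi = effective_cpi(1.0, 0.20, 0.05, penalty)
        print(f"penalty {penalty:2d} cycles -> CPI {cpi:.2f}")
```

Note that issue width does not appear in the formula at all, only the recovery time does, and that the cost shrinks quickly as the predictor improves.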

12:20 PM, April 12, 2006  
Blogger Sharikou, Ph. D. said...

I recommend that everyone read this elementary introduction to the effects of cache misses (or the lack of them when the memory fits in the cache).

If anyone can argue that Conroe's wild performance fluctuations were not due to the boundary effect of the 4MB cache limit, please present it. Personal beliefs, one or the other, are not going to carry this discussion any further.

6:06 PM, April 12, 2006  
Anonymous Anonymous said...

my venice 2547mhz beats wang's conroe 2.4ghz in sciencemark 32 bit

http://img88.imageshack.us/img88/6536/sc32v25409yd.jpg

1319 venice vs 1308 conroe



http://img62.imageshack.us/img62/8311/sciencemark6ex.jpg

now i know why intel didn't give us sciencemark results of conroe

6:36 AM, April 17, 2006  
Anonymous Anonymous said...

Anyone in their right mind should know that intel does their own secret optimizations and trickery, e.g. mmx, which was a load of garbage, hyperthreading, and ahem, now overloading the cpu with more memory. Come on intel, you can do better than that. When I viewed those results I knew intel was up to their old tricks. Just like PT Barnum said, there's a sucker born every minute. Ha Ha Ha.

1:42 PM, April 18, 2006  
Anonymous Anonymous said...

The working set is not the program size but the number of unique pages touched within some time window. Just because IE uses 50MB of memory doesn't mean its working set is 50MB.

Amount of cache doesn't matter. I say look at gaming benchmarks, productivity benchmarks, and server benchmarks that are based on real life applications. A better architecture is only better in the context of the application you're interested in.

3:39 PM, April 20, 2006  
Blogger Sharikou, Ph. D. said...

Working set is not equal to program size

Of course not. Larger programs do tend to have larger working sets simply due to the increased complexity. With IE as an example, when you view a single page, zillions of different pieces of code get fired up to handle all sorts of stuff: XML, DHTML, JS, AJAX, JPEG, GIF, FLASH, JAVA........

However, in the case of ScienceMark's MolDyn, since the memory requirement is less than 4MB, the working set is smaller than 4MB, and fits in the cache. It's in such cache fitting cases Conroe shines.

4:14 PM, April 20, 2006  
Anonymous Anonymous said...

for those who are going on about 1066 fsb or whatever. i think you need to remember QUAD PUMP!!!!!!

this is not the actual FSB of the cpu.
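
To spell out the quad-pump point: the marketing figure counts transfers, and a quad-pumped bus makes four of them per clock. The sketch below assumes a 64-bit (8-byte) data bus for the bandwidth figure, and the function names are made up for illustration.

```python
def base_clock_mhz(rated_fsb, transfers_per_clock=4):
    """Physical bus clock behind a quad-pumped FSB rating."""
    return rated_fsb / transfers_per_clock

def peak_bandwidth_gb_s(rated_fsb, bus_bytes=8):
    """Theoretical peak: transfers per second times bytes per transfer (assumed 8)."""
    return rated_fsb * 1e6 * bus_bytes / 1e9

if __name__ == "__main__":
    for rating in (800, 1066, 1333):
        print(f"FSB {rating}: clock {base_clock_mhz(rating):.1f} MHz, "
              f"peak {peak_bandwidth_gb_s(rating):.2f} GB/s")
```

So a "1066 FSB" is a bus clocked at roughly 266 MHz, which is where the 266~333 figures mentioned earlier in the thread come from.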

also they are comparing these new intel chips to OLD amd chips.

all you intel fanboys out there.. i hope you wake up someday.

all intel are good for are heating up your room in winter

10:38 PM, April 21, 2006  
Anonymous Anonymous said...

That's a fun read, but I am not sure about your estimation of 4 GHz AMD.
The linked text reads "resulting in a 40 percent increase in transistor performance compared to similar chips produced without stress technology".
We know AMD already realized 24% gains with the earlier stress technique, so another 16% can be expected, not 40%.
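
The "another 16%" figure depends on how the gains combine. The sketch below shows both readings using the 40% and 24% numbers from this comment; which reading the linked text intends is not known, and the function names are illustrative only.

```python
def remaining_gain_additive(total, realized):
    """Gains simply add: total = realized + remaining."""
    return total - realized

def remaining_gain_compound(total, realized):
    """Gains multiply: (1 + total) = (1 + realized) * (1 + remaining)."""
    return (1 + total) / (1 + realized) - 1

if __name__ == "__main__":
    print(f"additive: {remaining_gain_additive(0.40, 0.24):.1%}")
    print(f"compound: {remaining_gain_compound(0.40, 0.24):.1%}")
```

Under the compounding reading the remaining headroom is a little under 13% rather than 16%, so either way it is well short of a fresh 40%.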

7:33 PM, April 25, 2006  
Anonymous Anonymous said...

anonymous, ph.d says...
I will ask you a different question.

Hypothetically, if it turns out when these parts are available to everyone and you are proven wrong about your argument, why would it be true?

12:04 AM, April 27, 2006  
Anonymous Anonymous said...

Anand's credibility not in question?

Well, then I suggest that you all take a close look at slide 51 from the Intel Spring 2006 analyst meeting presentation:
http://media.corporate-ir.net/media_files/webcast/2006/april/intel/PDF/SAM-42606-morning.pdf

A quick JPG snapshot here:
http://img140.imageshack.us/my.php?image=hype3wy.jpg

Oh, the shame...priceless!

10:10 AM, April 27, 2006  
Anonymous Anonymous said...

you also keep on in the article about the benchmarks being optimized for the pentium, and that means the A64 is working with one hand tied behind its back.

I would argue this is in fact the same for the conroe, it's NOT a pentium processor, obviously, so any previous optimizations are null and void for the conroe as well.

5:45 AM, April 28, 2006  
Blogger Sharikou, Ph. D. said...

conroe, its NOT a pentium processor

Conroe is a modified Pentium III.

7:21 AM, April 28, 2006  
Anonymous Anonymous said...

Not really a modified P3. The design is based much more closely on the P3 than the P4, but it's still not the same. Even you use the word modified.

The fact remains that even if it is just a "modified Pentium 3" it doesn't matter, as the code in question is Intel P4 optimized.

9:31 AM, April 28, 2006  
Anonymous Anonymous said...

As this is evidently a subjective opinion of a person biased towards AMD, given without any true, verified facts, I think I'll wait for Toms Hardware to form an opinion, they at least are less subjective than you!

Also, how can you compare a 64-bit system to a 32-bit system? compare apples to apples then we may get a more valid response. Any 64-bit system will outperform a 32-bit one clock for clock

You harp on the fact that the test programs were optimised for P4, Conroe you claim to be a modified P3. Modified P3 != P4, different architecture, instruction sets, etc. So your claims that the tests are Conroe biased are not valid, they are P4 biased. So both Conroe and A64 had their hands tied.

So Sharikou, now that your BS has been published, and I've had my say, I say we wait and see.

-
Numero Uno AMD Fan

3:22 AM, April 29, 2006  
Anonymous Anonymous said...

What do you mean compare a 64bit system to a 64bit system? Are you talking about conroe as not 64bit capable, or that AMD64s consistently perform better? And they run 64bit and 32bit systems?

8:14 PM, April 30, 2006  
Anonymous Anonymous said...

Just a link to further push the issue, although it is coming from a questionable source, they don't speak negatively about AMD, nor do they favor INTEL. It seemed fair.

"Intel Core versus AMD's K8 architecture."

link : http://www.anandtech.com/cpuchipsets/showdoc.aspx?i=2748

7:16 AM, May 01, 2006  
Anonymous Anonymous said...

If history, regardless of core, is anything to go by. Intel fan boys will buy Intel and rant on about their big numbers. AMD fanboys will buy AMDs and rant on about their big numbers.

Some enthusiasts may actually identify their own needs and buy accordingly. Fanboys will never cross over.

Intel will always sell in huge numbers because of Dell (& etc) and Joe "Computer=Pentium" Average.

Signed
Happy with my Desktop Athlon XP 2400 and my Lappy Celeron 2.4G

7:13 PM, May 01, 2006  
Anonymous Anonymous said...

I guess you guys are so in love with AMD and hate Intel for your own reasons. The fact is that AMD is not that great a company. Look at it from all angles and study how AMD started its business in the first place

5:46 PM, May 07, 2006  
Anonymous Anonymous said...

I doubt Intel would set up a false benchmark... imagine the outcry when Anandtech does *real* tests and Conroe fails to meet the expectations or match the original benchmarks.

No enthusiast would trust Intel for *years* - and I doubt Intel wants this.

Before we start slamming this site or that one, lets wait for the real conroe to stand up.

6:37 PM, May 07, 2006  
Anonymous Anonymous said...

Sharikou,

Sorry to say it is apparent that you don't have an architecture or performance background. Integrated memory controller with DDR channels does take up die area. So there is a tradeoff: use the die area for more cache or integrate the MC, or do both when your process technology can accommodate it with good cost. Just wait until the benchmark publications start rolling out. You may likely have to eat your words...

10:56 PM, May 08, 2006  
Blogger Sharikou, Ph. D. said...

Integrated memory controller with DDR channels does take up die area. So there is a tradeoff: use the die area for more cache or integrate the MC

I heard this argument from Intel amateurs. I suggest you look at the die area numbers for the IMC and 1MB of cache...

However, another Intel VP admitted that Intel is still struggling with IMC. It would probably take another 2 years for Intel to figure it out.

11:16 PM, May 08, 2006  
Anonymous Anonymous said...

Intel claims a performance gain over yonah of 20%.
See xtremesystems.org (forum) for test of conroe on msi mobo.
You will find Intel to be pretty spot on with that claim.

6:29 AM, May 10, 2006  
Anonymous Anonymous said...

Now you bring in the whole lengthened pipeline argument.

K7 => K8

Barton's pipeline was increased from 10 to 12 stages in the transformation to Athlon 64 architecture.

Yet performance was increased by roughly 20% clock for clock.

Then you don't even address the whole wider pipeline issue. What's up with that?

Here's my argument. Athlon 64s kill Pentium 4s. Intel fanboys hate this, and they refuse to acknowledge this by countering how they can clock to 4GHz @ 100 deg C. They try to ignore the whole Pentium 4 issue when Pentium Ms were brought to desktops with a socket adapter. Pentium-M kills Athlon64s clock for clock. AMD fanboys won't accept this, but it's quite obvious.

When we went from P3 to Pentium-M, we went with a pipeline increase also yet we saw no IPC issues. Pentium M would NOT be slower than Pentium 3 if it's an improved version. Thus Conroe is an improved Pentium M. Intel would not be taking steps backwards here.

If you want to talk about pipeline increases, look at AMD too. Yet we have a 20% IPC increase. So enough of that.

Seriously, your article here is just the working of a bitter AMD fanboy who refuses to believe pre-release benchmarks.

Yes I run AMD. I have an SD 3700+ that I'm replacing with an Opteron 170 that should be coming in the mail today or tomorrow. My previous system is an AMD. The system before that is an AMD. My last Intel desktop was from 1995.

I'm not a fanboy thankfully, and I have a Pentium-M laptop. I will vouch for the best products on the market, and it seems that Conroe might be the best one out there.

Get with it, and if you want to stick with AM2, go ahead, but if these Conroe numbers are accurate and representative, then it's your own loss if you want to stick with your FX-60 overclocked to 2.8 while I'll be buying a Conroe 2.4

2:02 PM, May 10, 2006  
Anonymous Anonymous said...

Intel Conroe:

E4200 2MB 1.60GHz 800MHz FSB Q4 $169. us
E6100 2MB 1.33GHz 1066MHz FSB Q1 2007 $149. us (35 Watts)*
E6200 2MB 1.60GHz 1066MHz FSB Q4 $179. us
E6300 2MB 1.86GHz 1066MHz FSB Q3 $209. us
E6400 2MB 2.13GHz 1066MHz FSB Q3 $239. us
E6500 2MB 2.40GHz 1066MHz FSB Q4 $269. us
E6600 4MB 2.40GHz 1066MHz FSB Q3 $309. us (65 Watts)
E6700 4MB 2.67GHz 1066MHz FSB Q3 $529. us
E6800 4MB 2.93GHz 1066MHz FSB Q4 $749. us
E6900 4MB 3.20GHz 1066MHz FSB Q4 $969. us
Intel Conroe XE 65nm Dual Core
E8000 4MB 3.33GHz 1333MHz FSB Q4 $1199. us (95 Watts)

AMD Q3:

Athlon 64 FX-62 2MB 2.80GHz 1000MHz HTT $1,236 (120W)
Athlon 64 X2 5000+ 1MB 2.60GHz 1000MHz HTT $696 (95W)
Athlon 64 X2 4800+ 2MB 2.40GHz 1000MHz HTT $645
Athlon 64 X2 4600+ 1MB 2.40GHz 1000MHz HTT $558
Athlon 64 X2 4400+ 2MB 2.20GHz 1000MHz HTT $469
Athlon 64 X2 4200+ 1MB 2.20GHz 1000MHz HTT $365
Athlon 64 X2 4000+ 2MB 2.00GHz 1000MHz HTT $328
Athlon 64 X2 3800+ 1MB 2.00GHz 1000MHz HTT $303

What kind of processor do you recommend?

1:23 AM, May 13, 2006  
Anonymous Anonymous said...

How do you explain that Conroe is showing the same good tendencies in both anandtech's tests and tests done by xtremesystems.org users? Just take a look, Conroe is ~25-30% better in most cases at same clocks?

5:27 AM, May 13, 2006  
Anonymous Anonymous said...

One shouldn’t compare Intel TDP(typical) to AMD TDP(max power)…

7:47 PM, May 20, 2006  
Anonymous Anonymous said...

Sharikou, I think you just said "CONROE has a MUCH BIGGER RAW POWER than AMD".
You said that all applications that fit in cache will work faster. Well, both AMD and Conroe should work in the fastest way when using only cache memory. Since Conroe is much faster than AMD 64 (at the same speed) when using only cache memory, I can draw the following conclusion: Conroe has much more RAW power, so its performance is real. Your article title should be "Conroe is still slower in huge memory dependent programs because it doesn't have an integrated memory controller".

7:19 AM, May 26, 2006  
Anonymous Anonymous said...

Sharikou, running http://athlon64venice.narod.ru/MAS.rar on a Core at 2400MHz showed (Core 2 2.4GHz == Venice 939 2.55GHz, single thread) that Conroe is only 5-6 percent faster than K8 at the same clock. So I don’t believe Core 2 will be more than 10 percent more powerful on average than the current X2.
Time will tell.

1:12 AM, May 28, 2006  
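
The clock equivalence quoted in the comment above (a 2.4GHz Core 2 matching a 2.55GHz Venice in that one single-threaded test) can be turned into a per-clock figure with one division. This is only a minimal sketch: it assumes performance scales linearly with clock speed, which is a simplification, and the function name is made up for illustration.

```python
def per_clock_advantage(chip_ghz: float, rival_equivalent_ghz: float) -> float:
    """Fractional per-clock advantage of a chip running at chip_ghz that
    matches a rival running at rival_equivalent_ghz.

    Assumes (hypothetically) that the rival's performance scales linearly
    with its clock speed.
    """
    return rival_equivalent_ghz / chip_ghz - 1.0


# Figures quoted in the comment: Core 2 at 2.4GHz == Venice 939 at 2.55GHz.
core2_ghz = 2.40
venice_equivalent_ghz = 2.55

advantage = per_clock_advantage(core2_ghz, venice_equivalent_ghz)
print(f"Per-clock advantage under the linear assumption: {advantage:.2%}")
```

Under that assumption the figure comes out at 6.25%, slightly above the 5-6 percent range the commenter quotes.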
Anonymous Anonymous said...

In winrar that is. What about other applications? Not all applications are as memory dependent as WinRar.

2:48 PM, May 29, 2006  
Anonymous Anonymous said...

This whole article is garbage. You cannot do a calculation to make up for a clock speed difference; it isn't linear. You are also working with a pre-production part, if you even really have the part. You ran a very limited number of benchmarks, all of which AMD has had an advantage in in the past. This whole article is biased. You don't even explain the hardware differences between the AMD and Intel CPUs.

1:25 PM, June 06, 2006  
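
The objection above about correcting for clock speed can be made concrete with the Molecular Dynamics scores from the article (A64 at 2.8GHz: 1872.68, Conroe at 2.4GHz: 2133.38). The sketch below is illustrative only. It assumes a higher score is better (the article labels Conroe "14% faster" on these numbers) and that scores scale linearly with clock, which is exactly the assumption the commenter disputes. The function names are hypothetical.

```python
def raw_ratio(score_a: float, score_b: float) -> float:
    """Ratio of two scores as measured, with no clock correction."""
    return score_a / score_b


def per_clock_ratio(score_a: float, ghz_a: float,
                    score_b: float, ghz_b: float) -> float:
    """Ratio of score-per-GHz for two chips.

    This is a linear normalization: it assumes score is proportional to
    clock speed, which real workloads (especially memory-bound ones)
    generally do not guarantee.
    """
    return (score_a / ghz_a) / (score_b / ghz_b)


# Molecular Dynamics scores and clocks quoted in the article.
conroe_score, conroe_ghz = 2133.38, 2.4
a64_score, a64_ghz = 1872.68, 2.8

print(f"As measured:        {raw_ratio(conroe_score, a64_score):.3f}")
print(f"Linear per-clock:   "
      f"{per_clock_ratio(conroe_score, conroe_ghz, a64_score, a64_ghz):.3f}")
```

The two ratios come out at roughly 1.14 and 1.33. The gap between them is entirely a product of the linear-scaling assumption, so the per-clock number is better read as an upper-end estimate than as a measurement.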
Anonymous Anonymous said...

Well, I'm a 41 year old IT Mgr who enjoys being a geek when it comes to computers, and always have been that way. So, I like a good read like your article that tries to cut through the hoopla and see what's being left out. Well, it seems Anandtech has run more benchmarks on the Conroe aka Core 2 here:

http://www.anandtech.com/cpuchipsets/showdoc.aspx?i=2771

In all the benchmarks except one, Core 2 is blazing fast at 2.93GHz and solidly beating the FX62 at 2.8GHz. That one benchmark, the SYSMARK 2004 - Communication Performance test, is interesting, and seems to basically back up what you're saying here in your tests. Go outside of Core 2's cache and performance drops significantly, but it still looks to be a very competitive CPU, and it still wins that benchmark, although it's close.

To be honest, I don't care who is faster. What I want to see is not a clear winner, but real competition. There hasn't been any real competition for the past couple years in the CPU arena. So...I WANT core 2 to be blazing fast, cool running, and high clocking if possible, so AMD has to respond in kind to remain competitive. That's how we get great innovation. Of course, I want it all cheap, too. LOL So, it's all good in my book. I loved the article and your thoughts on the matter...keep up the good work. Peace.

7:02 PM, June 06, 2006  
Anonymous Anonymous said...

The sad thing is you are so hung up on analysing the nitty gritty technology to death - you seem to have no clue that the reality is general end users don't give a hoot about RAM timing, overclocking and all the other super-cr@p you discuss to death. An end user wants the best performing PC in their budget. To make money, companies need to meet the needs of the marketplace where they can make the right returns for their shareholders. Intel got shafted not because their architecture is bad. They got shafted because it didn't meet the needs of the marketplace. If MHz was still the key selling proposition in the market, AMD would be bleeding like a pig. But it isn't - Intel called it wrong...and AMD got it right. Intel deserved to have its ass handed to it.

Looking forward - Core 2 Duo is going to perform well, be priced aggressively, get into smaller/sexier form factors through design enablement, get OEM/channel support and be more marketable as a result of all this. Can AMD match Intel's marketing even if Core 2 Duo is not better but only equal to Athlon - doubtful. Can Intel use their capacity, manufacturing process, pricing and brand to squeeze AMD with their new product line up - sure. Can AMD defend itself - possibly.

The point is all the freakin' pseudo intellectuals on this blog do not decide who wins in the marketplace. 100 million+ customers do. If it were only about the technology, AMD would have 40 billion $s of revenue and Intel would have 5. I'm looking at this blog and all it consists of is any way to be an AMD fanboy. Dude - get a life. Did you buy Intel at 50$ - is that why you're pissed w/ them? Or perhaps you work for AMD and are pretending to write a blog that keeps it real for the planet. By all means, try to keep everyone honest. But for goodness sake stop making it personal when you talk about how Anand makes his money. That's just distasteful, man. And...you don't seem to mention anywhere how Anand has done reviews of how AMD kicked Intel's butt in the past. Or perhaps your theory is AMD paid Anand to tell the truth. But hold on - AMD is actually the corporate reincarnation of the "divine one" so that would be sacrilege. Dude - get a freakin' life...and get some maturity.

And no - my name is not Anand Shimpi and I do not work for Anand Tech.

2:43 AM, June 09, 2006  
Anonymous Anonymous said...

CONrow. That's what they should call it.

11:52 AM, June 17, 2006  
Anonymous Anonymous said...

Hey, where can I buy an 'I am an AMD fanboy' t-shirt?
For someone with a PhD, I'd expect a bit higher level of critical thinking.
I hope that your tunnel vision, aka head-in-the-ass syndrome, will somewhat diminish in a few months. But then again, you'll be waiting for the new AMD offering in Jan 07 (which by the way doesn't look that great, more of the same thing, really) or then for 45nm, etc. from your beloved chip-maker.
Meanwhile, consumers like myself, looking for an upgrade, will vote with their wallets. As part of a large crowd that could care less about this whole AMD vs Intel argument, all I need is price/performance and future upgradeability and compatibility. Conroe looks like a no-brainer.

Look at www.xtremesystems.org, probably one of the more competent forums, specifically as it relates to Conroe. The level of discussion there never degraded to AMD vs Intel fanboys' chat.
- - -
my current rig(future all-linux pc): 2.4 northwood oc'd to 3.2/albatron 865 PE Pro II, etc. still runs absolutely anything for windows out there with very reasonable benchmarks. Alas, not 64 bit.

1:13 PM, June 21, 2006  
Anonymous Anonymous said...

ConRow is just a scam. Did you see INTC not using SYSmark!!!!!!!!!!!!!!!!!!!!!
ConRow is a rehashed PENTIUM III. Only Stu Ped would buy it. If you want an upgradeable platform go AM2!!!! Else go INTC's Corpse duover, with a new processor every month!!! I think you just got blown out of the water.

10:57 PM, June 23, 2006  
Anonymous Anonymous said...

Which will run Mathematica faster clock for clock, Conroe or AMD64 (Athlon, FX)?

9:11 PM, July 13, 2006  
Anonymous Anonymous said...

Just saw some benchies over at Xbitlabs that kinda puts the doubt out of my mind. Looks like Core2 runs faster in some things under XP64, just like AMD. AMD gets a bigger bump in a few things. BUT MANY OF THE ACTUAL APPS ARE SLOWER IN 64 BIT.

Why? Because EVEN freaking Microsoft says in their information that 64 bit is to access more than 4 GB of memory. It is not related to performance at all, per se, except what you get by being able to access more memory. Microsoft is the authority here as both AMD and Intel must live in the Windows world. So the long and the short of it is both processors support 64 bit instructions. End of story. Move on. Nothing else to see here.

9:47 PM, July 29, 2006  
Anonymous Anonymous said...

This is about the dumbest thread I've seen in a long time.
And this Sharikou, PhD guy? Professor of bullshit.

6:54 AM, October 06, 2006  
