Tuesday, May 23, 2006

A cheap solution to frag Conroe in single threaded 32 bit apps

Intel's Conroe/Woodcrest/Clovertown doesn't show any IPC advantage over Athlon 64 under 64bit mode. However, Conroe's 4MB cache is definitely making a difference in 32 bit single threaded loads.
AMD can't do 4x2MB cache, too expensive.

One possible solution is this:
1) dual core, independent clocks like the Turion X2
2) one core has 256KB L2, the other has 6MB L2
3) clock the smaller core at 1.8GHz (about 15 watts)
4) clock the core with the larger cache at 4GHz (75 watts)

I will call this AMM (Asymmetric Multi-core Marchitecture).
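
As a rough sketch, the two cores can be written down as plain data and the totals added up. The class and field names are hypothetical and chosen only for illustration; the cache sizes, clocks and wattages are the figures from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Core:
    name: str         # hypothetical label, not an AMD product name
    l2_kb: int        # L2 cache size in KB
    clock_ghz: float  # target clock
    watts: int        # rough power figure from the list above


small = Core("small", l2_kb=256, clock_ghz=1.8, watts=15)
big = Core("big", l2_kb=6 * 1024, clock_ghz=4.0, watts=75)


def total_watts(cores):
    """Sum the per-core power figures."""
    return sum(c.watts for c in cores)


def total_l2_kb(cores):
    """Sum the per-core L2 sizes in KB."""
    return sum(c.l2_kb for c in cores)


print(total_watts([small, big]))  # 90
print(total_l2_kb([small, big]))  # 6400
```

So the whole package would sit at roughly 90 watts with about 6.25MB of L2 in total.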

This is a cheap solution that can frag Intel to death in most gaming and single threaded 32 bit loads. In 2007, the problem will go away as Windows Vista enters the scene. Under 64 bit mode, K8 is about 10% faster than Conroe at the same clock.

19 Comments:

Anonymous Anonymous said...

I think that FX64 is going to have ZRAM cache L2/L3 so the die size will be the same and SuperPi will be faster...
(e.g. L2 2x2MB + 4MB L3 shared cache).
If you look at K8L photo, it HAS ZRAM so this kind of transistor can be produced now. It does not require any architectural changes. Cache is cache.

2:40 PM, May 23, 2006  
Anonymous Anonymous said...

Sorry, I'm confused... but I'm also a novice, so correct me if I'm wrong. Can AMD even produce a K8 core that runs stable at 4 GHz? And would applications at this time be able to distinguish between the two cores? What if a game made the mistake of running its single thread on the smaller, underclocked core... Otherwise, your proposal sounds very good. The slower core would be more than enough to handle background Windows processes or AI and physics calculations in games, while the faster core would plow through pretty much anything. : D

2:44 PM, May 23, 2006  
Blogger Sharikou, Ph. D. said...

Can AMD even produce a K8 core that runs stable at 4 ghz?

This has to be on a 65nm process. Although AMD is still tuning the 65nm yields, it shouldn't have any problem making small quantities of high end parts.

As for scheduling, without adding logic to the CPU, I think there can be a CPU driver to change scheduling behaviour...
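
Short of a driver, a user-space tool could do something similar by pinning the single-threaded program to the faster core. This is a minimal sketch, assuming Linux (where Python's `os.sched_setaffinity` is available) and assuming the per-core clocks are already known. The helper names and the clock table are hypothetical.

```python
import os


def pick_fast_core(core_clocks_mhz):
    """Return the id of the highest-clocked core.

    core_clocks_mhz maps core id -> clock in MHz (an assumed input;
    a real tool would have to query the hardware for it).
    """
    return max(core_clocks_mhz, key=core_clocks_mhz.get)


def pin_to_core(core_id, pid=0):
    """Restrict a process (0 = the caller) to a single core.

    Returns False on platforms without os.sched_setaffinity.
    """
    if not hasattr(os, "sched_setaffinity"):
        return False
    os.sched_setaffinity(pid, {core_id})
    return True


# Hypothetical AMM part: core 0 at 1.8GHz, core 1 at 4GHz.
clocks = {0: 1800, 1: 4000}
fast = pick_fast_core(clocks)
print(fast)  # 1
```

Calling `pin_to_core(fast)` at the start of a game would keep its thread off the slow core, provided that core id actually exists on the machine.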

2:51 PM, May 23, 2006  
Anonymous Anonymous said...

It took over 5 months to achieve first production 90nm yields, so in Q3 AMD should say more about the 65nm transition and the first parts. But a 4GHz CPU means a 33% faster clock and much more heat.
It is not possible. AMD will survive September 2006 even with a faster Conroe.
The real question is what will happen with Opteron/Woodcrest competition during next months.
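
The 33% figure is consistent with a 3.0GHz baseline, which is an assumption here since no baseline is named. The heat concern follows from the usual first-order rule that dynamic power scales with frequency times voltage squared. The 10% voltage bump below is purely illustrative, not a measured K8 value.

```python
def clock_increase_pct(base_ghz, target_ghz):
    """Percentage clock increase going from base to target."""
    return (target_ghz / base_ghz - 1.0) * 100.0


def relative_dynamic_power(freq_ratio, voltage_ratio=1.0):
    """First-order dynamic power scaling: P ~ f * V^2."""
    return freq_ratio * voltage_ratio ** 2


# Assumed 3.0GHz baseline; 4.0GHz target from the post.
print(round(clock_increase_pct(3.0, 4.0)))  # 33

# Same voltage: power grows with the clock alone.
same_v = relative_dynamic_power(4.0 / 3.0)

# Illustrative 10% voltage increase to reach the higher clock.
higher_v = relative_dynamic_power(4.0 / 3.0, voltage_ratio=1.1)
```

Under these assumptions the power rises by about a third at constant voltage, and by roughly 60% if the voltage also has to go up by 10%.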

2:55 PM, May 23, 2006  
Blogger Sharikou, Ph. D. said...

The real question is what will happen with Opteron/Woodcrest competition during next months.


We saw from the 64 bit benchmarks that Woodcrest is no threat. AMD64 runs 20-40% faster under 64 bit. The issue right now is 32 bit performance on Windows desktop. Of course, as we get into 2007, the problem also goes away because Windows Vista will be available by then and benchmarks will be run under 64 bit.

3:02 PM, May 23, 2006  
Anonymous Anonymous said...

"AMD64 runs 20-40% faster under 64 bit."

Where is that piece of info?

3:23 PM, May 23, 2006  
Anonymous Anonymous said...

I agree... somewhat. #1 is a great idea; we should see it in new releases of desktop X2 processors. #2 may be helpful, but it would be more effective with a halving rule: 2MB, 1MB, 512KB, 256KB would be good step points. #3: no! Stick with #1, or enable sleep mode when the core is not needed. #4: no! Stick with #1. Asymmetry is a good thing, but it is a short-term solution to a problem that will disappear with 64-bit acceptance.

3:30 PM, May 23, 2006  
Anonymous Anonymous said...

My engineering sample of K8 isn't currently running at a clock speed anywhere near 4GHz. It will take until next year to reach that mark...

4:09 PM, May 23, 2006  
Anonymous Anonymous said...

The cheap solution to Frag conroe title goes against the earlier

AM2 party time. Frags readied for CONroe

title. If the former was true you would not have to invent a new architecture to compete!

Come on, at least once say something good about Intel .. it will not hurt too much :)

11:40 PM, May 23, 2006  
Anonymous Anonymous said...

What kind of performance would the K8 get if AMD provided a way to bypass most of the front end and issue MacroOps directly to the execution core, VLIW style? I rather suspect K8-VLIW would beat the living crap out of Conroe. Done right, the front end could be left around for use executing legacy code to maintain backward compatibility with x86 and AMD64.

Offering x86, x86-64 and some kind of VLIW instruction set in the same chip would leave Intel in an even worse position than it is in currently as Intel would be left a complete generation behind in the instruction set race.

11:49 PM, May 23, 2006  
Anonymous Anonymous said...

When AMD bought NexGen to rescue them from the debacle that the K6 was, NexGen had two things going:

They had the capability to issue micro-ops directly, bypassing the CISC ISA.

They wanted to debut their own SIMD instructions ahead of Intel's MMX.

AMD scrapped both... so what makes you think AMD will do it now?

Oh, and what you suggest (clocking cores differently, et al.) is similar to benchmark tuning. What is more, if you can produce one core running at 4GHz, you can make both run at 4GHz and let the processor's power-saving features reduce the clock speed.

You have to take a quick read on your Micro-architecture 101 notes man!

3:48 AM, May 24, 2006  
Anonymous Anonymous said...

"Offering x86, x86-64 and some kind of VLIW instruction set in the same chip"

1) It'll be too complicated. Itanium only tried to put x86 into VLIW, and we all see how complicated it is.

2) Without software support VLIW will perform poorly. Then it isn't worth spending area/power on it.

9:16 AM, May 24, 2006  
Anonymous Anonymous said...

I finally found a Woodcrest 2P benchmark using SPECjbb. A mere 20% improvement over Dempsey.

http://www.realworldtech.com/page.cfm?ArticleID=RWT052306090721&p=9

9:47 AM, May 24, 2006  
Anonymous Anonymous said...

Fuck off and die, AMD fanboy

12:48 PM, May 24, 2006  
Anonymous Anonymous said...

"Fuck off and die, AMD fanboy"

I find such response very funny. Without AMD and people who recognize and support its good works, those Intel fanboys will not only die (intellectually) but also die poor.

Had it not been for AMD's innovations, those people would be buying $1k+ slow Itanium chips, or 4GHz P4s with a 1000W power supply and air conditioning just for the PC.

5:48 PM, May 24, 2006  
Anonymous Anonymous said...

"Had not been AMD's innovations, those people would be buying $1k+ slow Itanium chips, or 4Ghz P4 with 1000W power supply and air conditioning just for the PC."

trust me there will be somebody else if there's no AMD...

7:14 PM, May 24, 2006  
Anonymous Anonymous said...

"trust me there will be somebody else if there's no AMD..."

or not; does the word "Microsoft" ring a bell?

5:44 AM, May 25, 2006  
Anonymous Anonymous said...

"trust me there will be somebody else if there's no AMD..."

And those Intel fanboys will be wishing that "somebody else" to "fuck off and die."

The name is not important. Attitude toward competition is.

9:28 AM, May 25, 2006  
Anonymous Anonymous said...

Competition is one thing, delusion is another.

All I see on this website are AMD fanboys writing off Conroe and predicting Intel's demise. All this theoretical stuff about how AMD will crush Conroe by doing xyz is exactly that - theoretical.

When Conroe is released AMD will have NONE of the above-mentioned features and will most likely lose in the vast majority of benchmarks.

Whether those benchmarks are 'doctored' or 'biased' is up for debate, and I'm sure AMD fans here will vigorously debate that once Conroe is officially released and more concrete benchmarks are available.

4:41 AM, June 21, 2006  
