Friday, June 23, 2006

My Guess on Direct Connect Architecture 2.0

It will be a massive multi-scalar. Basically, you direct-connect all execution pipelines in a multi-core, multi-processor system into one massive pool of computation resources.

You should notice that on AMD's slides, there is no mention of multi-core beyond 2008. Why? The following is my educated guess.

A future AMD64 CPU will be single-core, but with a massive number of pipelines, say, 128. Each pipeline will appear to be a processor to the software. But these pipelines can group and ungroup dynamically within a time quantum. So you can have a single 128-issue processor, or 128 one-issue processors, or one 127-issue and one 1-issue, or 32 four-issue processors, or any combination, depending on the current computation pattern. Not only that, you can mix and match different pipelines, say, 9 FP and 1 integer, or 16 integer and 0 FP.
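
The grouping idea can be sketched in a few lines of Python. This is only a toy model under my own assumptions: a grouping is a list of issue widths, one per logical processor, and it must use up the whole pool. The names `POOL_SIZE` and `is_valid_grouping` are made up for illustration and do not describe any actual AMD design.

```python
# Toy model (hypothetical): a grouping is a list of issue widths,
# one entry per logical processor that the software would see.
POOL_SIZE = 128  # pipeline count assumed in the post


def is_valid_grouping(widths, pool_size=POOL_SIZE):
    """Valid if every width is at least 1 and the widths use the whole pool."""
    return all(w >= 1 for w in widths) and sum(widths) == pool_size


# The groupings mentioned in the post.
examples = {
    "one 128-issue processor": [128],
    "128 one-issue processors": [1] * 128,
    "one 127-issue and one 1-issue": [127, 1],
    "32 four-issue processors": [4] * 32,
}

for name, widths in examples.items():
    print(name, is_valid_grouping(widths))
```

Regrouping within a time quantum would then amount to swapping one valid list for another, which says nothing about how hard that is to do in hardware.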

It doesn't stop here.

With DCA2.0, multiple CPUs can merge their massive number of pipelines to create one giant multi-scalar machine. Imagine eight such CPUs, 1024 pipelines in total...

OK. The above might be my dream, but I bet it's the future direction of CPU architecture.

23 Comments:

Anonymous Anonymous said...

That is quite interesting and I can see it happening. I think 128 issue is a little bit optimistic to say the least considering the number of execution units, buffers, and other resources needed to sustain that. An 8 core 24 issue equivalent is more likely in the near term. Even then, 45nm will be needed.

12:02 PM, June 23, 2006  
Blogger "Mad Mod" Mike said...

I think you've gone crazy shari, lol. But good thinking anyhow.

12:43 PM, June 23, 2006  
Anonymous Anonymous said...

Why 128? why not 256 or 1000?

My point is even guesses have to be based on some information and i'd like to know what info you're basing your guess on.

1:12 PM, June 23, 2006  
Anonymous Anonymous said...

A dream is a dream. You don't need to base a dream on anything. Be creative..

dream on...

2:02 PM, June 23, 2006  
Anonymous Anonymous said...

It's just theorizing what the future will hold. No harm in that.

2:19 PM, June 23, 2006  
Anonymous Anonymous said...

i'd like to know what info you're basing your guess on.

did you read till the last line?

OK. The above might be my dream

2:29 PM, June 23, 2006  
Anonymous Anonymous said...

and the houses will be made of chocolate!

3:59 PM, June 23, 2006  
Anonymous Anonymous said...

Dude, you'd better delete your post! Or you'll get a zillion messages saying you're crazy.

6:07 PM, June 23, 2006  
Anonymous Anonymous said...

If reverse hyper-threading works, Sharikou is being realistic...

8:11 PM, June 23, 2006  
Anonymous Anonymous said...

The Wright Brothers had a dream. So did Henry Ford. Look at what they accomplished. We can't live without those means of transportation.

Doc...just let the dudes at AMD know about your dream/idea. I am sure they will make it happen.

For the rest of you non-believers... join the Intel camp. There's no way they can do 64 bits... Oh Shiza, they did it. What, it has a built-in memory controller? No need, we will pump our chips with a larger L2 cache on our way to 10 GHz. A HyperTransport bus, what? The FSB is not the bottleneck, it's the video card. What does the NX flag do? Sounds interesting, let's copy it. Intel is on its way to 10 GHz.
AMD can keep dreaming.
ROFL

8:22 PM, June 23, 2006  
Blogger Eddie said...

(Eddie/Chicagraf0/Todospara1)

It can get even better: through HyperTransport, you could have other processors that complement the main processor.

And with the modular architecture for cores, you could have a line of, let's say, Opterons with three floating-point pipelines, four, one, etc.

10:46 PM, June 23, 2006  
Anonymous Anonymous said...

Sharikou, I'm sorry, but this is not an "educated" guess.

128 threads? How many applications scale to 128 threads? What sort of memory bandwidth is required to keep 128 (non-trivial) threads busy? Sun's Niagara with 32 threads is only good for a few throughput benchmarks and it's already fallen behind dual-procs with four threads.

128-wide issue? Where in the world are you going to find 128 instructions without dependencies? Even VLIW isn't that crazy. Besides, I think you stated that AMD found three-wide issue to be the sweet spot and that Merom's four-wide is not a threat. BTW, what does this mean about the potential of six-wide issue with "reverse hyperthreading"?
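
The dependency objection can be made concrete with a rough sketch. Assuming unit-latency instructions and unlimited issue width (both simplifications), the average number of instructions per cycle cannot exceed the instruction count divided by the longest dependency chain. The function names and the eight-instruction example are invented for illustration.

```python
def critical_path_length(deps):
    """deps[i] lists the earlier instructions (indices < i) that instruction i needs.

    Returns the length of the longest dependency chain, counted in instructions,
    assuming every instruction takes one cycle.
    """
    depth = []
    for needed in deps:
        depth.append(1 + max((depth[j] for j in needed), default=0))
    return max(depth, default=0)


def ideal_ilp(deps):
    """Upper bound on average instructions per cycle with unlimited issue width."""
    n = len(deps)
    return n / critical_path_length(deps) if n else 0.0


# Made-up example: 0-3 are independent, 4 and 5 each combine a pair,
# 6 combines 4 and 5, and 7 depends on 6.
deps = [[], [], [], [], [0, 1], [2, 3], [4, 5], [6]]
print(ideal_ilp(deps))  # 8 instructions / longest chain of 4 = 2.0
```

In this model a 128-wide machine only pays off when the code offers something close to 128 independent chains; a single serial chain of any length gives an `ideal_ilp` of 1.0.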

9:10 AM, June 24, 2006  
Anonymous Anonymous said...

You have seen SLI and graphics pipelines and think that a CPU is the same as a GPU. Stop putting stupid things in your blog or people will think you are stupid.

9:17 AM, June 24, 2006  
Anonymous Anonymous said...

"128 threads? How many applications scale to 128 threads? What sort of memory bandwidth is required to keep 128 (non-trivial) threads busy? Sun's Niagara with 32 threads is only good for a few throughput benchmarks and it's already fallen behind dual-procs with four threads."

u r talking about today. there will be tmrw as well.

2:32 PM, June 24, 2006  
Anonymous Anonymous said...

" Sharikou, I'm sorry, but this is not an "educated" guess.

128 threads? How many applications scale to 128 threads? What sort of memory bandwidth is required to keep 128 (non-trivial) threads busy? Sun's Niagara with 32 threads is only good for a few throughput benchmarks and it's already fallen behind dual-procs with four threads.

128-wide issue? Where in the world are you going to find 128 instructions without dependencies? Even VLIW isn't that crazy. Besides, I think you stated that AMD found three-wide issue to be the sweet spot and that Merom's four-wide is not a threat. BTW, what does this mean about the potential of six-wide issue with "reverse hyperthreading"?"


My Windows XP system typically runs 550-800 threads depending on what apps are active.

Operating systems of the future will be constructed differently and have orders of magnitude greater use of threads vs. the coarse designs of today. These operating systems, and the apps built to the new frameworks that these operating systems support, will need more threads to perform well.

And then there are large data sets that support data parallelization, which are becoming ever more common.

Not to mention games.

The needs for CPU horsepower are there. But the current operating systems, like old cars, cannot go fast because they were not designed to.

I'm looking forward to a real processor and so far neither AMD nor Intel has delivered. But AMD looks like they have a lot more clues than Intel on what to do and how to get there.

3:02 PM, June 24, 2006  
Anonymous Anonymous said...

"My Windows XP system typically runs 550-800 threads depending on what apps are active."

These threads aren't running AT THE SAME TIME. They're all fine sharing the one hardware thread you have in your computer, aren't they? I suggest you pick up some basic computer science books while you're waiting for a "real processor".

5:22 PM, June 24, 2006  
Anonymous Anonymous said...

"u r talking about today. there will be tmrw as well."

If your "tmrw" is 2050, please feel free to dream. But Sharikou is talking about 2008. That's two years away. Maybe we'll get Vista SP1 by then.

5:30 PM, June 24, 2006  
Anonymous Anonymous said...

"These threads aren't running AT THE SAME TIME. They're all fine sharing the one hardware thread you have in your computer, aren't they? I suggest you pick up some basic computer science books while you're waiting for a "real processor"."

My point was that even today there are many threads that *exist* in a normal system. They *can't* run concurrently well, if at all, because the OS and apps are *not designed for it*.

*When* there is a *redesigned* OS that is actively threaded, then there will be *much more* concurrent thread-based processing.

On the whole, your argument sounds so much like "you'll never need more than 640K"... and we all know how short-sighted that view was.

9:28 PM, June 24, 2006  
Anonymous Anonymous said...

My point was that even today there are many threads that *exist* in a normal system.

Your point is not the point in this discussion. There *exist* 10+ million cars in LA and the freeways still have 12 lanes. The topic is freeways, not cars.

They *can't* run concurrently well, if at all, because the OS and apps are *not designed for it*.

No, they don't run concurrently because there's not much need for them to run concurrently. Most of the 800 threads you see are specialized to do something once in a while.

*When* there is a *redesigned* OS that is actively threaded, then there will be *much more* concurrent thread-based processing.

What are you talking about? You don't even have a clue that Windows supports 64 active threads for servers. Here's a hint: you probably want more application threading, not OS. Unless you're planning on playing 64 games at a time. But that wouldn't leave you much time to dream, would it?

12:41 AM, June 25, 2006  
Anonymous Anonymous said...

"Your point is not the point in this discussion. There *exist* 10+ million cars in LA and the freeways still have 12 lanes. The topic is freeways, not cars."

You seem to be supporting my point:

All those cars and only 12 lanes. Now if there were 128 lanes, many more interesting and efficient traffic flows would be possible.

I thought Windows XP x64 also supported 64 threads as it is based on Server 2003. Is this right?

If so, Microsoft is making some of the early changes that will be necessary for future processors.

1:04 AM, June 25, 2006  
Anonymous Anonymous said...

You must be crazy to think there can be 128-issue arranged as 128 cores. It's a guess but clearly not an educated guess (at least not in computer architecture)!

For one thing, how do you arrange the cache and maintain consistency among 128 issues? Second, how do you share registers, TLB, branch prediction, reorder buffer and commit logic, etc.? Third, how do you lay out these cores such that signals from one end to the other do not take more than 2-3 cycles? Is that even possible (you do know light doesn't travel with infinite speed)?

You might argue that this is a dream for tomorrow! Tomorrow's CPUs won't need all that crap above! Yeah right, and tomorrow's cars can eat junk food and fly. It is fine for science fiction, ... and that's all.
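
The signal-propagation point above can be put into numbers with a back-of-the-envelope sketch. The 3 GHz clock is an assumption chosen for illustration, and the function name is made up. The result is only an upper bound using the speed of light in vacuum; real on-chip wires are slower than this.

```python
C_VACUUM_M_PER_S = 299_792_458  # speed of light in vacuum, metres per second


def max_distance_cm(clock_hz, cycles):
    """Upper bound on how far any signal can travel in `cycles` clock cycles."""
    return C_VACUUM_M_PER_S / clock_hz * cycles * 100


# Assumed 3 GHz clock; the 2-3 cycle budget comes from the comment above.
for cycles in (1, 2, 3):
    print(cycles, "cycle(s):", round(max_distance_cm(3e9, cycles), 1), "cm")
```

At the assumed 3 GHz, even the vacuum bound is about 10 cm per cycle, so the practical limit comes from wire delay rather than from light itself.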

8:12 AM, June 25, 2006  
Anonymous Anonymous said...

How many applications scale to 128 threads?

I'd say that most applications can easily scale to this amount. Most web servers for example come pre-configured to use 100+ threads.

No, they don't run concurrently because there's not much need for them to run concurrently.

If there wasn't any need for them to run concurrently, then we wouldn't put the effort into creating threaded applications... think about what you just said!

Most of the 800 threads you see are specialized to do something once in a while.

Unfortunately you are 'partially' right, and that is Microsoft's way of programming threaded applications (along with many other bad programmers). This just bloats a computer with useless threads.

All in all, I'm very skeptical that AMD will have something usable in the near term (1-2 years) that would actually be useful in parallelizing threads.

But you can't blame Sharikou for having a vision!

8:55 PM, June 25, 2006  
Anonymous Anonymous said...

Sorry, but if someone believes that a 128-issue CPU is possible or that it would make sense, he simply has NO competence in CPU architecture.
There's not a single app that can give a CPU 128 independent dependency chains to execute so many instructions in parallel.
And it's also not possible to reorder such an enormous amount of instruction chains to the execution units at a noteworthy clock frequency.

6:05 PM, July 06, 2006  
