Tuesday, April 25, 2006

Some ideas on utilizing HT 3.0

HT 3.0 is very exciting stuff: it is hot-pluggable, and a link can be up to 1 meter long.

I envision that in the near future, AMD PCs will come with a cache-coherent HT 3.0 port, much like an IEEE 1394 (FireWire) port.

You can then buy application-specific acceleration chips to plug into the port. For instance, there could be a chip for DOOM: you buy the game and it plays fine on its own, but there is also an acceleration chip that you can plug into the HT 3.0 port. The CPU recognizes the chip and delegates some computation, such as fragging a monster, to the dedicated hardware. Take a more realistic case: video encoding. Video encoding is very CPU-intensive when done in software, so an MPEG HT 3.0 chip that plugs into the PC and performs real-time MPEG encoding would be a huge boost. Starting from this idea, each video format could have its own HT 3.0 chip. Or look at Photoshop: you could buy a chip that runs the expensive image-processing algorithms. Artists will love it. For your web server, you can get an XML parser chip... In principle, any piece of software can have a dedicated booster chip, as long as there is a market for it.
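The delegation idea can be sketched in a few lines of Python. This is only a toy model under my own assumptions: the names `plug_in`, `unplug`, `run`, `software_encode` and `chip_encode` are hypothetical, the "chips" are ordinary Python functions, and nothing here talks to real HT hardware. It just shows the shape of the logic: use the booster chip if one is plugged in, otherwise fall back to software.

```python
from typing import Callable, Dict

# Hypothetical registry: task name -> function standing in for a booster chip.
# No real hardware is involved; the "chips" are plain Python callables.
_accelerators: Dict[str, Callable[[bytes], bytes]] = {}


def plug_in(task: str, handler: Callable[[bytes], bytes]) -> None:
    """Simulate hot-plugging a booster chip that handles `task`."""
    _accelerators[task] = handler


def unplug(task: str) -> None:
    """Simulate removing the chip; later calls fall back to software."""
    _accelerators.pop(task, None)


def run(task: str, data: bytes, software: Callable[[bytes], bytes]) -> bytes:
    """Delegate to the chip for `task` if present, else use the software path."""
    handler = _accelerators.get(task, software)
    return handler(data)


def software_encode(frame: bytes) -> bytes:
    # Stand-in for a slow software encoder.
    return b"sw:" + frame


def chip_encode(frame: bytes) -> bytes:
    # Stand-in for a dedicated encoder chip.
    return b"hw:" + frame


if __name__ == "__main__":
    print(run("mpeg", b"frame0", software_encode))  # software path
    plug_in("mpeg", chip_encode)
    print(run("mpeg", b"frame0", software_encode))  # chip path
    unplug("mpeg")
    print(run("mpeg", b"frame0", software_encode))  # software path again
```

The point of the fallback is that the software keeps working with or without the chip, which is what makes a removable booster practical.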

The beauty of the HT 3.0 cards is that you can move them from one PC to another with ease. You can use them on your desktop, or you can take your booster chips on the road and boost your Turion 64 notebook.

Now, suppose you have two AMD64 PCs, both with HT 3.0 ports. You can then buy an HT 3.0 cable to link them together, and immediately you get an SMP box with double the CPU power and double the memory...

That should be a lot of good business for everyone.

8 Comments:

Blogger netrama said...

I think this is where technology is headed... and no, this is not "technology" as the word is used in "vPro Technology"... Intel should be ashamed :-)

I see FPGA (+DSP) based acceleration modules plugging directly into an HT interface. End users could buy 'Load' files to configure these modules. A Load is nothing but a little file that tells the module how to configure itself to accelerate the algorithms of the corresponding software. A gamer might be interested in getting a 'Load' for his favorite game. Studios might want, say, an MPEG-4 encoding 'Load' for the module.

Once a common FPGA platform is established to go into the HT port, there could be 'Load' files for all apps needing intensive math, vector and DSP processing!!

As a rule of thumb, anything in software is at least 10 times slower than in hardware, so you can already imagine the benefits.

“WELCOME” to the fall of the general-purpose processor as we know it!!!

11:34 AM, April 25, 2006  
Anonymous Anonymous said...

I find it rather horrible to have a pile of specialized HT modules that you have to switch between, depending on the task at hand. This is only of use in very specialized environments where one module does the same job all the time.
For multi-purpose use, the way to go is FPGA. Namely, an FPGA in conjunction with profiling software that generates the appropriate FPGA programming during execution of the software you want to accelerate. This is similar to the way the DEC FX!32 x86 emulator once profiled x86 programs and translated them into native Alpha code.
Imagine one profiling run with an H.264 encoder; profiling will slow down the process remarkably. But afterwards, you can execute all these complicated matrix operations and Fourier transforms _in_hardware_, speeding up the process 10x - at least.
I know this works.

11:44 AM, April 25, 2006  
Anonymous Anonymous said...

Hi Doctor,

Can you explain which part of this technology is patented? There are many comments that Intel should start utilizing HyperTransport. Is that even possible considering that AMD owns a patent for a certain aspect of this technology?

Thanks.

12:16 PM, April 25, 2006  
Blogger Sharikou, Ph. D. said...

The ASIC HT 3.0 cards can be mass-produced and will be very cheap. FPGAs are slower, have fewer gates and are more expensive, but they can be reprogrammed. So there will be a trade-off.

12:44 PM, April 25, 2006  
Blogger Sharikou, Ph. D. said...

"There are many comments that Intel should start utilizing HyperTransport."

Non-coherent HyperTransport is open to everyone via the HyperTransport Consortium, but it's for I/O only. Cache-coherent HT (ccHT) is what is needed for what we are discussing here. ccHT is AMD's crown jewel and Intel can't get it, at least not for free.

12:47 PM, April 25, 2006  
Anonymous Anonymous said...

ASIC accelerators make sense in professional use where the main goal is to make a single application faster. But even there it is questionable to a certain degree: think of new versions of an application, each possibly needing a new accelerator module. That could quickly cancel out the low cost of ASICs, and besides, you would have to wait for your module to be produced.
Therefore I clearly prefer the FPGA way - but only if connected with a viable profiler / program generator.

1:37 PM, April 25, 2006  
Anonymous Anonymous said...

For a way to add an FPGA coprocessor to an Opteron socket, see:

http://www.xtremedatainc.com/Products.html

11:55 AM, May 01, 2006  
Anonymous Anonymous said...

One way I can think of to utilize HT 3.0 is that users could connect multiple individual PCs with HT links and immediately get a NUMA machine. And with virtualization and some OS support, it's probably even possible to control one box from another via some "root" access passing through the HT link.

11:16 PM, May 01, 2006  
