Saturday, November 19, 2005

INTEL Sossaman will suffer from cache thrashing and be slower than single core

When INTEL first revealed that its Yonah dual-core chip would have a shared L2 cache, which Paul Otellini touted as an advantage over AMD, I pointed out in a Yahoo message posting that such a design would lead to cache thrashing and totally destroy performance for real-life heavy-duty applications.

The reason is simple: suppose two separate application instances or threads each need to process large memory blocks at a time to operate efficiently. Each will find that the memory it needs is not in the cache, and will therefore evict the other's cached data to make room for its own. With the two cores trampling on each other's cached copies of memory like an ultra-fast ping-pong game, they can do little useful work. Clearly, this will result in performance degradation to a level lower than single core. On a single core, each application runs continuously for a time quantum, say 0.001 second, or 2 million clock cycles assuming a 2 GHz clock, and enjoys the cache almost exclusively during that quantum. Although the next process will wipe out the previously cached memory content, the cost of refilling the cache is amortized over the cycles of the quantum.
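To make the ping-pong argument concrete, here is a toy simulation, not a measurement of Yonah or any real chip. Everything in it is an assumption made for illustration: the cache is fully associative with LRU replacement (real L2 caches are set associative), each thread sweeps its own working set in a loop, and the names `LRUCache`, `miss_rate_interleaved` and `miss_rate_time_sliced` are invented here. Each working set fits in the cache on its own, but the two together do not.

```python
from collections import OrderedDict
from itertools import cycle


class LRUCache:
    """Toy fully associative cache with LRU replacement, sized in cache lines."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()  # least recently used entry comes first
        self.misses = 0

    def access(self, line):
        if line in self.lines:
            self.lines.move_to_end(line)  # hit: mark as most recently used
            return
        self.misses += 1
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)  # evict the least recently used line
        self.lines[line] = True


def looping_stream(base, working_set):
    """Endless sweep over `working_set` consecutive lines starting at `base`."""
    return cycle(range(base, base + working_set))


def two_streams(working_set):
    # Two threads with disjoint address ranges (hypothetical base addresses).
    return [looping_stream(0, working_set), looping_stream(1_000_000, working_set)]


def miss_rate_interleaved(capacity, working_set, accesses):
    """Two cores sharing one cache: their accesses alternate one by one."""
    cache = LRUCache(capacity)
    streams = two_streams(working_set)
    for n in range(accesses):
        cache.access(next(streams[n % 2]))
    return cache.misses / accesses


def miss_rate_time_sliced(capacity, working_set, accesses, quantum):
    """One core: each thread owns the cache for `quantum` accesses at a time."""
    cache = LRUCache(capacity)
    streams = two_streams(working_set)
    for n in range(accesses):
        cache.access(next(streams[(n // quantum) % 2]))
    return cache.misses / accesses


if __name__ == "__main__":
    # Assumed sizes: 64-line cache, 48-line working set per thread.
    print("shared, interleaved:", miss_rate_interleaved(64, 48, 20000))
    print("single core, sliced:", miss_rate_time_sliced(64, 48, 20000, 2000))
```

In this model the interleaved run misses on every access, because the combined sweep of 96 lines is the worst case for a 64-line LRU cache, while the time-sliced run only pays for refilling the cache after each switch. If the two working sets together fit in the cache, the interleaved case stops thrashing, so the result depends entirely on the sizes assumed.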

Cache was introduced to cure the problem of memory latency: in INTEL CPUs, it takes nearly 400 clock cycles to bring memory to the CPU, and during this time the CPU must wait. With AMD's on-die memory controller, the latency is reduced to 30% of that or less. The cache cures this latency problem somewhat by exploiting the locality of applications and copying blocks of memory into the core beforehand. When cache thrashing happens, the cache is made pretty much useless.
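The cost of losing the cache can be estimated with the standard average-access-time formula: hit time plus miss rate times miss penalty. The sketch below plugs in the 400-cycle memory latency quoted above. The 10-cycle hit time and the two miss rates are assumptions picked for illustration, not measured values, and `average_access_cycles` is a name invented here.

```python
def average_access_cycles(hit_cycles, miss_rate, miss_penalty_cycles):
    """Average cycles per memory access: hit time + miss rate * miss penalty."""
    return hit_cycles + miss_rate * miss_penalty_cycles


HIT_CYCLES = 10       # assumed cache hit time, not a measured figure
MISS_PENALTY = 400    # memory latency in cycles, as quoted in the post

# Hypothetical miss rates: a well-behaved cache vs. a fully thrashing one.
healthy = average_access_cycles(HIT_CYCLES, 0.02, MISS_PENALTY)
thrashing = average_access_cycles(HIT_CYCLES, 1.0, MISS_PENALTY)

if __name__ == "__main__":
    print(f"healthy cache:   {healthy:.0f} cycles per access")
    print(f"thrashing cache: {thrashing:.0f} cycles per access")
```

Under these assumed numbers, an access costs about 18 cycles on average with a healthy cache and 410 cycles when every access misses, which is the sense in which thrashing makes the cache useless.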

zdnet.com has reported similar findings by Microsoft on SQL Server performance on hyperthreading INTEL CPUs. The phenomenon of lower performance with hyperthreading turned on was well known; what is interesting is the analysis. A Microsoft developer argues that thrashing of the shared L1 and L2 caches among the virtual cores is the culprit.

When you share, you also fight, just as two kids sharing a single toy will fight over it. Thrashing is a well-known problem in paged-memory multitasking systems: when it happens, too many tasks compete for memory and the system spends most of its time paging between memory and disk. From a user's view, the system basically comes to a halt. With shared-cache multi-core designs like INTEL Yonah, multiple cores fight for cache. Cache thrashing will become a common phenomenon inside INTEL CPUs.

Sossaman is a 32-bit server chip derived from Yonah, so we can expect Sossaman to perform worse than a single core in a lot of heavy-duty server applications.

1 Comment:

Anonymous said...

It really depends on the cache associativity. It will not thrash if it is at least a 2-way set associative cache.

3:10 PM, April 24, 2006  
