Monday, November 3, 2008

Performance Analysis for Core 2 and K8

One of the interesting factors is the substantial difference in L2 miss rates between the two cache designs, the cost of which depends on the underlying memory subsystems. Intel's unloaded memory latency is around 55-60ns, while AMD's is closer to 40ns and should also scale much better under load. Unfortunately, there is no data available on the loaded latency for the respective CPUs, but a reasonable guess would be that Intel's loaded latency is 40-70% higher than AMD's. Given that guess, we can come close to estimating the relative latency contribution from L2 misses. Intel has half the number of L2 misses per thousand instructions retired (2 vs. 4), but each miss costs at least 40% more. That implies that Intel's average memory latency contribution from L2 misses is about 70% of AMD's (0.5 x 1.4), or 85% if we assume Intel's loaded memory latency is 70% higher (0.5 x 1.7). Of course, this looks at only one aspect of the situation: it ignores the impact of the L1 caches, where AMD tends to have an advantage due to larger capacity. But it's certainly an area that could contribute to the performance difference between the K8 and the Core 2, and it definitely does contribute to the power differences.
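The back-of-the-envelope estimate above can be written out as a short Python sketch. The function and variable names are mine, and the 1.40 and 1.70 latency ratios are the guessed range from the paragraph, not measured values; only the miss counts (2 vs. 4 per thousand instructions) come from the data discussed here.

```python
def relative_miss_cost(mpki_a, mpki_b, latency_ratio_a_over_b):
    """Ratio of CPU A's L2-miss latency cost to CPU B's.

    Cost per thousand instructions is modeled as
    (misses per thousand instructions) x (latency per miss),
    so the ratio of costs is the product of the two ratios.
    This ignores overlap between misses and any L1 effects.
    """
    return (mpki_a / mpki_b) * latency_ratio_a_over_b


# L2 misses per thousand instructions retired, as quoted in the text.
INTEL_MPKI = 2.0
AMD_MPKI = 4.0

# Assumed (guessed) range for Intel's loaded latency relative to AMD's.
low = relative_miss_cost(INTEL_MPKI, AMD_MPKI, 1.40)
high = relative_miss_cost(INTEL_MPKI, AMD_MPKI, 1.70)

print(f"Intel's L2-miss cost is roughly {low:.0%} to {high:.0%} of AMD's")
```

Because the model is a simple product, the result moves linearly with the latency guess: every extra 10% of assumed Intel latency adds 5 percentage points to the ratio.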
