AbstractThe widening gap between processor speed and main memory speed has generated interest in compile-time optimizations to improve memory locality. The possibility that processors will continue to outpace memory systems raises the question of whether these techniques can be used to produce ever higher degrees of locality. In this article, we discuss techniques that can be used to produce scalable locality": locality that can be adjusted for any ratio of processor and memory speeds. We identify a class of calculations for which existing techniques do not generally produce scalable locality, and give an algorithm for obtaining scalable locality for a subset of this class. We include empirical evidence that this transformation can provide dramatic speedups when processor speed is far higher than memory speed.
RightsThis Item is protected by copyright and/or related rights.You are free to use this Item in any way that is permitted by the copyright and related rights legislation that applies to your use.For other uses you need to obtain permission from the rights-holder(s).