Optimizing graph algorithms for improved cache performance
[1] Mithuna Thottethodi, et al. Nonlinear array layouts for hierarchical memory systems, 1999, ICS '99.
[2] James R. Larus, et al. Cache-conscious structure layout, 1999, PLDI '99.
[3] Ronald L. Rivest, et al. Introduction to Algorithms, 1990.
[4] Viktor K. Prasanna, et al. Optimizing graph algorithms for improved cache performance, 2002, IEEE Transactions on Parallel and Distributed Systems.
[5] Michael Brenner, et al. Multiagent Planning with Partially Ordered Temporal Plans, 2003, IJCAI.
[6] Viktor K. Prasanna, et al. Tiling, Block Data Layout, and Memory Hierarchy Performance, 2003, IEEE Trans. Parallel Distributed Syst.
[7] H. T. Kung, et al. I/O complexity: The red-blue pebble game, 1981, STOC '81.
[8] Sabih H. Gerez, et al. Algorithms for VLSI design automation, 1998.
[9] Yves Robert, et al. Loop partitioning versus tiling for cache-based multiprocessors, 1998.
[10] Viktor K. Prasanna, et al. Cache-friendly implementations of transitive closure, 2001, Proceedings 2001 International Conference on Parallel Architectures and Compilation Techniques.
[11] Wilson C. Hsieh, et al. Impulse: Memory system support for scientific applications, 1999, Sci. Program.
[12] Jeremy D. Frens, et al. Auto-blocking matrix-multiplication or tracking BLAS3 performance from source code, 1997, PPOPP '97.
[13] Chau-Wen Tseng, et al. Data transformations for eliminating conflict misses, 1998, PLDI.
[14] Richard E. Ladner, et al. The influence of caches on the performance of heaps, 1996, JEAL.
[15] Sartaj Sahni, et al. Data Structures, Algorithms and Applications in Java, 1998.
[16] James R. Larus, et al. Making Pointer-Based Data Structures Cache Conscious, 2000, Computer.
[17] Peter M. Kogge, et al. The Characterization of Data Intensive Memory Workloads on Distributed PIM Systems, 2000, Intelligent Memory Systems.
[18] Nikil D. Dutt, et al. Memory data organization for improved cache performance in embedded processor applications, 1997, TODE.
[19] Sandeep Sen, et al. Towards a theory of cache-efficient algorithms, 2000, SODA '00.
[20] Jack J. Dongarra, et al. Automatically Tuned Linear Algebra Software, 1998, Proceedings of the IEEE/ACM SC98 Conference.
[21] Mihalis Yannakakis, et al. Graph-theoretic methods in database theory, 1990, PODS '90.
[22] Matteo Frigo, et al. Cache-oblivious algorithms, 1999, 40th Annual Symposium on Foundations of Computer Science.
[23] Dimitri P. Bertsekas, et al. Data Networks, 1986.
[24] Mateo Valero, et al. Eliminating cache conflict misses through XOR-based placement functions, 1997, ICS '97.
[25] Monica S. Lam, et al. The cache performance and optimizations of blocked algorithms, 1991, ASPLOS IV.
[26] Sartaj Sahni, et al. A Blocked All-Pairs Shortest-Path Algorithm, 2000, SWAT.
[27] Peter Sanders, et al. Fast priority queues for cached memory, 1999, JEAL.
[28] David A. Patterson, et al. Computer Architecture: A Quantitative Approach, 1969.
[29] Peter J. Varman, et al. Optimal prefetching and caching for parallel I/O systems, 2001, SPAA '01.
[30] Viktor K. Prasanna, et al. Analysis of memory hierarchy performance of block data layout, 2002, Proceedings International Conference on Parallel Processing.
[31] David A. Patterson, et al. Computer architecture (2nd ed.): a quantitative approach, 1996.
[32] Sally A. McKee, et al. Caches as filters: a new approach to cache analysis, 1998, Proceedings. Sixth International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems.
[33] Todd M. Austin, et al. The SimpleScalar tool set, version 2.0, 1997, CARN.
[34] Viktor K. Prasanna, et al. Dynamic data layouts for cache-conscious factorization of DFT, 2000, Proceedings 14th International Parallel and Distributed Processing Symposium. IPDPS 2000.
[35] M. Kanehisa, et al. Extraction of correlated gene clusters by multiple graph comparison, 2001, Genome informatics. International Conference on Genome Informatics.