P Sadayappan

  • Professor, Computer Science & Engineering
  • 591 Dreese Laboratories
    2015 Neil Ave
    Columbus, OH 43210
  • 614-292-0053

Honors

  • May, 2016

    Joel & Ruth Spira Excellence in Teaching Award.

  • April, 2015

    Outstanding Teaching Award.

  • January, 2008

    Outstanding Teaching Award.

  • January, 2008

    Lumley Research Award.

  • January, 2006

    Outstanding Service Award.

  • January, 2004

    Best Paper Award.

  • January, 2003

    Best Paper Award.

  • January, 2002

    Lumley Research Award.

  • January, 1999

    Outstanding Service Award.

  • January, 1997

    Lumley Research Award.

Journal Articles

2015

  • Venmugil Elango, Naser Sedaghati, Fabrice Rastello, Louis-Noël Pouchet, J. Ramanujam, Radu Teodorescu, P. Sadayappan, 2015, "On Using the Roofline Model with Lower Bounds on Data Movement." ACM Transactions on Architecture and Code Optimization 11, no. 4, 67:1-67:23.
  • Martin Kong, Antoniu Pop, Louis-Noël Pouchet, R. Govindarajan, Albert Cohen, P. Sadayappan, 2015, "Compiler/Runtime Framework for Dynamic Dataflow Parallelization of Tiled Programs." ACM Transactions on Architecture and Code Optimization 11, no. 4, 61:1-61:30.

2013

  • Park, Eunjung; Cavazos, John; Pouchet, Louis-Noel; Bastoul, Cedric; Cohen, Albert; Sadayappan, P, 2013, "Predictive Modeling in a Polyhedral Optimization Space." International Journal of Parallel Programming 41, no. 5, 704-750.
  • Fauzia, Naznin; Elango, Venmugil; Ravishankar, Mahesh; Ramanujam, J; Rastello, Fabrice; Rountev, Atanas; Pouchet, Louis-Noel; Sadayappan, P, 2013, "Beyond Reuse Distance Analysis: Dynamic Analysis for Characterization of Data Locality Potential." ACM Transactions on Architecture and Code Optimization 10, no. 4, 53.

2012

  • Kevin Stock, Louis-Noël Pouchet, and P. Sadayappan, 2012, "Using machine learning to improve automatic vectorization." ACM Transactions on Architecture and Code Optimization 8, no. 4, 50.
  • Sadayappan, P.; Ramanujam, J., 2012, "Chairs' welcome." ACM SIGPLAN Notices 47, no. 8.

2011

  • Naga Vydyanathan, Ümit V. Çatalyürek, Tahsin M. Kurç, P Sadayappan, and Joel H. Saltz, 2011, "Optimizing latency and throughput of application workflows on clusters." Parallel Computing 37, no. 10-11, 694-712.

2009

  • Vydyanathan, N.; Krishnamoorthy, S.; Sabin, G. M.; Catalyurek, U. V.; Kurc, T.; Sadayappan, P.; Saltz, J. H., 2009, "An integrated approach to locality-conscious processor allocation and scheduling of mixed-parallel applications." IEEE Transactions on Parallel and Distributed Systems 20, no. 8, 1158-1172.

2006

  • Engelmann, C.; Scott, S.L.; Bernholdt, D.E.; Gottumukkala, N.R. et al., 2006, "MOLAR: Adaptive runtime support for high-end computing operating and runtime systems." Operating Systems Review (ACM) 40, no. 2, 63-72.

2004

  • Tseng, Y.C.; Lai, T.H.; Sadayappan, P.; Lin, Y.B., 2004, "Journal of information science and engineering: Foreword." Journal of Information Science and Engineering 20, no. 3.
  • Krishnamoorthy, S.; Baumgartner, G.; Cociorva, D.; Lam, C.C. et al., 2004, "Efficient parallel out-of-core matrix transposition." International Journal of High Performance Computing and Networking 2, no. 2-4, 110-119.
  • Srinivasan, S.; Krishnamoorthy, S.; Sadayappan, P., 2004, "Robust scheduling of moldable parallel jobs." International Journal of High Performance Computing and Networking 2, no. 2-4, 120-132.

1997

  • S. K. S. Gupta, C.-H. Huang, P. Sadayappan, and R. W. Johnson, 1997, "A Technique for Overlapping Computation and Communication for Block Recursive Algorithms." Concurrency: Practice and Experience 9, no. 12.
  • C. Lam, C.-H. Huang and P. Sadayappan, 1997, "Optimal Algorithms for All-to-all Complete Exchange on Rings and Tori." Journal of Parallel and Distributed Computing 43, no. 1, 1-13.
  • C. Lam, P. Sadayappan and R. Wenger, 1997, "On Optimizing a Class of Multi-Dimensional Loops with Reductions for Parallel Execution." Parallel Processing Letters 7, no. 2, 157-168.

1996

  • Y. S. Choi-Grogan, K. Eswar, P. Sadayappan, and R. Lee, 1996, "Sequential and Parallel Implementations of a Partitioning Finite Element Method." IEEE Transactions on Antennas and Propagation 44, no. 12, 1609-1616.

1995

  • Kumar, B.; Huang, C.H.; Sadayappan, P.; Johnson, R.W., 1995, "A Tensor Product Formulation of Strassen’s Matrix Multiplication Algorithm with Memory Reduction." Scientific Programming 4, no. 4, 275-289.

1994

  • Gupta, S.K.S.; Huang, C.H.; Sadayappan, P.; Johnson, R.W., 1994, "Implementing fast Fourier transforms on distributed-memory multiprocessors using data redistributions." Parallel Processing Letters 4, no. 4, 477-488.

1993

  • Sharma, S.; Huang, C.H.; Sadayappan, P., 1993, "On Data Dependence Analysis for Compiling Programs on Distributed-Memory Machines (Extended Abstract)." ACM SIGPLAN Notices 28, no. 1, 13-16.

1992

  • Ramanujam, J.; Sadayappan, P., 1992, "Iteration Space Tiling for Distributed Memory Machines." Advances in Parallel Computing 3, no. C, 255-270.

1987

  • Sadayappan, P.; Ercal, F., 1987, "Nearest-Neighbor Mapping of Finite Element Graphs onto Processor Meshes." IEEE Transactions on Computers C-36, no. 12, 1408-1424.

1980

  • Bhatt, D.; Sadayappan, P.; Kieburtz, R.B.; Smith, D.R., 1980, "OPERATING SYSTEM KERNEL FOR A HIERARCHICAL MULTICOMPUTER." Proceedings - IEEE Computer Society International Conference, 665-672.

Presentations

  • "Domain-Specific Compiler Optimization for High-Performance Computing." 2011, Presented at Indian Institute of Technology, Bombay.
  • "Compiler optimization for high-performance computing." 2011, Presented at Indian Institute of Technology, Madras.
  • "Software Challenges for High Performance Computing." 2011, Presented at Indian Institute of Science, Bengaluru, India.
  • "Domain-Specific Frameworks for High-Performance Computing." 2011, Presented at Indian Association for the Cultivation of Science.
  • "Pattern-Based Compiler Optimization for Performance Portability." 2012, Presented at CNRS, Lyon, France.
  • "Domain-specific abstractions for performance portability." 2012, Presented at Imperial College, London.
  • "Domain-specific abstractions for performance portability." 2012, Presented at University of Illinois at Urbana-Champaign.
  • "Compiler Optimization for Heterogeneous Computing (Keynote)." 2011, Presented at Workshop on Characterizing Applications for Heterogeneous Exascale Systems (CACHES 2011).
  • "Tiling: Progress and Challenges." 2014, Presented at SIAM Parallel Processing Conference: MiniSymposium on Tiling.
  • "Distributed Contraction of Tensors." 2014, Presented at SIAM Parallel Processing: Workshop on Parallel Quantum Chemistry.
  • "Domain-Specific Abstractions for Compiler Optimization." 2014, Presented at Seminar at Stony Brook University.
  • "Challenges in Optimization of Stencil Computations." 2013, Presented at Workshop on Optimizing Stencil Computations (WOSC) 2013, held with OOPSLA/SPLASH 2013.
  • "Future Computational Challenges." 2014, Presented at Software Innovation Institute for Computational Chemistry and Materials Modeling.
  • "Domain-specific abstractions for compiler optimization." 2014, Presented at Seminar at University of Utah.
  • "Domain-Specific Abstractions for High-Performance Computing." 2012, Presented at Distinguished Seminar Series, Department of Computer Science, University of Illinois.

Papers in Proceedings

2018

  • Moon, G.E.; Sukumaran-Rajam, A.; Sadayappan, P. "Parallel LDA with Over-Decomposition." (2 2018).
  • Hong, C.; Sukumaran-Rajam, A.; Kim, J.; Rawat, P.S. et al. "POSTER: Performance modeling for GPUs using abstract kernel emulation." (2 2018).
  • Hong, C.; Sukumaran-Rajam, A.; Kim, J.; Rawat, P.S. et al. "POSTER: Performance Modeling for GPUs using Abstract Kernel Emulation." (1 2018).
  • Rawat, P.S.; Sukumaran-Rajam, A.; Rountev, A.; Rastello, F. et al. "Register Optimizations for Stencils on GPUs." in 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP). (1 2018).
  • Hong, C.; Sukumaran-Rajam, A.; Kim, J.; Rawat, P.S. et al. "POSTER: Performance Modeling for GPUs using Abstract Kernel Emulation." in 23rd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP). (1 2018).
  • Rawat, P.S.; Sukumaran-Rajam, A.; Rountev, A.; Rastello, F. et al. "Register Optimizations for Stencils on GPUs." (1 2018).
  • Kurt, S.E.; Thumma, V.; Hong, C.; Sukumaran-Rajam, A. et al. "Characterization of Data Movement Requirements for Sparse Matrix Computations on GPUs." (2 2018).
  • Rawat, P.S.; Sukumaran-Rajam, A.; Rountev, A.; Rastello, F. et al. "Register optimizations for stencils on GPUs." (2 2018).

2017

  • Kunchum, R.; Chaudhry, A.; Sukumaran-Rajam, A.; Niu, Q. et al. "On improving performance of sparse matrix-matrix multiplication on GPUs." (6 2017).
  • Israt Nisa, Aravind Sukumaran-Rajam, Rakshith Kunchum, P Sadayappan "Parallel CCD++ on GPU for Matrix Factorization." (2 2017).
  • Kong, M.; Pouchet, L.N.; Sadayappan, P.; Sarkar, V. "PIPES: A Language and Compiler for Task-Based Programming on Distributed-Memory Clusters." (3 2017).
  • Nisa, I.; Sukumaran-Rajam, A.; Kunchum, R.; Sadayappan, P. "Parallel CCD++ on GPU for matrix factorization." (2 2017).
  • Nisa, I.; Sukumaran-Rajam, A.; Kunchum, R.; Sadayappan, P. et al. "Parallel CCD++ on GPU for Matrix Factorization." (1 2017).
  • Samyam Rajbhandari, Fabrice Rastello, Karol Kowalski, Sriram Krishnamoorthy, P Sadayappan "Optimizing the Four-Index Integral Transform Using Data Movement Lower Bounds Analysis." (1 2017).
  • Tavarageri, S.; Kim, W.; Torrellas, J.; Sadayappan, P. "Compiler Support for Software Cache Coherence." (2 2017).
  • Rajbhandari, S.; Kim, J.; Krishnamoorthy, S.; Pouchet, L.N. et al. "A Domain-Specific Compiler for a Parallel Multiresolution Adaptive Numerical Simulation Environment." (3 2017).
  • Rajbhandari, S.; Rastello, F.; Kowalski, K.; Krishnamoorthy, S. et al. "Optimizing the Four-Index Integral Transform Using Data Movement Lower Bounds Analysis." (8 2017).
  • Rajbhandari, S.; Rastello, F.; Kowalski, K.; Krishnamoorthy, S. et al. "Optimizing the four-index integral transform using data movement lower bounds analysis." (1 2017).
  • Rajbhandari, S.; Rastello, F.; Kowalski, K.; Krishnamoorthy, S. et al. "Optimizing the Four-Index Integral Transform Using Data Movement Lower Bounds Analysis." in 22nd ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP). (8 2017).
  • Hong, C.; Sukumaran-Rajam, A.; Kim, J.; Sadayappan, P. "MultiGraph: Efficient Graph Processing on GPUs." in 26th International Conference on Parallel Architectures and Compilation Techniques (PACT). (1 2017).
  • Hong, C.; Sukumaran-Rajam, A.; Kim, J.; Sadayappan, P. et al. "MultiGraph: Efficient Graph Processing on GPUs." (1 2017).
  • Rawat, P.S.; Sukumaran-Rajam, A.; Rountev, A.; Rastello, F. et al. "POSTER: Statement Reordering to Alleviate Register Pressure for Stencils on GPUs." (1 2017).
  • Rawat, P.S.; Sukumaran-Rajam, A.; Rountev, A.; Rastello, F. et al. "POSTER: Statement Reordering to Alleviate Register Pressure for Stencils on GPUs." (10 2017).
  • Hong, C.; Sukumaran-Rajam, A.; Kim, J.; Sadayappan, P. "MultiGraph: Efficient Graph Processing on GPUs." (10 2017).
  • Nisa, I.; Sukumaran-Rajam, A.; Kunchum, R.; Sadayappan, P. "Parallel CCD++ on GPU for Matrix Factorization." in Workshop on General Purpose GPUs (GPGPU). (1 2017).
  • Kurt, S.E.; Thumma, V.; Hong, C.; Sukumaran-Rajam, A. et al. "Characterization of data movement requirements for sparse matrix computations on GPUs." in IEEE 24th International Conference on High Performance Computing Workshops (HiPCW). (1 2017).
  • Moon, G.E.; Sukumaran-Rajam, A.; Sadayappan, P. "Parallel LDA with Over-Decomposition." (1 2017).
  • Moon, G.E.; Sukumaran-Rajam, A.; Sadayappan, P. "Parallel LDA with Over-Decomposition." in IEEE 24th International Conference on High Performance Computing Workshops (HiPCW). (1 2017).
  • Rawat, P.S.; Sukumaran-Rajam, A.; Rountev, A.; Rastello, F. et al. "POSTER: Statement Reordering to Alleviate Register Pressure for Stencils on GPUs." in 26th International Conference on Parallel Architectures and Compilation Techniques (PACT). (1 2017).
  • Kurt, S.E.; Thumma, V.; Hong, C.; Sukumaran-Rajam, A. et al. "Characterization of data movement requirements for sparse matrix computations on GPUs." (1 2017).

2016

  • Domagala, L.; van Amstel, D.; Rastello, F.; Sadayappan, P. "Register Allocation and Promotion through Combined Instruction Scheduling and Loop Unrolling." in 25th International Conference on Compiler Construction (CC). (1 2016).
  • Rawat, P.S.; Hong, C.; Ravishankar, M.; Grover, V. et al. "Resource Conscious Reuse-Driven Tiling for GPUs." (1 2016).
  • Kim, W.; Tavarageri, S.; Sadayappan, P.; Torrellas, J. "Architecting and Programming a Hardware-Incoherent Multiprocessor Cache Hierarchy." in 30th IEEE International Parallel and Distributed Processing Symposium (IPDPS). (1 2016).
  • Bao, W.; Hong, C.; Chunduri, S.; Krishnamoorthy, S. et al. "Static and Dynamic Frequency Scaling on Multicore CPUs." (12 2016).
  • Domagala, L.; van Amstel, D.; Rastello, F.; Sadayappan, P. et al. "Register Allocation and Promotion through Combined Instruction Scheduling and Loop Unrolling." (1 2016).
  • Wenlei Bao, Sriram Krishnamoorthy, Louis-Noel Pouchet, Fabrice Rastello, P. Sadayappan "PolyCheck: Dynamic Verification of Iteration Space Transformations on Affine Programs." in POPL. (1 2016).
  • Kim, W.; Tavarageri, S.; Sadayappan, P.; Torrellas, J. et al. "Architecting and Programming a Hardware-Incoherent Multiprocessor Cache Hierarchy." (1 2016).
  • Lukasz Domagala, Duco van Amstel, Fabrice Rastello, P. Sadayappan "Register allocation and promotion through combined instruction scheduling and loop unrolling." (3 2016).
  • Rawat, P.S.; Hong, C.; Ravishankar, M.; Grover, V. et al. "Effective Resource Management for Enhancing Performance of 2D and 3D Stencils on GPUs." (1 2016).
  • Prashant Rawat, Changwan Hong, Mahesh Ravishankar, Vinod Grover, Louis-Noel Pouchet, Atanas Rountev, and P. Sadayappan "Resource Conscious Reuse-Driven Tiling for GPUs." in International Conference on Parallel Architectures and Compilation Techniques (PACT'16). (9 2016).
  • Changwan Hong, Wenlei Bao, Albert Cohen, Sriram Krishnamoorthy, Louis-Noël Pouchet, Fabrice Rastello, J Ramanujam, P Sadayappan "Effective padding of multidimensional arrays to avoid cache conflict misses." (6 2016).
  • Timothy Carpenter, Fabrice Rastello, P Sadayappan, Anastasios Sidiropoulos "Brief Announcement: Approximating the I/O Complexity of One-Shot Red-Blue Pebbling." (7 2016).
  • Prashant Singh Rawat, Changwan Hong, Mahesh Ravishankar, Vinod Grover, Louis-Noël Pouchet, Atanas Rountev, P Sadayappan "Resource conscious reuse-driven tiling for GPUs." (9 2016).
  • Rawat, P.S.; Hong, C.; Ravishankar, M.; Grover, V. et al. "Resource Conscious Reuse-Driven Tiling for GPUs." in International Conference on Parallel Architectures and Compilation (PACT). (1 2016).
  • Hong, C.; Bao, W.; Cohen, A.; Krishnamoorthy, S. et al. "Effective Padding of Multidimensional Arrays to Avoid Cache Conflict Misses." (6 2016).
  • Rawat, P.S.; Hong, C.; Ravishankar, M.; Grover, V. et al. "Effective Resource Management for Enhancing Performance of 2D and 3D Stencils on GPUs." in 9th Annual Workshop on General Purpose Processing using Graphics Processing Unit (GPGPU). (1 2016).
  • Bao, W.; Krishnamoorthy, S.; Pouchet, L-N.; Sadayappan, P. "PolyCheck: Dynamic Verification of Iteration Space Transformations on Affine Programs." in 43rd Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL). (1 2016).
  • Bao, W.; Krishnamoorthy, S.; Pouchet, L-N.; Sadayappan, P. "PolyCheck: Dynamic Verification of Iteration Space Transformations on Affine Programs." (1 2016).
  • Kettimuthu, R.; Agrawal, G.; Sadayappan, P.; Foster, I. "Differentiated Scheduling of Response-Critical and Best-Effort Wide-Area Data Transfers." in 30th IEEE International Parallel and Distributed Processing Symposium (IPDPS). (1 2016).
  • Kettimuthu, R.; Agrawal, G.; Sadayappan, P.; Foster, I. et al. "Differentiated Scheduling of Response-Critical and Best-Effort Wide-Area Data Transfers." (1 2016).
  • Kettimuthu, R.; Agrawal, G.; Sadayappan, P.; Foster, I. "Differentiated Scheduling of Response-Critical and Best-Effort Wide-Area Data Transfers." (7 2016).
  • Rajbhandari, S.; Kim, J.; Krishnamoorthy, S.; Pouchet, L-N. et al. "On Fusing Recursive Traversals of K-d Trees." in 25th International Conference on Compiler Construction (CC). (1 2016).
  • Bao, W.; Krishnamoorthy, S.; Pouchet, L.N.; Rastello, F. et al. "PolyCheck: Dynamic verification of iteration space transformations on affine programs." (1 2016).
  • Carpenter, T.; Rastello, F.; Sadayappan, P.; Sidiropoulos, A. "Brief announcement: Approximating the I/O complexity of one-shot red-blue pebbling." (7 2016).
  • Bao, W.; Krishnamoorthy, S.; Pouchet, L.N.; Rastello, F. et al. "PolyCheck: Dynamic verification of iteration space transformations on affine programs." (4 2016).
  • Rawat, P.S.; Hong, C.; Ravishankar, M.; Grover, V. et al. "Effective resource management for enhancing performance of 2D and 3D stencils on GPUs." (3 2016).
  • Rajbhandari, S.; Kim, J.; Krishnamoorthy, S.; Pouchet, L.N. et al. "On fusing recursive traversals of K-d trees." (3 2016).
  • Hong, C.; Bao, W.; Cohen, A.; Krishnamoorthy, S. et al. "Effective padding of multidimensional arrays to avoid cache conflict misses." (6 2016).
  • Rawat, P.S.; Hong, C.; Ravishankar, M.; Grover, V. et al. "Resource Conscious Reuse-Driven Tiling for GPUs." (9 2016).
  • Bao, W.; Hong, C.; Chunduri, S.; Krishnamoorthy, S. et al. "Static and dynamic frequency scaling on multicore CPUs." (12 2016).
  • Carpenter, T.; Rastello, F.; Sadayappan, P.; Sidiropoulos, A. "Approximating the I/O complexity of one-shot red-blue pebbling (brief announcement)." in ACM Symposium on Parallelism in Algorithms and Architectures. http://doi.acm.org/10.1145/2935764.2935807 (7 2016).
  • Wenlei Bao, Sriram Krishnamoorthy, Louis-Noël Pouchet, Fabrice Rastello, P. Sadayappan "PolyCheck: dynamic verification of iteration space transformations on affine programs." (1 2016).
  • Samyam Rajbhandari, Jinsung Kim, Sriram Krishnamoorthy, Louis-Noel Pouchet, Fabrice Rastello, Robert J Harrison, P Sadayappan "A domain-specific compiler for a parallel multiresolution adaptive numerical simulation environment." (11 2016).
  • Samyam Rajbhandari, Jinsung Kim, Sriram Krishnamoorthy, Louis-Noël Pouchet, Fabrice Rastello, Robert J. Harrison, P. Sadayappan "On fusing recursive traversals of K-d trees." (3 2016).
  • Sanket Tavarageri, Wooil Kim, Josep Torrellas, P Sadayappan "Compiler Support for Software Cache Coherence." (12 2016).
  • Kim, W.; Tavarageri, S.; Sadayappan, P.; Torrellas, J. "Architecting and Programming a Hardware-Incoherent Multiprocessor Cache Hierarchy." (7 2016).
  • Kong, M.; Pouchet, L-N.; Sadayappan, P.; Sarkar, V. et al. "PIPES: A Language and Compiler for Task-based Programming on Distributed-Memory Clusters." (1 2016).
  • Rajbhandari, S.; Kim, J.; Krishnamoorthy, S.; Pouchet, L-N. et al. "A Domain-Specific Compiler for a Parallel Multiresolution Adaptive Numerical Simulation Environment." (1 2016).
  • Rajbhandari, S.; Kim, J.; Krishnamoorthy, S.; Pouchet, L-N. et al. "A Domain-Specific Compiler for a Parallel Multiresolution Adaptive Numerical Simulation Environment." in International Conference on High Performance Computing, Networking, Storage and Analysis (SC). (1 2016).
  • Kong, M.; Pouchet, L-N.; Sadayappan, P.; Sarkar, V. "PIPES: A Language and Compiler for Task-based Programming on Distributed-Memory Clusters." in International Conference on High Performance Computing, Networking, Storage and Analysis (SC). (1 2016).
  • Tavarageri, S.; Kim, W.; Torrellas, J.; Sadayappan, P. et al. "Compiler Support for Software Cache Coherence." (1 2016).
  • Tavarageri, S.; Kim, W.; Torrellas, J.; Sadayappan, P. "Compiler Support for Software Cache Coherence." in 23rd IEEE International Conference on High Performance Computing (HiPC). (1 2016).
  • Wenlei Bao, Changwan Hong, Sudheer Chunduri, Sriram Krishnamoorthy, Louis-Noël Pouchet, Fabrice Rastello, P Sadayappan "Static and dynamic frequency scaling on multicore CPUs." (12 2016).
  • Domagala, L.; Van Amstel, D.; Rastello, F.; Sadayappan, P. "Register allocation and promotion through combined instruction scheduling and loop unrolling." (3 2016).
  • Rajbhandari, S.; Kim, J.; Krishnamoorthy, S.; Pouchet, L-N. et al. "On Fusing Recursive Traversals of K-d Trees." (1 2016).
  • Martin Kong, Louis-Noël Pouchet, P Sadayappan, Vivek Sarkar "PIPES: a language and compiler for task-based programming on distributed-memory clusters." (11 2016).

2015

  • Arash Ashari, Shirish Tatikonda, Matthias Boehm, Berthold Reinwald, Keith Campbell, John Keenleyside, P. Sadayappan "On optimizing machine learning workloads via kernel fusion." (2 2015).
  • Sedaghati, N.; Ashari, A.; Pouchet, L-N.; Parthasarathy, S. et al. "Characterizing Dataset Dependence for Sparse Matrix-Vector Multiplication on GPUs." (1 2015).
  • Sedaghati, N.; Mu, T.; Pouchet, L.N.; Parthasarathy, S. et al. "Automatic selection of sparse matrix representation on GPUs." (6 2015).
  • Venmugil Elango, Fabrice Rastello, Louis-Noël Pouchet, J. Ramanujam, P. Sadayappan "On Characterizing the Data Access Complexity of Programs." (1 2015).
  • Naser Sedaghati, Te Mu, Louis-Noël Pouchet, Srinivasan Parthasarathy, P. Sadayappan "Automatic Selection of Sparse Matrix Representation on GPUs." (6 2015).
  • Ravishankar, M.; Dathathri, R.; Elango, V.; Pouchet, L-N. et al. "Distributed Memory Code Generation for Mixed Irregular/Regular Computations." (8 2015).
  • Naser Sedaghati, Arash Ashari, Louis-Noel Pouchet, Srinivasan Parthasarathy and P. Sadayappan "Characterizing Dataset Dependence for Sparse Matrix-Vector Multiplication on GPUs." in 2nd Workshop on Parallel Programming for Analytics Applications (PPAA’15), in conjunction with PPoPP 2015. (2 2015).
  • Naznin Fauzia, Louis-Noel Pouchet and P. Sadayappan "Characterizing and Enhancing Global Memory Data Coalescing on GPU." in CGO'15. (2 2015).
  • Kong, M.; Pouchet, L-N.; Sadayappan, P. "A Roofline-based Performance Estimator for Distributed Matrix-multiply on Intel CnC." in 29th IEEE International Parallel and Distributed Processing Symposium (IPDPS). (1 2015).
  • Tobias Grosser, J. Ramanujam, Louis-Noël Pouchet, P. Sadayappan, Sebastian Pop "Optimistic Delinearization of Parametrically Sized Arrays." (6 2015).
  • Fauzia, N.; Pouchet, L-N.; Sadayappan, P. "Characterizing and Enhancing Global Memory Data Coalescing on GPUs." in Proceedings of the IEEE/ACM International Symposium on Code Generation and Optimization (CGO). (1 2015).
  • Kettimuthu, R.; Vardoyan, G.; Agrawal, G.; Sadayappan, P. et al. "An Elegant Sufficiency: Load-Aware Differentiated Scheduling of Data Transfers." in International Conference for High Performance Computing, Networking, Storage and Analysis (SC). (1 2015).
  • Ali, Syed A.; Kollu, Gautham; Mazumder, Sandip; Sadayappan, P. "PREDICTION OF NON-EQUILIBRIUM HEAT CONDUCTION USING PARALLEL COMPUTATION OF THE PHONON BOLTZMANN TRANSPORT EQUATION." in ASME International Mechanical Engineering Congress and Exposition (IMECE). (1 2015).
  • Ashari, A.; Tatikonda, S.; Boehm, M.; Reinwald, B. et al. "On optimizing machine learning workloads via kernel fusion." (1 2015).
  • Naznin Fauzia, Louis-Noël Pouchet, P. Sadayappan "Characterizing and enhancing global memory data coalescing on GPUs." (2 2015).
  • Fauzia, N.; Pouchet, L-N.; Sadayappan, P. "Characterizing and Enhancing Global Memory Data Coalescing on GPUs." (1 2015).
  • Arafat, H.; Krishnamoorthy, S.; Sadayappan, P. "Checksumming Strategies for Data in Volatile Memories." (1 2015).
  • Sedaghati, N.; Ashari, A.; Pouchet, L-N.; Parthasarathy, S. et al. "Characterizing Dataset Dependence for Sparse Matrix-Vector Multiplication on GPUs." in 2nd Workshop on Parallel Programming for Analytics Applications (PPAA). (1 2015).
  • Fauzia, N.; Pouchet, L.N.; Sadayappan, P. "Characterizing and enhancing global memory data coalescing on GPUs." (1 2015).
  • Kong, M.; Pouchet, L.N.; Sadayappan, P. "A Roofline-Based Performance Estimator for Distributed Matrix-Multiply on Intel CnC." (9 2015).
  • Rawat, P.; Kong, M.; Henretty, T.; Holewinski, J. et al. "SDSLc: A multi-target domain-specific compiler for stencil computations." (11 2015).
  • Grosser, T.; Ramanujam, J.; Pouchet, L.N.; Sadayappan, P. et al. "Optimistic delinearization of parametrically sized arrays." (6 2015).
  • Ravishankar, M.; Dathathri, R.; Elango, V.; Pouchet, L.N. et al. "Distributed memory code generation for mixed irregular/regular computations." (1 2015).
  • Bao, W.; Tavarageri, S.; Ozguner, F.; Sadayappan, P. "PWCET: Power-Aware Worst Case Execution Time Analysis." (1 2015).
  • Ali, S.A.; Kollu, G.; Mazumder, S.; Sadayappan, P. et al. "PREDICTION OF NON-EQUILIBRIUM HEAT CONDUCTION USING PARALLEL COMPUTATION OF THE PHONON BOLTZMANN TRANSPORT EQUATION." (1 2015).
  • Ashari, A.; Tatikonda, S.; Boehm, M.; Reinwald, B. et al. "On Optimizing Machine Learning Workloads via Kernel Fusion." (8 2015).
  • Kettimuthu, R.; Vardoyan, G.; Agrawal, G.; Sadayappan, P. et al. "An elegant sufficiency: Load-aware differentiated scheduling of data transfers." (11 2015).
  • Kettimuthu, R.; Vardoyan, G.; Agrawal, G.; Sadayappan, P. et al. "An Elegant Sufficiency: Load-Aware Differentiated Scheduling of Data Transfers." (1 2015).
  • Ali, S.A.; Kollu, G.; Mazumder, S.; Sadayappan, P. "PREDICTION OF NON-EQUILIBRIUM HEAT CONDUCTION USING PARALLEL COMPUTATION OF THE PHONON BOLTZMANN TRANSPORT EQUATION." in ASME International Mechanical Engineering Congress and Exposition (IMECE2014). (1 2015).
  • Sedaghati, N.; Ashari, A.; Pouchet, L.N.; Parthasarathy, S. et al. "Characterizing dataset dependence for sparse matrix-vector multiplication on GPUs." (1 2015).
  • Prashant Rawat, Martin Kong, Tom Henretty, Justin Holewinski, Kevin Stock, Louis-Noel Pouchet, J. Ramanujam, Atanas Rountev, and P. Sadayappan "SDSLc: a Multi-Target Domain-Specific Compiler for Stencil Computations." in International Workshop on Domain-Specific Languages and High-Level Frameworks for High Performance Computing (WOLFHPC'15). (11 2015).
  • Rajkumar Kettimuthu, Gayane Vardoyan, Gagan Agrawal, P. Sadayappan, Ian T. Foster "An elegant sufficiency: load-aware differentiated scheduling of data transfers." (11 2015).
  • Mahesh Ravishankar, Roshan Dathathri, Venmugil Elango, Louis-Noël Pouchet, J. Ramanujam, Atanas Rountev, P. Sadayappan "Distributed memory code generation for mixed Irregular/Regular computations." (2 2015).
  • Faisal, S.M.; Parthasarathy, S.; Sadayappan, P. "Global graphs: A middleware for large scale graph processing." (1 2015).
  • Kong, M.; Pouchet, L-N.; Sadayappan, P. "A Roofline-based Performance Estimator for Distributed Matrix-multiply on Intel CnC." (1 2015).

2014

  • Niu, Q.; Lai, P.W.; Faisal, S.M.; Parthasarathy, S. et al. "A fast implementation of MLR-MCL algorithm on multi-core processors." (1 2014).
  • Rajbhandari, S.; Nikam, A.; Lai, P.W.; Stock, K. et al. "CAST: Contraction algorithm for symmetric tensors." (1 2014).
  • Ashari, A.; Sedaghati, N.; Eisenlohr, J.; Sadayappan, P. "An efficient two-dimensional blocking strategy for sparse matrix-vector multiplication on GPUs." (1 2014).
  • Tavarageri, S.; Krishnamoorthy, S.; Sadayappan, P. "Compiler-assisted detection of transient memory errors." (1 2014).
  • Konstantinidis, A.; Kelly, P.H.J.; Ramanujam, J.; Sadayappan, P. "Parametric GPU code generation for affine loop programs." (1 2014).
  • Stock, K.; Kong, M.; Grosser, T.; Pouchet, L-N. et al. "A Framework for Enhancing Data Reuse via Associative Reordering." in 35th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI). (6 2014).
  • Elango, Venmugil; Sedaghati, Naser; Rastello, Fabrice; Pouchet, Louis-Noel; Ramanujam, J; Teodorescu, Radu; Sadayappan, P "On Using the Roofline Model with Lower Bounds on Data Movement." in HiPEAC. (12 2014).
  • Niu, Q.; Lai, P-W.; Faisal, S.M.; Parthasarathy, S. et al. "A Fast Implementation of MLR-MCL Algorithm on Multi-core Processors." (1 2014).
  • Sanket Tavarageri, Sriram Krishnamoorthy, P. Sadayappan "Compiler-assisted detection of transient memory errors." in PLDI 2014. (6 2014).
  • Rajbhandari, S.; Nikam, A.; Lai, P.W.; Stock, K. et al. "A Communication-Optimal Framework for Contracting Distributed Tensors." (1 2014).
  • Kamil, S.; Amarasinghe, S.; Sadayappan, P. "WOSC 2014: Second Workshop on Optimizing Stencil Computations." (1 2014).
  • Ashari, A.; Sedaghati, N.; Eisenlohr, J.; Parthasarathy, S. et al. "Fast Sparse Matrix-Vector Multiplication on GPUs for Graph Applications." (1 2014).
  • Ashari, A.; Sedaghati, N.; Eisenlohr, J.; Sadayappan, P. "An Efficient Two-Dimensional Blocking Strategy for Sparse Matrix-Vector Multiplication on GPUs." in 28th ACM International Conference on Supercomputing (ICS). (1 2014).
  • Kong, M.; Pop, A.; Pouchet, L-N.; Govindarajan, R. et al. "Compiler/Runtime Framework for Dynamic Dataflow Parallelization of Tiled Programs." (12 2014).
  • Tobias Grosser, Albert Cohen, Justin Holewinski, P. Sadayappan, Sven Verdoolaege "Hybrid Hexagonal/Classical Tiling for GPUs." in CGO 2014. (2 2014).
  • Elango, V.; Sedaghati, N.; Rastello, F.; Pouchet, L-N. et al. "On Using the Roofline Model with Lower Bounds on Data Movement." (12 2014).
  • Grosser, T.; Cohen, A.; Holewinski, J.; Sadayappan, P. et al. "Hybrid hexagonal/classical tiling for GPUs." (1 2014).
  • Stock, K.; Kong, M.; Grosser, T.; Pouchet, L.N. et al. "A framework for enhancing data reuse via associative reordering." (1 2014).
  • Kettimuthu, R.; Vardoyan, G.; Agrawal, G.; Sadayappan, P. "Modeling and Optimizing Large-Scale Wide-Area Data Transfers." (1 2014).
  • Kettimuthu, R.; Vardoyan, G.; Agrawal, G.; Sadayappan, P. "Modeling and optimizing large-scale wide-area data transfers." (1 2014).
  • Ali, S.A.; Kollu, G.; Mazumder, S.; Sadayappan, P. "Prediction of non-equilibrium heat conduction using parallel computation of the phonon Boltzmann Transport Equation." (1 2014).
  • Kettimuthu, R.; Vardoyan, G.; Agrawal, G.; Sadayappan, P. "Modeling and Optimizing Large-Scale Wide-Area Data Transfers." in 14th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing (CCGrid). (1 2014).
  • Rajbhandari, S.; Nikam, A.; Lai, P-W.; Stock, K. et al. "A Communication-Optimal Framework for Contracting Distributed Tensors." in International Conference on High Performance Computing, Networking, Storage and Analysis. (1 2014).
  • Rajbhandari, S.; Nikam, A.; Lai, P-W.; Stock, K. et al. "CAST: Contraction Algorithm for Symmetric Tensors." in 43rd Annual International Conference on Parallel Processing (ICPP). (1 2014).
  • Tavarageri, S.; Krishnamoorthy, S.; Sadayappan, P. "Compiler-Assisted Detection of Transient Memory Errors." in 35th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI). (6 2014).
  • Stock, K.; Kong, M.; Grosser, T.; Pouchet, L.N. et al. "A Framework for Enhancing Data Reuse via Associative Reordering." (1 2014).
  • Venmugil Elango, Fabrice Rastello, Louis-Noël Pouchet, J. Ramanujam, P. Sadayappan "On Characterizing the Data Movement Complexity of Computational DAGs for Parallel Execution." in SPAA 2014. (6 2014).
  • Kong, M.; Pop, A.; Pouchet, L.N.; Govindarajan, R. et al. "Compiler/runtime framework for dynamic dataflow parallelization of tiled programs." (1 2014).
  • Elango, V.; Sedaghati, N.; Rastello, F.; Pouchet, L.N. et al. "On using the roofline model with lower bounds on data movement." (1 2014).
  • Rajbhandari, S.; Nikam, A.; Lai, P-W.; Stock, K. et al. "A Communication-Optimal Framework for Contracting Distributed Tensors." (1 2014).
  • Ashari, A.; Sedaghati, N.; Eisenlohr, J.; Parthasarathy, S. et al. "Fast Sparse Matrix-Vector Multiplication on GPUs for Graph Applications." in International Conference on High Performance Computing, Networking, Storage and Analysis. (1 2014).
  • Faisal, S.M.; Parthasarathy, S.; Sadayappan, P. "Global Graphs: A Middleware For Large Scale Graph Processing." in IEEE International Conference on Big Data. (1 2014).
  • Kevin Stock, Martin Kong, Tobias Grosser, Louis-Noël Pouchet, Fabrice Rastello, J. Ramanujam, P. Sadayappan "A framework for enhancing data reuse via associative reordering." in PLDI 2014. (6 2014).
  • Niu, Q.; Lai, P-W.; Faisal, S.M.; Parthasarathy, S. et al. "A Fast Implementation of MLR-MCL Algorithm on Multi-core Processors." in 21st International Conference on High Performance Computing (HiPC). (1 2014).
  • Konstantinidis, A.; Kelly, P.H.J.; Ramanujam, J.; Sadayappan, P. "Parametric GPU Code Generation for Affine Loop Programs." in 26th International Workshop on Languages and Compilers for Parallel Computing (LCPC). (1 2014).
  • Samyam Rajbhandari, Akshay Nikam, Pai-Wei Lai, Kevin Stock, Sriram Krishnamoorthy, P. Sadayappan "A Communication-Optimal Framework for Contracting Distributed Tensors." (11 2014).
  • Faisal, S.M.; Parthasarathy, S.; Sadayappan, P. "Global Graphs: A Middleware For Large Scale Graph Processing." (1 2014).
  • Rajbhandari, S.; Nikam, A.; Lai, P-W.; Stock, K. et al. "CAST: Contraction Algorithm for Symmetric Tensors." (1 2014).
  • Konstantinidis, A.; Kelly, P.H.J.; Ramanujam, J.; Sadayappan, P. "Parametric GPU Code Generation for Affine Loop Programs." (1 2014).
  • Arash Ashari, Naser Sedaghati, John Eisenlohr, Srinivasan Parthasarathy, P. Sadayappan "Fast Sparse Matrix-Vector Multiplication on GPUs for Graph Applications." (11 2014).
  • Niu, Q.; Lai, P.W.; Faisal, S.M.; Parthasarathy, S. et al. "A fast implementation of MLR-MCL algorithm on multi-core processors." (1 2014).
  • Ashari, A.; Sedaghati, N.; Eisenlohr, J.; Sadayappan, P. "An Efficient Two-Dimensional Blocking Strategy for Sparse Matrix-Vector Multiplication on GPUs." (1 2014).
  • Tavarageri, S.; Krishnamoorthy, S.; Sadayappan, P. "Compiler-Assisted Detection of Transient Memory Errors." (6 2014).
  • Kong,Martin; Pop,Antoniu; Pouchet,Louis-Noel; Govindarajan,R; Cohen,Albert; Sadayappan,P "Compiler/Runtime Framework for Dynamic Dataflow Parallelization of Tiled Programs." in HiPEAC. (12 2014).
  • Rajbhandari, S.; Stock, K.; Sadayappan, P. "Towards performance-portable generation of tensor kernels for computational chemistry." (8 2014).
  • Stock, K.; Kong, M.; Grosser, T.; Pouchet, L-N. et al. "A Framework for Enhancing Data Reuse via Associative Reordering." (6 2014).
  • Rajbhandari, S.; Stock, K.; Sadayappan, P. "Towards performance-portable generation of tensor kernels for computational chemistry." in 248th National Meeting of the American-Chemical-Society (ACS). (8 2014).

2013

  • Louis-Noel Pouchet, Peng Zhang, P. Sadayappan, and Jason Cong "Polyhedral-based data reuse optimization for configurable computing." (1 2013).
  • Wang, Y.; Parthasarathy, S.; Sadayappan, P. "Stratification Driven Placement of Complex Data: A Framework for Distributed Data Analytics." in 29th IEEE International Conference on Data Engineering (ICDE). (1 2013).
  • Grosser, T.; Cohen, A.; Kelly, P.H.J.; Ramanujam, J. et al. "Split tiling for GPUs: Automatic parallelization using trapezoidal tiles." (4 2013).
  • Thomas Henretty, Richard Veras, Franz Franchetti, Louis-Noël Pouchet, J. Ramanujam, P. Sadayappan "A stencil compiler for short-vector SIMD architectures." in ICS 2013. (6 2013).
  • Henretty, T.; Veras, R.; Franchetti, F.; Pouchet, L.N. et al. "A stencil compiler for short-vector SIMD architectures." (7 2013).
  • Lai, P-W.; Arafat, H.; Elango, V.; Sadayappan, P. et al. "Accelerating Strassen-Winograd's Matrix Multiplication Algorithm on GPUs." (1 2013).
  • Kong, M.; Veras, R.; Stock, K.; Franchetti, F. et al. "When polyhedral transformations meet SIMD code generation." (9 2013).
  • Kong, M.; Veras, R.; Stock, K.; Franchetti, F. et al. "When polyhedral transformations meet SIMD code generation." (6 2013).
  • Park, E.; Cavazos, J.; Pouchet, L.N.; Bastoul, C. et al. "Predictive Modeling in a Polyhedral Optimization Space." (10 2013).
  • Pouchet, L.N.; Zhang, P.; Sadayappan, P.; Cong, J. "Polyhedral-based data reuse optimization for configurable computing." (3 2013).
  • Lai, P-W.; Stock, K.; Rajbhandari, S.; Krishnamoorthy, S. et al. "A Framework for Load Balancing of Tensor Contraction Expressions via Dynamic Task Partitioning." in International Conference for High Performance Computing, Networking, Storage and Analysis (SC). (1 2013).
  • Ye Wang, Srinivasan Parthasarathy, P. Sadayappan "Stratification driven placement of complex data: A framework for distributed data analytics." in ICDE 2013. (6 2013).
  • Pai-Wei Lai, Humayun Arafat, Venmugil Elango, Ponnuswamy Sadayappan "Accelerating Strassen-Winograd's matrix multiplication algorithm on GPUs." in HiPC 2013. (12 2013).
  • Park, E.; Cavazos, J.; Pouchet, L-N.; Bastoul, C. et al. "Predictive Modeling in a Polyhedral Optimization Space." (10 2013).
  • Wang, Y.; Parthasarathy, S.; Sadayappan, P. "Stratification Driven Placement of Complex Data: A Framework for Distributed Data Analytics." (1 2013).
  • Kong,Martin; Veras,Richard; Stock,Kevin; Franchetti,Franz; Pouchet,Louis-Noel; Sadayappan,P "When Polyhedral Transformations Meet SIMD Code Generation." in 34th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI). (6 2013).
  • Lai, P.W.; Arafat, H.; Elango, V.; Sadayappan, P. "Accelerating Strassen-Winograd's matrix multiplication algorithm on GPUs." (1 2013).
  • Park,Eunjung; Cavazos,John; Pouchet,Louis-Noel; Bastoul,Cedric; Cohen,Albert; Sadayappan,P "Predictive Modeling in a Polyhedral Optimization Space." (10 2013).
  • Kong, M.; Veras, R.; Stock, K.; Franchetti, F. et al. "When Polyhedral Transformations Meet SIMD Code Generation." (6 2013).
  • Lai, P-W.; Stock, K.; Rajbhandari, S.; Krishnamoorthy, S. et al. "A Framework for Load Balancing of Tensor Contraction Expressions via Dynamic Task Partitioning." (1 2013).
  • Kong, M.; Veras, R.; Stock, K.; Franchetti, F. et al. "When Polyhedral Transformations Meet SIMD Code Generation." in 34th ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI). (6 2013).
  • Lai, P-W.; Arafat, H.; Elango, V.; Sadayappan, P. "Accelerating Strassen-Winograd's Matrix Multiplication Algorithm on GPUs." in 20th International Conference on High Performance Computing (HiPC). (1 2013).
  • Wang, Y.; Parthasarathy, S.; Sadayappan, P. "Stratification driven placement of complex data: A framework for distributed data analytics." (8 2013).
  • Tobias Grosser, Albert Cohen, Paul H. J. Kelly, J. Ramanujam, P. Sadayappan, and Sven Verdoolaege "Split tiling for GPUs: automatic parallelization using trapezoidal tiles." (3 2013).
  • Martin Kong, Richard Veras, Kevin Stock, Franz Franchetti, Louis-Noël Pouchet, P. Sadayappan "When polyhedral transformations meet SIMD code generation." in PLDI 2013. (6 2013).
  • Pai-Wei Lai, Kevin Stock, Samyam Rajbhandari, Sriram Krishnamoorthy, P. Sadayappan "A framework for load balancing of tensor contraction expressions via dynamic task partitioning." in Supercomputing. (11 2013).
  • Tavarageri, S.; Sadayappan, P. "A compiler analysis to determine useful cache size for energy efficiency." (1 2013).
  • Lai, P.W.; Stock, K.; Rajbhandari, S.; Krishnamoorthy, S. et al. "A framework for load balancing of tensor contraction expressions via dynamic task partitioning." (1 2013).

2012

  • Clemons, T.; Parthasarathy, S.; Sadayappan, P. "GADBMS: A framework for scalable array analytics." in 25th ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis (SC). (1 2012).
  • Holewinski, J.; Ramamurthi, R.; Ravishankar, M.; Fauzia, N. et al. "Dynamic Trace-Based Analysis of Vectorization Potential of Applications." (6 2012).
  • Shirako, J.; Sharma, K.; Fauzia, N.; Pouchet, L.N. et al. "Analytical bounds for optimal tile size selection." (4 2012).
  • Shirako, Jun, Kamal Sharma, Naznin Fauzia, Louis-Noël Pouchet, J. Ramanujam, P. Sadayappan, and Vivek Sarkar "Analytical bounds for optimal tile size selection." in Compiler Construction. (6 2012).
  • Stock, K.; Pouchet, L.N.; Sadayappan, P. "Using machine learning to improve automatic vectorization." (1 2012).
  • Niu, Q.; Dinan, J.; Lu, Q.; Sadayappan, P. "PARDA: A Fast Parallel Reuse Distance Analysis Algorithm." in 26th IEEE International Parallel and Distributed Processing Symposium (IPDPS) / Workshop on High Performance Data Intensive Computing. (1 2012).
  • Justin Holewinski, Ragavendar Ramamurthi, Mahesh Ravishankar, Naznin Fauzia, Louis-Noel Pouchet, Atanas Rountev, and P. Sadayappan "Dynamic Trace-Based Analysis of Vectorization Potential of Applications." (6 2012).
  • Ravishankar, M.; Eisenlohr, J.; Pouchet, L-N.; Ramanujam, J. et al. "Code Generation for Parallel Execution of a Class of Irregular Loops on Distributed Memory Systems." in 25th ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis (SC). (1 2012).
  • Justin Holewinski, Ragavendar Ramamurthi, Mahesh Ravishankar, Naznin Fauzia, Louis-Noel Pouchet, Atanas Rountev, and P. Sadayappan "Dynamic Trace-Based Analysis of Vectorization Potential of Applications." in ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI'12). (6 2012).
  • Pai-Wei Lai, Huaijian Zhang, Samyam Rajbhandari, Edward Valeev, Karol Kowalski, P. Sadayappan "Effective Utilization of Tensor Symmetry in Operation Optimization of Tensor Contraction Expressions." (6 2012).
  • Clemons, T.; Parthasarathy, S.; Sadayappan, P. "GADBMS: A framework for scalable array analytics." (12 2012).
  • Holewinski, J.; Pouchet, L.N.; Sadayappan, P. "High-performance code generation for stencil computations on GPU architectures." (7 2012).
  • Qingpeng Niu, James Dinan, Qingda Lu, and P. Sadayappan "PARDA: A Fast Parallel Reuse Distance Analysis Algorithm." (5 2012).
  • Lai, P-W.; Zhang, H.; Rajbhandari, S.; Valeev, E. et al. "Effective Utilization of Tensor Symmetry in Operation Optimization of Tensor Contraction Expressions." (1 2012).
  • Clemons, T.; Parthasarathy, S.; Sadayappan, P. "GADBMS: A framework for scalable array analytics." (1 2012).
  • Holewinski,Justin; Ramamurthi,Ragavendar; Ravishankar,Mahesh; Fauzia,Naznin; Pouchet,Louis-Noel; Rountev,Atanas; Sadayappan,P "Dynamic Trace-Based Analysis of Vectorization Potential of Applications." in PLDI. (6 2012).
  • Stock,Kevin; Pouchet,Louis-Noel; Sadayappan,P "Using Machine Learning to Improve Automatic Vectorization." in HiPEAC. (1 2012).
  • Justin Holewinski, Louis-Noël Pouchet, and P. Sadayappan "High-performance code generation for stencil computations on GPU architectures." (6 2012).
  • Stock, K.; Pouchet, L-N.; Sadayappan, P. "Using Machine Learning to Improve Automatic Vectorization." (1 2012).
  • Niu, Q.; Dinan, J.; Tirukkovalur, S.; Mitas, L. et al. "A Global Address Space Approach to Automated Data Management for Parallel Quantum Monte Carlo Applications." in 19th International Conference on High Performance Computing (HiPC). (1 2012).
  • Holewinski, J.; Ramamurthi, R.; Ravishankar, M.; Fauzia, N. et al. "Dynamic trace-based analysis of vectorization potential of applications." (7 2012).
  • Jun Shirako, Kamal Sharma, Naznin Fauzia, Louis-Noël Pouchet, J. Ramanujam, P. Sadayappan, and Vivek Sarkar "Analytical Bounds for Optimal Tile Size Selection." (3 2012).
  • Holewinski, J.; Ramamurthi, R.; Ravishankar, M.; Fauzia, N. et al. "Dynamic trace-based analysis of vectorization potential of applications." (8 2012).
  • Niu, Q.; Dinan, J.; Lu, Q.; Sadayappan, P. et al. "PARDA: A Fast Parallel Reuse Distance Analysis Algorithm." (1 2012).
  • Lai, P.W.; Zhang, H.; Rajbhandari, S.; Valeev, E. et al. "Effective utilization of tensor symmetry in operation optimization of tensor contraction expressions." (1 2012).
  • Mahesh Ravishankar, John Eisenlohr, Louis-Noel Pouchet, J. Ramanujam, Atanas Rountev, and P. Sadayappan "Code Generation for Parallel Execution of a Class of Irregular Loops on Distributed Memory Systems." in International Conference for High Performance Computing, Networking, Storage and Analysis (SC'12). (11 2012).
  • Niu, Q.; Dinan, J.; Tirukkovalur, S.; Mitas, L. et al. "A global address space approach to automated data management for parallel Quantum Monte Carlo applications." (12 2012).
  • Lai, P-W.; Zhang, H.; Rajbhandari, S.; Valeev, E. et al. "Effective Utilization of Tensor Symmetry in Operation Optimization of Tensor Contraction Expressions." in International Conference on Computational Science (ICCS). (1 2012).
  • Niu, Q.; Dinan, J.; Lu, Q.; Sadayappan, P. "PARDA: A fast parallel reuse distance analysis algorithm." (10 2012).
  • Godwin, J.; Holewinski, J.; Sadayappan, P. "High-performance sparse matrix-vector multiplication on GPUs for structured grid computations." (3 2012).
  • Sadayappan, P.; Ramanujam, J. "Chairs' Welcome." (3 2012).
  • Ravishankar, M.; Eisenlohr, J.; Pouchet, L-N.; Ramanujam, J. et al. "Code Generation for Parallel Execution of a Class of Irregular Loops on Distributed Memory Systems." (1 2012).
  • Shirako, J.; Sharma, K.; Fauzia, N.; Pouchet, L-N. et al. "Analytical Bounds for Optimal Tile Size Selection." (1 2012).
  • Arafat, H.; Sadayappan, P.; Dinan, J.; Krishnamoorthy, S. et al. "Load Balancing of Dynamical Nucleation Theory Monte Carlo Simulations Through Resource Sharing Barriers." in 26th IEEE International Parallel and Distributed Processing Symposium (IPDPS) / Workshop on High Performance Data Intensive Computing. (1 2012).
  • Niu, Q.; Dinan, J.; Tirukkovalur, S.; Mitas, L. et al. "A Global Address Space Approach to Automated Data Management for Parallel Quantum Monte Carlo Applications." (1 2012).
  • Md. Humayun Arafat, James Dinan, Sriram Krishnamoorthy, Theresa Windus, and P. Sadayappan "Load Balancing of Dynamical Nucleation Theory Monte Carlo Simulations Through Resource Sharing Barriers." (5 2012).
  • Shirako, J.; Sharma, K.; Fauzia, N.; Pouchet, L-N. et al. "Analytical Bounds for Optimal Tile Size Selection." in 21st International Conference on Compiler Construction (CC). (1 2012).
  • Holewinski, J.; Ramamurthi, R.; Ravishankar, M.; Fauzia, N. et al. "Dynamic Trace-Based Analysis of Vectorization Potential of Applications." in 33rd ACM SIGPLAN Conference on Programming Language Design and Implementation. (6 2012).
  • Arafat, H.; Sadayappan, P.; Dinan, J.; Krishnamoorthy, S. et al. "Load Balancing of Dynamical Nucleation Theory Monte Carlo Simulations Through Resource Sharing Barriers." (1 2012).
  • Mahesh Ravishankar, John Eisenlohr, Louis-Noël Pouchet, J. Ramanujam, Atanas Rountev, and P. Sadayappan "Code generation for parallel execution of a class of irregular loops on distributed memory systems." (11 2012).
  • Ravishankar, M.; Eisenlohr, J.; Pouchet, L.N.; Ramanujam, J. et al. "Code generation for parallel execution of a class of irregular loops on distributed memory systems." (12 2012).
  • Arafat, H.; Sadayappan, P.; Dinan, J.; Krishnamoorthy, S. et al. "Load balancing of dynamical nucleation theory Monte Carlo simulations through resource sharing barriers." (10 2012).

2011

  • Sadayappan, P.; Krishnamoorthy, S.; Lai, P-W.; Pouchet, L-N. et al. "Optimization and performance-portable transformation of high level specifications of tensor contraction expressions." (8 2011).
  • Ali, N.; Krishnamoorthy, S.; Govind, N.; Kowalski, K. et al. "Application-Specific Fault Tolerance via Data Access Characterization." (1 2011).
  • Park, E.; Pouchet, L-N.; Cavazos, J.; Cohen, A. et al. "Predictive Modeling in a Polyhedral Optimization Space." (1 2011).
  • Stock, K.; Henretty, T.; Murugandi, I.; Sadayappan, P. et al. "Model-driven SIMD code generation for a multi-resolution tensor kernel." (10 2011).
  • Park, E.; Pouchet, L.N.; Cavazos, J.; Cohen, A. et al. "Predictive modeling in a polyhedral optimization space." (5 2011).
  • Sadayappan, P.; Krishnamoorthy, S.; Lai, P-W.; Pouchet, L-N. et al. "Optimization and performance-portable transformation of high level specifications of tensor contraction expressions." in 242nd National Meeting of the American-Chemical-Society (ACS). (8 2011).
  • Ali, N.; Krishnamoorthy, S.; Govind, N.; Kowalski, K. et al. "Application-Specific Fault Tolerance via Data Access Characterization." in 17th International Euro-Par Conference on Parallel Processing. (1 2011).
  • Minnich, R.G.; Janssen, C.L.; Krishnamoorthy, S.; Marquez, A. et al. "FOX: A fault-oblivious extreme scale execution environment." (12 2011).
  • Henretty, T.; Stock, K.; Pouchet, L-N.; Franchetti, F. et al. "Data Layout Transformation for Stencil Computations on Short-Vector SIMD Architectures." in 20th International Conference on Compiler Construction. (1 2011).
  • Pouchet, L-N.; Bondhugula, U.; Bastoul, C.; Cohen, A. et al. "Loop Transformations: Convexity, Pruning and Optimization." in 38th Symposium on Principles of Programming Languages. (1 2011).
  • Pouchet, L-N.; Bondhugula, U.; Bastoul, C.; Cohen, A. et al. "Loop Transformations: Convexity, Pruning and Optimization." (1 2011).
  • Park, E.; Pouchet, L-N.; Cavazos, J.; Cohen, A. et al. "Predictive Modeling in a Polyhedral Optimization Space." in 9th Annual IEEE/ACM International Symposium on Code Generation and Optimization (CGO). (1 2011).
  • Henretty, T.; Stock, K.; Pouchet, L-N.; Franchetti, F. et al. "Data Layout Transformation for Stencil Computations on Short-Vector SIMD Architectures." (1 2011).
  • Ali, N.; Krishnamoorthy, S.; Govind, N.; Kowalski, K. et al. "Application-specific fault tolerance via data access characterization." (9 2011).
  • Minnich, R.G.; Janssen, C.L.; Krishnamoorthy, S.; Marquez, A. et al. "Fault oblivious eXascale whitepaper." (7 2011).
  • Sedaghati, N.; Thomas, R.; Pouchet, L.N.; Teodorescu, R. et al. "StVEC: A vector instruction extension for high performance stencil computation." (12 2011).
  • Nawab Ali, Sriram Krishnamoorthy, Niranjan Govind, Karol Kowalski, and P. Sadayappan "Application-Specific Fault Tolerance via Data Access Characterization." (8 2011).
  • Sadayappan,P; Krishnamoorthy,Sriram; Lai,Pai-Wei; Pouchet,Louis-Noel; Stock,Kevin; Zhang,Huaijian "Optimization and performance-portable transformation of high level specifications of tensor contraction expressions." in 242nd National Meeting of the American-Chemical-Society (ACS). (8 2011).
  • Henretty, T.; Stock, K.; Pouchet, L.N.; Franchetti, F. et al. "Data layout transformation for stencil computations on short-vector SIMD architectures." (4 2011).
  • Sanket Tavarageri, Louis-Noël Pouchet, J. Ramanujam, Atanas Rountev, and P. Sadayappan "Dynamic selection of tile sizes." (12 2011).
  • Pouchet, L.N.; Bondhugula, U.; Bastoul, C.; Cohen, A. et al. "Loop transformations: Convexity, pruning and optimization." (1 2011).
  • Pouchet, LN; Bondhugula, U; Bastoul, C; Cohen, A; Ramanujam, J; Sadayappan, P; Vasilache, N "Loop Transformations: Convexity, Pruning and Optimization." in ACM Symposium on Principles of Programming Languages (POPL 2011). (1 2011).
  • Tavarageri, S.; Pouchet, L.N.; Ramanujam, J.; Rountev, A. et al. "Dynamic selection of tile sizes." (12 2011).
  • Pouchet,Louis-Noel; Bondhugula,Uday; Bastoul,Cedric; Cohen,Albert; Ramanujam,J; Sadayappan,P; Vasilache,Nicolas "Loop Transformations: Convexity, Pruning and Optimization." in 38th Symposium on Principles of Programming Languages. (1 2011).
  • Henretty,Tom; Stock,Kevin; Pouchet,Louis-Noel; Franchetti,Franz; Ramanujam,J; Sadayappan,P "Data Layout Transformation for Stencil Computations on Short-Vector SIMD Architectures." in 20th International Conference on Compiler Construction. (1 2011).
  • Eunjung Park, Louis-Noël Pouchet, John Cavazos, Albert Cohen, P. Sadayappan "Predictive modeling in a polyhedral optimization space." in International Symposium on Code Generation and Optimization (CGO 2011). (4 2011).
  • Kevin Stock, Thomas Henretty, Iyyappa Murugandi, P. Sadayappan, Robert Harrison "Model-Driven SIMD Code Generation for a Multi-Resolution Tensor Kernel." in IEEE International Parallel and Distributed Processing Symposium (IPDPS 2011). (5 2011).
  • Naser Sedaghati, Renji Thomas, Louis-Noel Pouchet, Radu Teodorescu, P. Sadayappan "StVEC: A Vector Instruction Extension for High Performance Stencil Computation." in International Conference on Parallel Architectures and Compilation Techniques (PACT). (10 2011).
  • Naser Sedaghati, Renji Thomas, Louis-Noël Pouchet, Radu Teodorescu, and P. Sadayappan "StVEC: A Vector Instruction Extension for High Performance Stencil Computation." in PACT. (10 2011).
  • Thomas Henretty, Kevin Stock, Louis-Noël Pouchet, Franz Franchetti, J. Ramanujam, P. Sadayappan "Data Layout Transformation for Stencil Computations on Short-Vector SIMD Architectures." in International Conference on Compiler Construction (CC 2011). (3 2011).
  • Park,Eunjung; Pouchet,Louis-Noel; Cavazos,John; Cohen,Albert; Sadayappan,P "Predictive Modeling in a Polyhedral Optimization Space." in 9th Annual IEEE/ACM International Symposium on Code Generation and Optimization (CGO). (1 2011).

2010

  • Muthu Manikandan Baskaran, J. Ramanujam, P. Sadayappan "Automatic C-to-CUDA code generation for affine programs." in 19th International Conference on Compiler Construction, CC 2010. (3 2010).
  • Baskaran, M.M.; Ramanujam, J.; Sadayappan, P. "Automatic C-to-CUDA Code Generation for Affine Programs." (1 2010).
  • A. Hartono, M. Baskaran, J. Ramanujam, and P. Sadayappan "Parametric Tiled Loop Generation for Effective Parallel Execution on Multicore Processors." in Proc. 24th IEEE International Parallel and Distributed Processing Symposium (IPDPS). (1 2010).
  • Rountev, A.; Van Valkenburgh, K.; Yan, D.; Sadayappan, P. "Understanding parallelism-inhibiting dependences in sequential Java programs." (12 2010).
  • Dinan, J.; Balaji, P.; Lusk, E.; Sadayappan, P. et al. "Hybrid parallel programming with MPI and unified parallel C." (7 2010).
  • Muthu Manikandan Baskaran, Albert Hartono, Sanket Tavarageri, Thomas Henretty, J. Ramanujam, P. Sadayappan "Parameterized tiling revisited." in 8th International Symposium on Code Generation and Optimization (CGO 2010). (4 2010).
  • Baskaran, M.M.; Ramanujam, J.; Sadayappan, P. "Automatic C-to-CUDA Code Generation for Affine Programs." in 19th International Conference on Compiler Construction held at the 13th Joint European Conference on Theory and Practice of Software. (1 2010).
  • Devulapalli, A.; Dalessandro, D.; Wyckoff, P.; Ali, N. et al. "Integrating Parallel File Systems with Object-Based Storage Devices." in ACM/IEEE SC07 Conference. (1 2010).
  • Pouchet, L.N.; Bondhugula, U.; Bastoul, C.; Cohen, A. et al. "Loop transformations: Convexity, pruning and optimization." (12 2010).
  • Baskaran, M.M.; Hartono, A.; Tavarageri, S.; Henretty, T. et al. "Parameterized tiling revisited." (7 2010).
  • Dinan, J.; Balaji, P.; Lusk, E.; Sadayappan, P. et al. "Hybrid Parallel Programming with MPI and Unified Parallel C." in 7th ACM International Conference on Computing Frontiers (CF). (1 2010).
  • Murthy, G.S.; Ravishankar, M.; Baskaran, M.M.; Sadayappan, P. "Optimal loop unrolling for GPGPU programs." (7 2010).
  • Pouchet, L.N.; Bondhugula, U.; Bastoul, C.; Cohen, A. et al. "Combined iterative and model-driven optimization in an automatic parallelization framework." (12 2010).
  • Dinan, J.; Singri, A.; Sadayappan, P.; Krishnamoorthy, S. "Selective recovery from failures in a task parallel programming model." (7 2010).
  • Leung, V.J.; Sabin, G.; Sadayappan, P. "Parallel job scheduling policies to improve fairness: A case study." (12 2010).
  • Rountev, A.; Van Valkenburgh, K.; Yan, D.; Sadayappan, P. "Understanding Parallelism-Inhibiting Dependences in Sequential Java Programs." in International Conference on Software Maintenance. (1 2010).
  • Hartono, A.; Baskaran, M.M.; Ramanujam, J.; Sadayappan, P. "DynTile: Parametric tiled loop generation for parallel execution on multicore processors." (7 2010).
  • Louis-Noël Pouchet, Uday Bondhugula, Cédric Bastoul, Albert Cohen, J. Ramanujam, P. Sadayappan "Combined Iterative and Model-driven Optimization in an Automatic Parallelization Framework." in Supercomputing 2010 (SC 2010). (11 2010).
  • Baskaran, M.M.; Ramanujam, J.; Sadayappan, P. "Automatic C-to-CUDA code generation for affine programs." (5 2010).
  • Atanas Rountev, Kevin Van Valkenburgh, Dacong Yan, and P. Sadayappan "Understanding Parallelism-Inhibiting Dependences in Sequential Java Programs." in IEEE International Conference on Software Maintenance (ICSM'10). (9 2010).
  • Devulapalli, A.; Dalessandro, D.; Wyckoff, P.; Ali, N. et al. "Integrating Parallel File Systems with Object-Based Storage Devices." (1 2010).
  • Baskaran, M.M.; Hartono, A.; Tavarageri, S.; Henretty, T. et al. "Parameterized Tiling Revisited." (1 2010).
  • Rountev, A.; Van Valkenburgh, K.; Yan, D.; Sadayappan, P. et al. "Understanding Parallelism-Inhibiting Dependences in Sequential Java Programs." (1 2010).
  • James Dinan, Pavan Balaji, Ewing Lusk, Rajeev Thakur, P. Sadayappan "Hybrid parallel programming with MPI and Unified Parallel C." in Proceedings of ACM International Conference on Computing Frontiers, CF 2010. (5 2010).
  • Giridhar Sreenivasa Murthy, Mahesh Ravishankar, Muthu Manikandan Baskaran, P. Sadayappan "Optimal loop unrolling for GPGPU programs." in Proceedings of IEEE International Parallel & Distributed Processing Symposium, IPDPS 2010. (4 2010).
  • Atanas Rountev, Kevin Van Valkenburgh, Dacong Yan, P. Sadayappan "Understanding parallelism-inhibiting dependences in sequential Java programs." in International Conference on Software Maintenance (ICSM 2010). (9 2010).
  • Dinan, J.; Balaji, P.; Lusk, E.; Sadayappan, P. et al. "Hybrid Parallel Programming with MPI and Unified Parallel C." (1 2010).
  • Baskaran, M.M.; Hartono, A.; Tavarageri, S.; Henretty, T. et al. "Parameterized Tiling Revisited." in 8th International Symposium on Code Generation and Optimization. (1 2010).

2009

  • James Dinan, D. Brian Larkins, P. Sadayappan, Sriram Krishnamoorthy, Jarek Nieplocha "Scalable work stealing." in Proceedings of the ACM/IEEE Conference on High Performance Computing, SC 2009. (11 2009).
  • Hartono, A.; Norris, B.; Sadayappan, P. "Annotation-Based Empirical Performance Tuning Using Orio." in 23rd IEEE International Parallel and Distributed Processing Symposium. (1 2009).
  • Lin, J.; Lu, Q.; Ding, X.; Zhang, Z. et al. "Enabling Software Management for Multicore Caches with a Lightweight Hardware Support." (1 2009).
  • Albert Hartono, Muthu Manikandan Baskaran, Cédric Bastoul, Albert Cohen, Sriram Krishnamoorthy, Boyana Norris, J. Ramanujam, P. Sadayappan "Parametric multi-level tiling of imperfectly nested loops." in Proceedings of the International Conference on Supercomputing. (6 2009).
  • Ali, N.; Carns, P.; Iskra, K.; Kimpe, D. et al. "Scalable I/O Forwarding Framework for High-Performance Computing Systems." in IEEE International Conference on Cluster Computing (Cluster 2009). (1 2009).
  • Ali, N.; Carns, P.; Iskra, K.; Kimpe, D. et al. "Scalable I/O forwarding framework for high-performance computing systems." (12 2009).
  • Albert Hartono, Boyana Norris, and P. Sadayappan "Annotation-based empirical performance tuning using Orio." in Proceedings of the 2009 IEEE International Parallel and Distributed Processing Symposium. (5 2009).
  • Lin, J.; Lu, Q.; Ding, X.; Zhang, Z. et al. "Enabling Software Management for Multicore Caches with a Lightweight Hardware Support." in Conference on High Performance Computing Networking, Storage and Analysis. (1 2009).
  • Ali, N.; Carns, P.; Iskra, K.; Kimpe, D. et al. "Scalable I/O Forwarding Framework for High-Performance Computing Systems." (1 2009).
  • Baskaran, M.M.; Vydyanathan, N.; Bondhugula, U.K.; Ramanujam, J. et al. "Compiler-Assisted Dynamic Scheduling for Effective Parallelization of Loop Nests on Multicore Processors." in 14th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. (4 2009).
  • Baskaran, M.M.; Vydyanathan, N.; Bondhugula, U.K.; Ramanujam, J. et al. "Compiler-Assisted Dynamic Scheduling for Effective Parallelization of Loop Nests on Multicore Processors." (4 2009).
  • Lu, Q.; Lin, J.; Ding, X.; Zhang, Z. et al. "Soft-OLP: Improving hardware cache performance through software-controlled object-level partitioning." (11 2009).
  • Hartono, A.; Norris, B.; Sadayappan, P. "Annotation-based empirical performance tuning using Orio." (11 2009).
  • Dinan, J.; Larkins, D.B.; Sadayappan, P.; Krishnamoorthy, S. et al. "Scalable work stealing." (12 2009).
  • Hartono, A.; Norris, B.; Sadayappan, P. "Annotation-Based Empirical Performance Tuning Using Orio." (1 2009).
  • Lu, Q.; Alias, C.; Bondhugula, U.; Henretty, T. et al. "Data layout transformation for enhancing data locality on NUCA chip multiprocessors." (11 2009).
  • Lin, J.; Lu, Q.; Ding, X.; Zhang, Z. et al. "Enabling software management for multicore caches with a lightweight hardware support." (12 2009).
  • Lu, Q.; Lin, J.; Ding, X.; Zhang, Z. et al. "Soft-OLP: Improving Hardware Cache Performance Through Software-Controlled Object-Level Partitioning." in 18th International Conference on Parallel Architectures and Compilation Techniques. (1 2009).
  • Hartono, A.; Baskaran, M.M.; Bastoul, C.; Cohen, A. et al. "Parametric multi-level tiling of imperfectly nested loops." (11 2009).
  • Lu, Q.; Lin, J.; Ding, X.; Zhang, Z. et al. "Soft-OLP: Improving Hardware Cache Performance Through Software-Controlled Object-Level Partitioning." (1 2009).
  • N. Vydyanathan, S. Krishnamoorthy, G. Sabin, Ü.V. Çatalyürek, T. Kurc, P. Sadayappan, J. Saltz "An Integrated Approach to Locality Conscious Processor Allocation and Scheduling of Mixed Parallel Applications." (8 2009).
  • Baskaran, M.M.; Vydyanathan, N.; Bondhugula, U.K.; Ramanujam, J. et al. "Compiler-assisted dynamic scheduling for effective parallelization of loop nests on multicore processors." (7 2009).
  • Muthu Baskaran, Nagavijayalakshmi Vydyanathan, Uday Bondhugula, J. Ramanujam, Atanas Rountev, and P. Sadayappan "Compiler-Assisted Dynamic Scheduling for Effective Parallelization of Loop Nests on Multicore Processors." in ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP'09). (2 2009).
  • Qingda Lu, Jiang Lin, Xiaoning Ding, Zhao Zhang, Xiaodong Zhang, P. Sadayappan "Soft-OLP: Improving Hardware Cache Performance through Software-Controlled Object-Level Partitioning." in Proceedings of the 18th International Conference on Parallel Architectures and Compilation Techniques, PACT 2009. (9 2009).
  • Baskaran, M.M.; Vydyanathan, N.; Bondhugula, U.K.; Ramanujam, J. et al. "Compiler-assisted dynamic scheduling for effective parallelization of loop nests on multicore processors." (11 2009).
  • Nawab Ali, Philip H. Carns, Kamil Iskra, Dries Kimpe, Samuel Lang, Robert Latham, Robert B. Ross, Lee Ward, P. Sadayappan "Scalable I/O forwarding framework for high-performance computing systems." in Proceedings of the 2009 IEEE International Conference on Cluster Computing. (9 2009).
  • Dinan, J.; Larkins, D.B.; Sadayappan, P.; Krishnamoorthy, S. et al. "Scalable Work Stealing." in Conference on High Performance Computing Networking, Storage and Analysis. (1 2009).
  • Dinan, J.; Larkins, D.B.; Sadayappan, P.; Krishnamoorthy, S. et al. "Scalable Work Stealing." (1 2009).
  • Vijay S. Kumar, P. Sadayappan, Gaurang Mehta, Karan Vahi, Ewa Deelman, Varun Ratnakar, Jihie Kim, Yolanda Gil, Mary W. Hall, Tahsin M. Kurç, Joel H. Saltz "An integrated framework for performance-based optimization of scientific workflows." in Proceedings of the 18th ACM International Symposium on High Performance Distributed Computing, HPDC 2009. (6 2009).
  • Qingda Lu, Christophe Alias, Uday Bondhugula, Thomas Henretty, Sriram Krishnamoorthy, J. Ramanujam, Atanas Rountev, P. Sadayappan, Yongjian Chen, Haibo Lin, Tin-fook Ngai "Data Layout Transformation for Enhancing Data Locality on NUCA Chip Multiprocessors." in Proceedings of the 18th International Conference on Parallel Architectures and Compilation Techniques, PACT 2009. (9 2009).
  • Lu, Q.; Alias, C.; Bondhugula, U.; Henretty, T. et al. "Data Layout Transformation for Enhancing Data Locality on NUCA Chip Multiprocessors." in 18th International Conference on Parallel Architectures and Compilation Techniques. (1 2009).
  • Hartono, A.; Baskaran, M.M.; Bastoul, C.; Cohen, A. et al. "Parametric Multi-Level Tiling of Imperfectly Nested Loops." in ACM SIGARCH International Conference on Supercomputing. (1 2009).
  • Hartono, A.; Baskaran, M.M.; Bastoul, C.; Cohen, A. et al. "Parametric Multi-Level Tiling of Imperfectly Nested Loops." (1 2009).
  • Lu, Q.; Alias, C.; Bondhugula, U.; Henretty, T. et al. "Data Layout Transformation for Enhancing Data Locality on NUCA Chip Multiprocessors." (1 2009).
  • Jiang Lin, Qingda Lu, Xiaoning Ding, Zhao Zhang, Xiaodong Zhang, P. Sadayappan "Enabling software management for multicore caches with a lightweight hardware support." in Proceedings of the ACM/IEEE Conference on High Performance Computing, SC 2009. (11 2009).
  • Muthu Manikandan Baskaran, Nagavijayalakshmi Vydyanathan, Uday Bondhugula, J. Ramanujam, Atanas Rountev "Compiler-assisted dynamic scheduling for effective parallelization of loop nests on multicore processors." in Proceedings of the 14th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPOPP 2009. (2 2009).

2008

  • Ali, N.; Devulapalli, A.; Dalessandro, D.; Wyckoff, P. et al. "Revisiting the metadata architecture of parallel file systems." (12 2008).
  • D. Brian Larkins, J. Dinan, S. Krishnamoorthy, S. Parthasarathy, A. Rountev and P. Sadayappan "Global trees: a framework for linked data structures on distributed memory parallel systems." in Proceedings of the ACM/IEEE Conference on High Performance Computing, SC 2008. http://www.informatik.uni-trier.de/~ley/db/conf/sc/sc2008.html#LarkinsDKPRS08 (1 2008).
  • Ali, N.; Devulapalli, A.; Dalessandro, D.; Wyckoff, P. et al. "An OSD-based Approach to Managing Directory Operations in Parallel File Systems." in IEEE International Conference on Cluster Computing. (1 2008).
  • Nieplocha, J.; Krishnamoorthy, S.; Valiev, M.; Krishnan, M. et al. "Integrated data and task management for scientific applications." in 8th International Conference on Computational Science. (1 2008).
  • Khanna, G.; Catalyurek, U.; Kurc, T.; Kettimuthu, R. et al. "Using Overlays For Efficient Data Transfer Over Shared Wide-Area Networks." in International Conference for High Performance Computing, Networking, Storage and Analysis. (1 2008).
  • Ali, N.; Devulapalli, A.; Dalessandro, D.; Wyckoff, P. et al. "Revisiting the Metadata Architecture of Parallel File Systems." in 3rd Petascale Data Storage Workshop 2008. (1 2008).
  • Muthu Baskaran, Uday Bondhugula, Sriram Krishnamoorthy, J. Ramanujam, Atanas Rountev, and P. Sadayappan "Automatic Data Movement and Computation Mapping for Multi-level Parallel Architectures with Explicitly Managed Memories." in ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP'08). (2 2008).
  • Desai, N.; Balaji, P.; Sadayappan, P.; Islam, M. "Are Nonblocking Networks Really Needed for High-End-Computing Workloads?" in IEEE International Conference on Cluster Computing. (1 2008).
  • Lin, J.; Lu, Q.; Ding, X.; Zhang, Z. et al. "Gaining insights into multicore cache partitioning: Bridging the gap between simulation and real systems." (12 2008).
  • Bondhugula, U.; Hartono, A.; Ramanujam, J.; Sadayappan, P. "A practical automatic polyhedral parallelizer and locality optimizer." in ACM SIGPLAN Conference on Programming Language Design and Implementation. (6 2008).
  • Ali, N.; Devulapalli, A.; Dalessandro, D.; Wyckoff, P. et al. "Revisiting the Metadata Architecture of Parallel File Systems." (1 2008).
  • Desai, N.; Balaji, P.; Sadayappan, P.; Islam, M. "Are nonblocking networks really needed for high-end-computing workloads?" (1 2008).
  • Bondhugula, U.; Hartono, A.; Ramanujam, J.; Sadayappan, P. "A practical automatic polyhedral parallelizer and locality optimizer." (6 2008).
  • Khanna, G.; Catalyurek, U.; Kurc, T.; Kettimuthu, R. et al. "A dynamic scheduling approach for coordinated wide-area data transfers using GridFTP." in 10th Workshop on Advances in Parallel and Distributed Computational Models/22nd IEEE International Parallel and Distributed Processing Symposium. (1 2008).
  • G. Khanna, U. Catalyurek, T. Kurc, R. Kettimuthu, P. Sadayappan, and J. H. Saltz "A dynamic scheduling approach for coordinated wide-area data transfers using GridFTP." in Proc. 22nd IEEE International Parallel and Distributed Processing Symposium (IPDPS). (4 2008).
  • Dinan, J.; Krishnamoorthy, S.; Larkins, D.B.; Nieplocha, J. et al. "Scioto: A framework for global-view task parallelism." (11 2008).
  • Uday Bondhugula, Muthu Baskaran, Sriram Krishnamoorthy, J. Ramanujam, A. Rountev, and P. Sadayappan "Automatic Transformations for Communication-Minimized Parallelization and Locality Optimization in the Polyhedral Model." in International Conference on Compiler Construction (CC'08). (4 2008).
  • Khanna, G.; Catalyurek, U.; Kurc, T.; Kettimuthu, R. et al. "Using overlays for efficient data transfer over shared wide-area networks." (12 2008).
  • Baskaran, M.M.; Bondhugula, U.; Krishnamoorthy, S.; Ramanujam, J. et al. "A Compiler Framework for Optimization of Affine Loop Nests for GPGPUs." in 22nd ACM International Conference on Supercomputing. (1 2008).
  • M. Baskaran, U. Bondhugula, J. Ramanujam, A. Rountev, and P. Sadayappan "A compiler framework for optimization of affine loop nests for GPGPUs." in ACM International Conference on Supercomputing (ICS). (1 2008).
  • U. Bondhugula, A. Hartono, J. Ramanujam, and P. Sadayappan "A Practical Automatic Polyhedral Parallelizer and Locality Optimizer." in ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI). (1 2008).
  • Larkins, D.B.; Dinan, J.; Krishnamoorthy, S.; Parthasarathy, S. et al. "Global Trees: A Framework for Linked Data Structures on Distributed Memory Parallel Systems." in International Conference for High Performance Computing, Networking, Storage and Analysis. (1 2008).
  • D. B. Larkins, J. Dinan, S. Krishnamoorthy, S. Parthasarathy, A. Rountev, and P. Sadayappan "Global trees: a framework for linked data structures on distributed memory parallel systems." in Proc. Supercomputing (SC). (11 2008).
  • Khanna, G.; Catalyurek, U.; Kurc, T.; Kettimuthu, R. et al. "Using Overlays For Efficient Data Transfer Over Shared Wide-Area Networks." (1 2008).
  • G. Khanna, U. Catalyurek, T. Kurc, R. Kettimuthu, P. Sadayappan, I. Foster, J. H. Saltz "Using overlays for efficient data transfer over shared wide-area networks." in Proc. Supercomputing (SC). (11 2008).
  • Vydyanathan, N.; Catalyurek, U.; Kurc, T.; Sadayappan, P. et al. "A duplication based algorithm for optimizing latency under throughput constraints for streaming workflows." (11 2008).
  • Bondhugula, U.; Baskaran, M.; Krishnamoorthy, S.; Ramanujam, J. et al. "Automatic transformations for communication-minimized parallelization and locality optimization in the polyhedral model." in 17th International Conference on Compiler Construction. (1 2008).
  • Khanna, G.; Catalyurek, U.; Kurc, T.; Sadayappan, P. et al. "Multi-hop path splitting and multi-pathing optimizations for data transfers over shared wide-area networks using GridFTP." (12 2008).
  • Baskaran, M.M.; Bondhugula, U.; Krishnamoorthy, S.; Ramanujam, J. et al. "Automatic Data Movement and Computation Mapping for Multi-level Parallel Architectures with Explicitly Managed Memories." in ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP 08). (1 2008).
  • Bondhugula, U.; Baskaran, M.; Hartono, A.; Krishnamoorthy, S. et al. "Towards effective automatic parallelization for multicore systems." in 10th Workshop on Advances in Parallel and Distributed Computational Models/22nd IEEE International Parallel and Distributed Processing Symposium. (1 2008).
  • Baskaran, M.M.; Bondhugula, U.; Krishnamoorthy, S.; Ramanujam, J. et al. "A Compiler Framework for Optimization of Affine Loop Nests for GPGPUs." (1 2008).
  • Bondhugula, U.; Baskaran, M.; Hartono, A.; Krishnamoorthy, S. et al. "Towards effective automatic parallelization for multicore systems." (1 2008).
  • Baskaran, M.M.; Bondhugula, U.; Krishnamoorthy, S.; Ramanujam, J. et al. "Automatic Data Movement and Computation Mapping for Multi-level Parallel Architectures with Explicitly Managed Memories." (1 2008).
  • Bondhugula, U.; Baskaran, M.; Krishnamoorthy, S.; Ramanujam, J. et al. "Automatic transformations for communication-minimized parallelization and locality optimization in the polyhedral model." (1 2008).
  • Ali, N.; Devulapalli, A.; Dalessandro, D.; Wyckoff, P. et al. "An OSD-based Approach to Managing Directory Operations in Parallel File Systems." (1 2008).
  • Lin, J.; Lu, Q.; Ding, X.; Zhang, Z. et al. "Gaining Insights into Multicore Cache Partitioning: Bridging the Gap between Simulation and Real Systems." (1 2008).
  • Desai, N.; Balaji, P.; Sadayappan, P.; Islam, M. et al. "Are Nonblocking Networks Really Needed for High-End-Computing Workloads?" (1 2008).
  • Khanna, G.; Catalyurek, U.; Kurc, T.; Kettimuthu, R. et al. "A dynamic scheduling approach for coordinated wide-area data transfers using GridFTP." (1 2008).
  • M. Baskaran, U. Bondhugula, S. Krishnamoorthy, J. Ramanujam, A. Rountev, and P. Sadayappan "Automatic data movement and computation mapping for multi-level parallel architectures with explicitly managed memories." in 13th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP). (1 2008).
  • Khanna, G.; Catalyurek, U.; Kurc, T.; Kettimuthu, R. et al. "A dynamic scheduling approach for coordinated wide-area data transfers using GridFTP." (9 2008).
  • G. Khanna, Ü.V. Çatalyürek, T. Kurc, R. Kettimuthu, P. Sadayappan, I. Foster, J. Saltz "Multi-Hop Path Splitting and Multi-Pathing Optimizations for Data Transfers Over Shared Wide-Area Networks Using GridFTP." (1 2008).
  • Bondhugula, U.; Hartono, A.; Ramanujam, J.; Sadayappan, P. "A practical automatic polyhedral parallelizer and locality optimizer." (12 2008).
  • Larkins, D.B.; Dinan, J.; Krishnamoorthy, S.; Parthasarathy, S. et al. "Global trees: A framework for linked data structures on distributed memory parallel systems." (12 2008).
  • Nieplocha, J.; Krishnamoorthy, S.; Valiev, M.; Krishnan, M. et al. "Integrated data and task management for scientific applications." (1 2008).
  • J. Lin, Q. Lu, X. Ding, Z. Zhang, X. Zhang, and P. Sadayappan "Gaining insights into multicore cache partitioning: Bridging the gap between simulation and real systems." in 14th International Symposium on High-Performance Computer Architecture (HPCA). (2 2008).
  • Nieplocha, J.; Krishnamoorthy, S.; Valiev, M.; Krishnan, M. et al. "Integrated data and task management for scientific applications." (7 2008).
  • Bondhugula, U.; Baskaran, M.; Krishnamoorthy, S.; Ramanujam, J. et al. "Automatic transformations for communication-minimized parallelization and locality optimization in the polyhedral model." (7 2008).
  • Baskaran, M.M.; Ramanujam, J.; Bondhugula, U.; Rountev, A. et al. "Automatic data movement and computation mapping for multi-level parallel architectures with explicitly managed memories." (12 2008).
  • Larkins, D.B.; Dinan, J.; Krishnamoorthy, S.; Parthasarathy, S. et al. "Global Trees: A Framework for Linked Data Structures on Distributed Memory Parallel Systems." (1 2008).
  • Bondhugula, U.; Baskaran, M.; Hartono, A.; Krishnamoorthy, S. et al. "Towards effective automatic parallelization for multicore systems." (9 2008).
  • U. Bondhugula, M. Baskaran, S. Krishnamoorthy, J. Ramanujam, A. Rountev and P. Sadayappan "Automatic transformations for communication-minimized parallelization and locality optimization in the polyhedral model." in International Conference on Compiler Construction (CC). (1 2008).
  • Baskaran, M.M.; Bondhugula, U.; Krishnamoorthy, S.; Ramanujam, J. et al. "A compiler framework for optimization of affine loop nests for GPGPUs." (12 2008).
  • Ali, N.; Devulapalli, A.; Dalessandro, D.; Wyckoff, P. et al. "An OSD-based approach to managing directory operations in parallel file systems." (1 2008).
  • Lin, J.; Lu, Q.; Ding, X.; Zhang, Z. et al. "Gaining Insights into Multicore Cache Partitioning: Bridging the Gap between Simulation and Real Systems." in 14th International Symposium on High-Performance Computer Architecture. (1 2008).

2007

  • Bondhugula, U.; Ramanujam, J.; Sadayappan, P. "Automatic mapping of nested loops to FPGAs." (10 2007).
  • Sabin, G.; Lang, M.; Sadayappan, P. "Moldable parallel job scheduling using job efficiency: An iterative approach." (12 2007).
  • Olivier, S.; Huan, J.; Liu, J.; Prins, J. et al. "UTS: An unbalanced tree search benchmark." (12 2007).
  • S. Krishnamoorthy, M. Baskaran, U. Bondhugula, J. Ramanujam, A. Rountev, and P. Sadayappan "Effective automatic parallelization of stencil computations." in ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI). (1 2007).
  • Islam, M.; Balaji, P.; Sabin, G.; Sadayappan, P. "Analyzing and minimizing the impact of opportunity cost in QoS-aware job scheduling." (12 2007).
  • Dinan, J.; Olivier, S.; Sabin, G.; Prins, J. et al. "Dynamic load balancing of unbalanced computations using message passing." (9 2007).
  • Islam, M.; Balaji, P.; Sabin, G.; Sadayappan, P. "Analyzing and Minimizing the Impact of Opportunity Cost in QoS-aware Job Scheduling." in 36th Annual International Conference on Parallel Processing (ICPP 2007). (1 2007).
  • Bondhugula, U.; Ramanujam, J.; Sadayappan, P. "Automatic Mapping of Nested Loops to FPGAs." (1 2007).
  • Olivier, S.; Huan, J.; Liu, J.; Prins, J. et al. "UTS: An unbalanced tree search benchmark." (1 2007).
  • Sabin, G.; Lang, M.; Sadayappan, P. "Moldable parallel job scheduling using job efficiency: An iterative approach." (1 2007).
  • Sriram Krishnamoorthy, Muthu Baskaran, Uday Bondhugula, J. Ramanujam, Atanas Rountev, and P. Sadayappan "Effective Automatic Parallelization of Stencil Computations." in ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI'07). (6 2007).
  • G. Khanna, Ü.V. Çatalyürek, T. Kurc, R. Kettimuthu, P. Sadayappan, I. Foster, J. Saltz "Scheduling File Transfers for Data-Intensive Jobs on Heterogeneous Clusters." (1 2007).
  • Islam, M.; Balaji, P.; Sabin, G.; Sadayappan, P. et al. "Analyzing and Minimizing the Impact of Opportunity Cost in QoS-aware Job Scheduling." (1 2007).
  • Khanna, G.; Catalyurek, U.; Kurc, T.; Sadayappan, P. et al. "Scheduling file transfers for data-intensive jobs on heterogeneous clusters." (1 2007).
  • Krishnamoorthy, S.; Catalyurek, U.; Nieplocha, J.; Rountev, A. et al. "A global address space framework for locality aware scheduling of block-sparse computations." (9 2007).
  • Krishnamoorthy, S.; Canovas, J.P.; Tipparaju, V.; Nieplocha, J. et al. "Non-collective parallel I/O for global address space programming models." (12 2007).
  • Khanna, G.; Catalyurek, U.; Kurc, T.; Sadayappan, P. et al. "A data locality aware online scheduling approach for I/O-intensive jobs with file sharing." (12 2007).
  • Krishnamoorthy, S.; Baskaran, M.; Bondhugula, U.; Ramanujam, J. et al. "Effective automatic parallelization of stencil computations." (6 2007).
  • Vydyanathan, N.; Catalyurek, U.V.; Kurc, T.M.; Sadayappan, P. et al. "Toward optimizing latency under throughput constraints for application workflows on clusters." (12 2007).
  • Krishnamoorthy, S.; Canovas, J.P.; Tipparaju, V.; Nieplocha, J. et al. "Non-collective Parallel I/O for Global Address Space Programming Models." in IEEE International Conference on Cluster Computing. (1 2007).
  • Krishnamoorthy, S.; Baskaran, M.; Bondhugula, U.; Ramanujam, J. et al. "Effective automatic parallelization of stencil computations." in Conference on Programming Language Design and Implementation. (6 2007).
  • Khanna, G.; Catalyurek, U.; Kurc, T.; Sadayappan, P. et al. "Scheduling file transfers for data-intensive jobs on heterogeneous clusters." (12 2007).
  • Vydyanathan, N.; Catalyurek, U.V.; Kurc, T.M.; Sadayappan, P. et al. "Toward optimizing latency under throughput constraints for application workflows on clusters." in 13th International Euro-Par Conference on Parallel Processing. (1 2007).
  • Krishnamoorthy, S.; Baskaran, M.; Bondhugula, U.; Ramanujam, J. et al. "Effective Automatic Parallelization of Stencil Computations." in Conference on Programming Language Design and Implementation. (1 2007).
  • Bondhugula, U.; Ramanujam, J.; Sadayappan, P. "Automatic Mapping of Nested Loops to FPGAs." in ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming. (1 2007).
  • Khanna, G.; Catalyurek, U.; Kurc, T.; Sadayappan, P. et al. "Scheduling file transfers for data-intensive jobs on heterogeneous clusters." in 13th International Euro-Par Conference on Parallel Processing. (1 2007).
  • Olivier, S.; Huan, J.; Liu, J.; Prins, J. et al. "UTS: An unbalanced tree search benchmark." in 19th International Workshop on Languages and Compilers for Parallel Computing. (1 2007).
  • Krishnamoorthy, S.; Baskaran, M.; Bondhugula, U.; Ramanujam, J. et al. "Effective automatic parallelization of stencil computations." (10 2007).
  • Vydyanathan, N.; Catalyurek, U.V.; Kurc, T.M.; Sadayappan, P. et al. "Toward optimizing latency under throughput constraints for application workflows on clusters." (1 2007).
  • Krishnamoorthy, S.; Canovas, J.P.; Tipparaju, V.; Nieplocha, J. et al. "Non-collective Parallel I/O for Global Address Space Programming Models." (1 2007).
  • U. Bondhugula, J. Ramanujam, and P. Sadayappan "Automatic Mapping of Nested Loops to FPGAs." in Proc. ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPoPP). (3 2007).
  • Khanna, G.; Catalyurek, U.; Kurc, T.; Sadayappan, P. et al. "A data locality aware online scheduling approach for I/O-intensive jobs with file sharing." in 12th International Workshop on Job Scheduling Strategies for Parallel Processing. (1 2007).
  • Khanna, G.; Catalyurek, U.; Kurc, T.; Sadayappan, P. et al. "A data locality aware online scheduling approach for I/O-intensive jobs with file sharing." (1 2007).
  • Krishnamoorthy, S.; Baskaran, M.; Bondhugula, U.; Ramanujam, J. et al. "Effective Automatic Parallelization of Stencil Computations." (1 2007).
  • Sabin, G.; Lang, M.; Sadayappan, P. "Moldable parallel job scheduling using job efficiency: An iterative approach." in 12th International Workshop on Job Scheduling Strategies for Parallel Processing. (1 2007).
  • Devulapalli, A.; Dalessandro, D.; Wyckoff, P.; Ali, N. et al. "Integrating parallel file systems with object-based storage devices." (12 2007).

2006

  • Shet, A.G.; Sadayappan, P.; Bernholdt, D.E.; Nieplocha, J. et al. "A performance instrumentation framework to characterize computation-communication overlap in message-passing systems." (1 2006).
  • Vydyanathan, N.; Krishnamoorthy, S.; Sabin, G.; Catalyurek, U. et al. "An Integrated Approach for Processor Allocation and Scheduling of Mixed-Parallel Applications." (12 2006).
  • Khanna, G.; Vydyanathan, N.; Catalyurek, U.; Kurc, T. et al. "Task scheduling and file replication for data-intensive jobs with batch-shared I/O." (12 2006).
  • Vydyanathan, N.; Krishnamoorthy, S.; Sabin, G.; Catalyurek, U. et al. "Locality conscious processor allocation and scheduling for mixed parallel applications." (12 2006).
  • Krishnamoorthy, S.; Catalyurek, U.; Nieplocha, J.; Sadayappan, P. "An approach to locality-conscious load balancing and transparent memory hierarchy management with a global-address-space parallel programming model." (1 2006).
  • Vydyanathan, N.; Khanna, G.; Catalyurek, U.; Kurc, T. et al. "Scheduling of tasks with batch-shared I/O on heterogeneous systems." (1 2006).
  • Bondhugula, U.; Devulapalli, A.; Fernando, J.; Wyckoff, P. et al. "Parallel FPGA-based all-pairs shortest-paths in a directed graph." (12 2006).
  • Krishnamoorthy, S.; Baumgartner, G.; Lam, C-C.; Nieplocha, J. et al. "Layout transformation support for the disk resident arrays framework." in 5th Symposium of the Los-Alamos-Computer-Science-Institute. (5 2006).
  • U. Bondhugula, A. Devulapalli, J. Dinan, J. Fernando, P. Wyckoff, E. Stahlberg, and P. Sadayappan. "Hardware/Software Integration for FPGA-based All-Pairs Shortest-Paths." in IEEE Symposium on Field-Programmable Custom Computing Machines (FCCM). (4 2006).
  • S. Krishnamoorthy, U. Catalyurek, J. Nieplocha, A. Rountev and P. Sadayappan "Hypergraph Partitioning for Automatic Memory Hierarchy Management." in Proc. Supercomputing (SC). (11 2006).
  • Hartono, A.; Lu, Q.; Gao, X.; Krishnamoorthy, S. et al. "Identifying cost-effective common subexpressions to reduce operation count in tensor contraction evaluations." in 6th International Conference on Computational Science (ICCS 2006). (1 2006).
  • Krishnan, S.; Krishnamoorthy, S.; Baumgartner, G.; Lam, C.C. et al. "Efficient synthesis of out-of-core algorithms using a nonlinear optimization solver." in 18th International Parallel and Distributed Processing Symposium (IPDPS 2004). (5 2006).
  • Vydyanathan, N.; Krishnamoorthy, S.; Sabin, G.; Catalyurek, U. et al. "An integrated approach for processor allocation and scheduling of mixed-parallel applications." in 35th International Conference on Parallel Processing. (1 2006).
  • Shet, A.G.; Sadayappan, P.; Bernholdt, D.E.; Nieplocha, J. et al. "A performance instrumentation framework to characterize computation-communication overlap in message-passing systems." (12 2006).
  • Vydyanathan, N.; Krishnamoorthy, S.; Sabin, G.; Catalyurek, U. et al. "Locality conscious processor allocation and scheduling for mixed parallel applications." in IEEE International Conference on Cluster Computing. (1 2006).
  • Krishnamoorthy, S.; Catalyurek, U.; Nieplocha, J.; Rountev, A. et al. "An extensible global address space framework with decoupled task and data abstractions." (1 2006).
  • Krishnamoorthy, S.; Catalyurek, U.; Nieplocha, J.; Rountev, A. et al. "Hypergraph partitioning for automatic memory hierarchy management." (12 2006).
  • Krishnamoorthy, S.; Baumgartner, G.; Lam, C-C.; Nieplocha, J. et al. "Layout transformation support for the disk resident arrays framework." (5 2006).
  • Vydyanathan, N.; Krishnamoorthy, S.; Sabin, G.; Catalyurek, U. et al. "Locality conscious processor allocation and scheduling for mixed parallel applications." (1 2006).
  • Vydyanathan, N.; Krishnamoorthy, S.; Sabin, G.; Catalyurek, U. et al. "An integrated approach for processor allocation and scheduling of mixed-parallel applications." (1 2006).
  • Krishnan, S.; Krishnamoorthy, S.; Baumgartner, G.; Lam, C.C. et al. "Efficient synthesis of out-of-core algorithms using a nonlinear optimization solver." (5 2006).
  • Hartono, A.; Lu, Q.; Gao, X.; Krishnamoorthy, S. et al. "Identifying cost-effective common subexpressions to reduce operation count in tensor contraction evaluations." (1 2006).
  • Q. Lu, S. Krishnamoorthy and P. Sadayappan "Combining analytical and empirical approaches in tuning matrix transposition." in Proc. Intl. Conf. on Parallel Architectures and Compilation Techniques (PACT 2006). (9 2006).
  • Bondhugula, U.; Devulapalli, A.; Dinan, J.; Fernando, J. et al. "Hardware/software integration for FPGA-based All-Pairs Shortest-Paths." (1 2006).
  • U. Bondhugula, A. Devulapalli, J. Fernando, P. Wyckoff, and P. Sadayappan "Parallel FPGA based All-Pairs Shortest-Paths in a Directed Graph." in Proc. 20th IEEE International Parallel and Distributed Processing Symposium (IPDPS). (4 2006).
  • Krishnamoorthy, S.; Baumgartner, G.; Lam, C.C.; Nieplocha, J. et al. "Layout transformation support for the disk resident arrays framework." (5 2006).
  • Sabin, G.; Sadayappan, P. "Unfairness metrics for space-sharing parallel job schedulers." (6 2006).
  • Chaudhary, V.; Sadayappan, P. "Message from the CRTPC workshop co-chairs." (12 2006).
  • Bondhugula, U.; Devulapalli, A.; Dinan, J.; Fernando, J. et al. "Hardware/software integration for FPGA-based All-Pairs Shortest-Paths." in 14th Annual IEEE Symposium on Field-Programmable Custom Computing Machines. (1 2006).
  • Allam, A.; Ramanujam, J.; Baumgartner, G.; Sadayappan, P. "Memory minimization for tensor contractions using integer linear programming." (12 2006).
  • Bondhugula, U.; Devulapalli, A.; Dinan, J.; Fernando, J. et al. "Hardware/software integration for FPGA-based all-pairs shortest-paths." (12 2006).
  • Albert Hartono, Qingda Lu, Xiaoyang Gao, Sriram Krishnamoorthy, Marcel Nooijen, Gerald Baumgartner, Venkatesh Choppella, David Bernholdt, Russell Pitzer, J. Ramanujam, Atanas Rountev, and P. Sadayappan "Identifying Cost-Effective Common Subexpressions to Reduce Operation Count in Tensor Contraction Evaluations." in International Conference on Computational Science (ICCS'06). (5 2006).
  • Lu, Q.; Krishnamoorthy, S.; Sadayappan, P. "Combining analytical and empirical approaches in tuning matrix transposition." (12 2006).
  • Sriram Krishnamoorthy, Umit Catalyurek, Jarek Nieplocha, Atanas Rountev, and P. Sadayappan "Hypergraph Partitioning for Automated Memory Management." in International Conference for High Performance Computing, Networking, Storage and Analysis (SC'06). (11 2006).
  • Shet, A.G.; Sadayappan, P.; Bernholdt, D.E.; Nieplocha, J. et al. "A performance instrumentation framework to characterize computation-communication overlap in message-passing systems." in IEEE International Conference on Cluster Computing. (1 2006).

2005

  • Khanna, G.; Vydyanathan, N.; Catalyurek, U.; Kurc, T. et al. "Task scheduling and file replication for data-intensive jobs with batch-shared I/O." in 15th IEEE International Symposium on High Performance Distributed Computing. (1 2005).
  • Krishnamoorthy, S.; Nieplocha, J.; Sadayappan, P. "Data and computation abstractions for dynamic and irregular computations." in 12th International Conference on High Performance Computing (HiPC 2005). (1 2005).
  • Sabin, G.; Sadayappan, P. "Unfairness metrics for space-sharing parallel job schedulers." in 11th International Workshop Job Scheduling Strategies for Parallel Processing. (1 2005).
  • Gao, X.; Sahoo, S.K.; Lam, C.C.; Ramanujam, J. et al. "Performance modeling and optimization of parallel out-of-core tensor contractions." (12 2005).
  • Cociorva, D.; Baumgartner, G.; Lam, C.C.; Sadayappan, P. et al. "Memory-constrained communication minimization for a class of array computations." in 15th Workshop on Languages and Compilers for Parallel Computing. (1 2005).
  • X. Gao, S. Sahoo, Q. Lu, G. Baumgartner, C.-C. Lam, J. Ramanujam and P. Sadayappan "Performance Modeling and Optimization of Parallel Out-of-Core Tensor Contractions." in Proceedings of ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming (PPOPP). (6 2005).
  • Sabin, G.; Sahasrabudhe, V.; Sadayappan, P. "Assessment and enhancement of meta-schedulers for multi-site job sharing." in 14th IEEE International Symposium on High Performance Distributed Computing. (1 2005).
  • S. K. Sahoo, S. Krishnamoorthy, R. Panuganti and P. Sadayappan "Integrated Loop Optimizations for Data Locality Enhancement of Tensor Contraction Expressions." in Proc. Supercomputing (SC). (11 2005).
  • Lu, Q.D.; Gao, X.Y.; Krishnamoorthy, S.; Baumgartner, G. et al. "Empirical performance-model driven data layout optimization." in 17th International Workshop on Languages and Compilers for High Performance Computing. (1 2005).
  • S. Sahoo, R. Panuganti, S. Krishnamoorthy and P. Sadayappan "Cache Miss Characterization and Data Locality Optimization for Imperfectly Nested Loops on Shared Memory Multiprocessors." in Proceedings of International Parallel and Distributed Processing Symposium (IPDPS). (5 2005).
  • Khanna, G.; Vydyanathan, N.; Kurc, T.; Catalyurek, U. et al. "A hypergraph partitioning based approach for scheduling of tasks with batch-shared I/O." in 5th IEEE International Symposium on Cluster Computing and the Grid (CCGrid 2005). (1 2005).
  • Chaudhary, V.; Sadayappan, P. "Message from the CRTPC chairs." (12 2005).
  • Khanna, G.; Vydyanathan, N.; Catalyurek, U.; Kurc, T. et al. "Task scheduling and file replication for data-intensive jobs with batch-shared I/O." (1 2005).
  • Krishnamoorthy, S.; Nieplocha, J.; Sadayappan, P. "Data and computation abstractions for dynamic and irregular computations." (1 2005).
  • Hartono, A.; Sibiryakov, A.; Nooijen, M.; Baumgartner, G. et al. "Automated operation minimization of tensor contraction expressions in electronic structure calculations." (9 2005).
  • Sabin, G.; Sahasrabudhe, V.; Sadayappan, P. "Assessment and enhancement of meta-schedulers for multi-site job sharing." (1 2005).
  • Cociorva, D.; Baumgartner, G.; Lam, C.C.; Sadayappan, P. et al. "Memory-constrained communication minimization for a class of array computations." (12 2005).
  • Sabin, G.; Sahasrabudhe, V.; Sadayappan, P. "Assessment and enhancement of meta-schedulers for multi-site job sharing." (11 2005).
  • Hartono, A.; Sibiryakov, A.; Nooijen, M.; Baumgartner, G. et al. "Automated operation minimization of tensor contraction expressions in electronic structure calculations." (1 2005).
  • Krishnamoorthy, S.; Nieplocha, J.; Sadayappan, P. "Data and computation abstractions for dynamic and irregular computations." (12 2005).
  • Cociorva, D.; Baumgartner, G.; Lam, C.C.; Sadayappan, P. et al. "Memory-constrained communication minimization for a class of array computations." (1 2005).
  • Khanna, G.; Vydyanathan, N.; Kurc, T.; Catalyurek, U. et al. "A hypergraph partitioning based approach for scheduling of tasks with batch-shared I/O." (12 2005).
  • Sahoo, S.K.; Panuganti, R.; Krishnamoorthy, S.; Sadayappan, P. "Cache miss characterization and data locality optimization for imperfectly nested loops on shared memory multiprocessors." (12 2005).
  • Sabin, G.; Sadayappan, P. "Unfairness metrics for space-sharing parallel job schedulers." (1 2005).
  • Hartono, A.; Sibiryakov, A.; Nooijen, M.; Baumgartner, G. et al. "Automated operation minimization of tensor contraction expressions in electronic structure calculations." in 5th International Conference on Computational Science (ICCS 2005). (1 2005).
  • Lu, Q.D.; Gao, X.Y.; Krishnamoorthy, S.; Baumgartner, G. et al. "Empirical performance-model driven data layout optimization." (1 2005).
  • G. Sabin, V. Sahasrabudhe, and P. Sadayappan "Assessment and Enhancement of Meta-Schedulers for Multi-Site Job Sharing." in Proc. 15th IEEE Symp. High Perf. Distributed Computing (HPDC). (7 2005).
  • Lu, Q.; Gao, X.; Krishnamoorthy, S.; Baumgartner, G. et al. "Empirical performance-model driven data layout optimization." (10 2005).
  • Khanna, G.; Vydyanathan, N.; Kurc, T.; Catalyurek, U. et al. "A hypergraph partitioning based approach for scheduling of tasks with batch-shared I/O." (1 2005).
  • Gaurav Khanna; Nagavijayalakshmi Vydyanathan; Kurc, T.; Catalyurek, U.; Wyckoff, P.; Saltz, J.; Sadayappan, P. "A hypergraph partitioning based approach for scheduling of tasks with batch-shared I/O." (1 2005).
  • A. Hartono, A. Sibiryakov, M. Nooijen, S. Hirata, D. Bernholdt, G. Baumgartner, J. Ramanujam, R. Pitzer, C. Lam and P. Sadayappan "Automated Operation Minimization for Tensor Contraction Expressions in Electronic Structure Calculations." in Proceedings of International Conference on Computational Science (ICCS). (5 2005).
  • Sahoo, S.K.; Krishnamoorthy, S.; Panuganti, R.; Sadayappan, P. "Integrated loop optimizations for data locality enhancement of tensor contraction expressions." (12 2005).

2004

  • Islam, M.; Balaji, P.; Sadayappan, P.; Panda, D.K. et al. "Towards provision of quality of service guarantees in job scheduling." (1 2004).
  • Vydyanathan, N.; Khanna, G.; Kurc, T.; Catalyurek, U. et al. "Use of PVFS for efficient execution of jobs with pipeline-shared I/O." (12 2004).
  • Islam, M.; Balaji, P.; Sadayappan, P.; Panda, D.K. "Towards provision of Quality of Service guarantees in job scheduling." (12 2004).
  • Vydyanathan, N.; Gaurav Khanna; Kurc, T.; Catalyurek, U.; Wyckoff, P.; Saltz, J.; Sadayappan, P. "Use of PVFS for efficient execution of jobs with pipeline-shared I/O." (1 2004).
  • Sabin, G.; Kochhar, G.; Sadayappan, P. "Job fairness in non-preemptive job scheduling." (12 2004).
  • Lu, Q.; Wu, J.; Panda, D.; Sadayappan, P. "Applying MPI derived datatypes to the NAS benchmarks: A case study." (12 2004).
  • Sabin, G.; Sahasrabudhe, V.; Sadayappan, P. "On fairness in distributed job scheduling across multiple sites." (12 2004).
  • Berlin, K.; Huan, J.; Jacob, M.; Kochhar, G. et al. "Evaluating the impact of programming language features on the performance of parallel applications on cluster architectures." (12 2004).
  • Bibireata, A.; Krishnan, S.; Baumgartner, G.; Cociorva, D. et al. "Memory-constrained data locality optimization for tensor contractions." (12 2004).
  • Krishnamoorthy, S.; Baumgartner, G.; Lam, C.C.; Nieplocha, J. et al. "Efficient layout transformation for disk-based multidimensional arrays." (1 2004).
  • Vydyanathan, N.; Khanna, G.; Kurc, T.; Catalyurek, U. et al. "Use of PVFS for efficient execution of jobs with pipeline-shared I/O." (1 2004).
  • Islam, M.; Balaji, P.; Sadayappan, P.; Panda, D.K. "Towards provision of quality of service guarantees in job scheduling." in IEEE International Conference on Cluster Computing. (1 2004).
  • S. Krishnan, S. Krishnamoorthy, G. Baumgartner, C.-C. Lam, P. Sadayappan, J. Ramanujam and V. Choppella "Efficient Synthesis of Out-of-Core Algorithms Using a Nonlinear Optimization Solver." in Proceedings of International Parallel and Distributed Processing Symposium (IPDPS). (4 2004).
  • Bibireata, A.; Krishnan, S.; Baumgartner, G.; Cociorva, D. et al. "Memory-constrained data locality optimization for tensor contractions." in 16th International Workshop on Languages and Compilers for Parallel Computing (LCPC 2003). (1 2004).
  • Sabin, G.; Sahasrabudhe, V.; Sadayappan, P. "On fairness in distributed job scheduling across multiple sites." in IEEE International Conference on Cluster Computing. (1 2004).
  • Sabin, G.; Kochhar, G.; Sadayappan, P. "Job fairness in non-preemptive job scheduling." in 33rd International Conference on Parallel Processing. (1 2004).
  • Berlin, K.; Huan, J.; Jacob, M.; Kochhar, G. et al. "Evaluating the impact of programming language features on the performance of parallel applications on cluster architectures." in 16th International Workshop on Languages and Compilers for Parallel Computing (LCPC 2003). (1 2004).
  • Sabin, G.; Sahasrabudhe, V.; Sadayappan, P. "On fairness in distributed job scheduling across multiple sites." (1 2004).
  • Vydyanathan, N.; Khanna, G.; Kurc, T.; Catalyurek, U. et al. "Use of PVFS for efficient execution of jobs with pipeline-shared I/O." in 5th International Workshop on Grid Computing. (1 2004).
  • Sabin, G.; Kochhar, G.; Sadayappan, P. "Job fairness in non-preemptive job scheduling." (1 2004).
  • Bibireata, A.; Krishnan, S.; Baumgartner, G.; Cociorva, D. et al. "Memory-constrained data locality optimization for tensor contractions." (1 2004).
  • Krishnamoorthy, S.; Baumgartner, G.; Lam, C.C.; Nieplocha, J. et al. "Efficient layout transformation for disk-based multidimensional arrays." (12 2004).
  • Berlin, K.; Huan, J.; Jacob, M.; Kochhar, G. et al. "Evaluating the impact of programming language features on the performance of parallel applications on cluster architectures." (1 2004).
  • Krishnamoorthy, S.; Baumgartner, G.; Lam, C.C.; Nieplocha, J. et al. "Efficient layout transformation for disk-based multidimensional arrays." in 11th International Conference on High Performance Computing. (1 2004).
  • Sadayappan, P.; Auer, A.; Baumgartner, G.; Bernholdt, D.E. et al. "Performance optimization issues in automatic synthesis of high-performance codes for correlated electronic structure methods." (8 2004).
  • Sadayappan, P.; Auer, A.; Baumgartner, G.; Bernholdt, D.E. et al. "Performance optimization issues in automatic synthesis of high-performance codes for correlated electronic structure methods." in Meeting of the Division of Chemical Toxicology of the American-Chemical-Society held at the 228th National Meeting of the American-Chemical-Society. (8 2004).

2003

  • Sadayappan, P.; Auer, A.; Baumgartner, G.; Bernholdt, D.E. et al. "Automatic synthesis of high-performance parallel programs for electronic structure methods." in 226th National Meeting of the American-Chemical-Society. (9 2003).
  • Krishnan, S.; Krishnamoorthy, S.; Baumgartner, G.; Cociorva, D. et al. "Data locality optimization for synthesis of efficient out-of-core algorithms." (12 2003).
  • Baumgartner, G.; Cociorva, D.; Bibireata, A.; Gao, X.Y. et al. "Computer aided implementation of many-body methods: The tensor contraction engine." (9 2003).
  • Bernholdt, D.E.; Auer, A.; Baumgartner, G.; Bibireata, A. et al. "Synthesizing highly optimized code for correlated electronic structure calculations." (9 2003).
  • Krishnan, S.; Krishnamoorthy, S.; Baumgartner, G.; Cociorva, D. et al. "Data locality optimization for synthesis of efficient out-of-core algorithms." (1 2003).
  • Nooijen, M.; Baumgartner, G.; Bernholdt, D.E.; Cociorva, D. et al. "Automatic synthesis of advanced electronic structure programs." in 225th National Meeting of the American-Chemical-Society. (3 2003).
  • Baumgartner, G.; Cociorva, D.; Bibireata, A.; Gao, X.Y. et al. "Computer aided implementation of many-body methods: The tensor contraction engine." in 226th National Meeting of the American-Chemical-Society. (9 2003).
  • Krishnan, S.; Krishnamoorthy, S.; Baumgartner, G.; Cociorva, D. et al. "Data locality optimization for synthesis of efficient out-of-core algorithms." in 10th International Conference on High Performance Computing. (1 2003).
  • S. Krishnan, S. Krishnamoorthy, G. Baumgartner, D. Cociorva, C.-C. Lam, P. Sadayappan, J. Ramanujam, D. E. Bernholdt and V. Choppella "Data Locality Optimization for Synthesis of Efficient Out-of-Core Algorithms." in Proc. Tenth Intl. Conf. on High Performance Computing (HiPC). (12 2003).
  • Bernholdt, D.E.; Auer, A.; Baumgartner, G.; Bibireata, A. et al. "Synthesizing highly optimized code for correlated electronic structure calculations." in 226th National Meeting of the American-Chemical-Society. (9 2003).
  • M. Islam, P. Balaji, P. Sadayappan and D.K. Panda "QoPS: A QoS based scheme for Parallel Job Scheduling." in Proceedings of the Ninth Workshop on Job Scheduling Strategies for Parallel Processing (JSSPP). (7 2003).
  • Nooijen, M.; Baumgartner, G.; Bernholdt, D.E.; Cociorva, D. et al. "Automatic synthesis of advanced electronic structure programs." (3 2003).
  • Cociorva, D.; Gao, X.; Krishnan, S.; Baumgartner, G. et al. "Global communication optimization for tensor contraction expressions under memory constraints." (1 2003).
  • D. Cociorva, X. Gao, S. Krishnan, G. Baumgartner, C. Lam, P. Sadayappan and J. Ramanujam "Global Communication Optimization for Tensor Contraction Expressions under Memory Constraints." in Proceedings of International Parallel and Distributed Processing Symposium (IPDPS). (4 2003).
  • Sadayappan, P.; Auer, A.; Baumgartner, G.; Bernholdt, D.E. et al. "Automatic synthesis of high-performance parallel programs for electronic structure methods." (9 2003).

2002

  • Subramani, V.; Kettimuthu, R.; Srinivasan, S.; Johnston, J. et al. "Selective buddy allocation for scheduling parallel jobs on clusters." in IEEE International Conference on Cluster Computing. (1 2002).
  • Srinivasan, S.; Kettimuthu, R.; Subramani, V.; Sadayappan, P. "Characterization of backfilling strategies for parallel job scheduling." in 31st International Conference on Parallel Processing (ICPP 2002). (1 2002).
  • Subramani, V.; Kettimuthu, R.; Srinivasan, S.; Sadayappan, P. "Distributed job scheduling on computational grids using multiple simultaneous requests." in 11th IEEE International Symposium on High Performance Distributed Computing (HPDC-11). (1 2002).
  • Srinivasan, S.; Subramani, V.; Kettimuthu, R.; Holenarsipur, P. et al. "Effective selection of partition sizes for moldable scheduling of parallel jobs." (1 2002).
  • Cociorva, D.; Baumgartner, G.; Lam, C.C.; Sadayappan, P. et al. "Space-time trade-off optimization for a class of electronic structure calculations." (1 2002).
  • Gopalsamy, T.; Singhal, M.; Panda, D.; Sadayappan, P. "A reliable multicast algorithm for mobile Ad hoc networks." (1 2002).
  • Srinivasan, S.; Kettimuthu, R.; Subramani, V.; Sadayappan, P. "Selective reservation strategies for backfill job scheduling." (1 2002).
  • Srinivasan, S.; Kettimuthu, R.; Subramani, V.; Sadayappan, P. "Characterization of backfilling strategies for parallel job scheduling." (1 2002).
  • Subramani, V.; Kettimuthu, R.; Srinivasan, S.; Johnston, J. et al. "Selective buddy allocation for scheduling parallel jobs on clusters." (1 2002).
  • Cociorva, D.; Baumgartner, G.; Lam, C.C.; Sadayappan, P. et al. "Space-time trade-off optimization for a class of electronic structure calculations." in Conference on Programming Language Design and Implementation (PLDI 02). (5 2002).
  • Gopalsamy, T.; Singhal, M.; Panda, D.; Sadayappan, P. "A reliable multicast algorithm for mobile ad hoc networks." in 22nd International Conference on Distributed Computing Systems. (1 2002).
  • D. Cociorva, G. Baumgartner, C. Lam, P. Sadayappan, J. Ramanujam, M. Nooijen, D. Bernholdt, R. Harrison and R. Pitzer "A High-Level Approach to Synthesis of High-Performance Codes for Quantum Chemistry." in Proceedings of Supercomputing (SC). (11 2002).
  • D. Cociorva, G. Baumgartner, C. Lam, P. Sadayappan, J. Ramanujam, M. Nooijen, D. Bernholdt and R. Harrison "Space-Time Trade-Off Optimization for a Class of Electronic Structure Calculations." in Proceedings of ACM SIGPLAN 2002 Conference on Programming Language Design and Implementation (PLDI). (6 2002).
  • Srinivasan, S.; Subramani, V.; Kettimuthu, R.; Holenarsipur, P. et al. "Effective selection of partition sizes for moldable scheduling of parallel jobs." in 9th International Conference on High Performance Computing (HiPC 2002). (1 2002).
  • Srinivasan, S.; Kettimuthu, R.; Subramani, V.; Sadayappan, P. "Selective reservation strategies for backfill job scheduling." in 8th International Workshop on Job Scheduling Strategies for Parallel Processing. (1 2002).
  • Kettimuthu, R.; Subramani, V.; Srinivasan, S.; Gopalasamy, T. et al. "Selective preemption strategies for parallel job scheduling." (1 2002).
  • V. Subramani, R. Kettimuthu, S. Srinivasan and P. Sadayappan "Distributed Job Scheduling on Computational Grids using Multiple Simultaneous Requests." in Proc. 11th IEEE Symp. High Perf. Distributed Computing (HPDC 2002). (7 2002).
  • Subramani, V.; Kettimuthu, R.; Srinivasan, S.; Sadayappan, P. "Distributed job scheduling on computational Grids using multiple simultaneous requests." (1 2002).
  • Kettimuthu, R.; Subramani, V.; Srinivasan, S.; Gopalasamy, T. et al. "Selective preemption strategies for parallel job scheduling." in 31st International Conference on Parallel Processing (ICPP 2002). (1 2002).
  • Subramani, V.; Kettimuthu, R.; Srinivasan, S.; Sadayappan, P. et al. "Distributed job scheduling on computational grids using multiple simultaneous requests." (1 2002).
  • Cociorva, D.; Baumgartner, G.; Lam, C.C.; Sadayappan, P. et al. "Space-time trade-off optimization for a class of electronic structure calculations." (5 2002).
  • Gopalsamy, T.; Singhal, M.; Panda, D.; Sadayappan, P. "A reliable multicast algorithm for mobile ad hoc networks." (1 2002).
  • S. Srinivasan, V. Subramani, R. Kettimuthu, P. Holenarsipur and P. Sadayappan "Effective Selection of Partition Sizes for Moldable Scheduling of Parallel Jobs." in Proc. Ninth Intl. Conf. on High Performance Computing (HiPC). (12 2002).
  • S. Srinivasan, R. Kettimuthu, V. Subramani and P. Sadayappan "Selective Reservation Strategies for Backfill Job Scheduling." in Proc. of 8th Workshop on Job Scheduling Strategies for Parallel Processing (JSSPP). (7 2002).
  • Baumgartner, G.; Bernholdt, D.E.; Cociorva, D.; Harrison, R. et al. "A performance optimization framework for compilation of tensor contraction expressions into parallel." (1 2002).

2001

  • Gulati, A.; Panda, D.K.; Sadayappan, P.; Wyckoff, P. "NIC-based rate control for proportional bandwidth allocation in Myrinet clusters." (1 2001).
  • Amit Singhal, Mohammad Banikazemi, P. Sadayappan, and D. K. Panda "Efficient Multicast Algorithms for Heterogeneous Switch-based Irregular Networks of Workstations." in Proceedings of International Parallel and Distributed Processing Symposium (IPDPS). (4 2001).
  • Singhal, A.; Banikazemi, M.; Sadayappan, P.; Panda, D.K. "Efficient multicast algorithms for switch-based irregular heterogeneous networks of workstations." (1 2001).
  • Buntinas, D.; Panda, D.K.; Sadayappan, P. "Performance benefits of NIC-based barrier on myrinet/GM." (1 2001).
  • Banikazemi, M.; Liu, J.X.; Panda, D.K.; Sadayappan, P. "Implementing TreadMarks over virtual interface architecture on Myrinet and Gigabit Ethernet: Challenges, design experience, and performance evaluation." (1 2001).
  • Welsh, D.J.S.; Bedford, K.W.; Guo, Y.; Sadayappan, P. "A coupled wave, current, and sediment transport modeling system." (12 2001).
  • D. Cociorva, G. Baumgartner, D. Bernholdt, R. Harrison, M. Nooijen, J. Ramanujam, P. Sadayappan, and J. Wilkins "Towards Automatic Synthesis of High-Performance Codes for Electronic Structure Calculations: Data Locality Optimization." in Proceedings of Eighth Intl. Conf. on High Performance Computing (HiPC). (12 2001).
  • Holenarsipur, P.; Yarmolenko, V.; Duato, J.; Panda, D.K. et al. "Characterization and enhancement of static mapping heuristics for heterogeneous systems." in 7th International Conference on High Performance Computing (HiPC 2000). (1 2001).
  • D. Cociorva, C. Lam, G. Baumgartner, J. Ramanujam, P. Sadayappan, and J. Wilkins "Loop Optimization for a Class of Memory-Constrained Computations." in Proc. of ACM Intl. Conf. on Supercomputing (ICS). (6 2001).
  • Holenarsipur, P.; Yarmolenko, V.; Duato, J.; Panda, D.K. et al. "Characterization and enhancement of static mapping heuristics for heterogeneous systems." (1 2001).
  • Banikazemi, M.; Liu, J.X.; Panda, D.K.; Sadayappan, P. "Implementing TreadMarks over virtual interface architecture on Myrinet and Gigabit Ethernet: Challenges, design experience, and performance evaluation." in 30th International Conference on Parallel Processing (ICPP 01). (1 2001).
  • Gulati, A.; Panda, D.K.; Sadayappan, P.; Wyckoff, P. "NIC-based rate control for proportional bandwidth allocation in Myrinet clusters." in 30th International Conference on Parallel Processing (ICPP 01). (1 2001).
  • Cociorva, D.; Wilkins, J.; Baumgartner, G.; Sadayappan, P. et al. "Towards automatic synthesis of high-performance codes for electronic structure calculations: Data locality optimization." (1 2001).
  • Cociorva, D.; Wilkins, J.W.; Lam, C.; Baumgartner, G. et al. "Loop optimizations for a class of memory-constrained computations." (1 2001).
  • Banikazemi, M.; Liu, J.; Panda, D.K.; Sadayappan, P. "Implementing TreadMarks over Virtual Interface Architecture on Myrinet and gigabit Ethernet: Challenges, design experience, and performance evaluation." (1 2001).

2000

  • Lam, C.C.; Cociorva, D.; Baumgartner, G.; Sadayappan, P. "Optimization of memory usage requirement for a class of loops implementing multi-dimensional integrals." (1 2000).
  • Morgan, P.E.; Visbal, M.R.; Sadayappan, P. "Application of a parallel implicit Navier-Stokes solver to three dimensional viscous flows." (12 2000).
  • Paul, A.; Feng, W.C.; Panda, D.K.; Sadayappan, P. "Balancing web server load for adaptable video distribution." in International Conference on Parallel Processing (ICPP 2000). (1 2000).
  • Yarmolenko, V.; Duato, J.; Panda, D.K.; Sadayappan, P. "Characterization and enhancement of dynamic mapping heuristics for heterogeneous systems." in International Conference on Parallel Processing (ICPP 2000). (1 2000).
  • Yarmolenko, V.; Duato, J.; Panda, D.K.; Sadayappan, P. "Characterization and enhancement of dynamic mapping heuristics for heterogeneous systems." (1 2000).
  • Paul, A.; Feng, W.C.; Panda, D.K.; Sadayappan, P. "Balancing web server load for adaptable video distribution." (1 2000).
  • Holenarsipur, P.; Yarmolenko, V.; Duato, J.; Panda, D.K. et al. "Characterization and enhancement of static mapping heuristics for heterogeneous systems." (1 2000).

1999

  • Moorthy, V.; Jacunski, M.G.; Pillai, M.; Ware, P.P. et al. "Low-latency message passing on workstation clusters using SCRAMNet." (1 1999).
  • Moorthy, V.; Jacunski, M.G.; Pillai, M.; Ware, P.P. et al. "Low-latency message passing on workstation clusters using SCRAMNet." in 13th Parallel Processing Symposium / 10th Symposium on Parallel and Distributed Processing (IPPS/SPDP 1999). (1 1999).

1997

  • Lam, C.C.; Sadayappan, P.; Wenger, R. "Optimal reordering and mapping of a class of nested-loops for parallel execution." (1 1997).
  • White, J.B.; Sadayappan, P. "On improving the performance of sparse matrix-vector multiplication." (12 1997).
  • White, J.B.; Sadayappan, P. "On improving the performance of sparse matrix-vector multiplication." (1 1997).
  • White, J.B.; Sadayappan, P. "On improving the performance of sparse matrix-vector multiplication." in 4th International Conference on High-Performance Computing (HiPC 97). (1 1997).

1996

  • Kaushik, S.D.; Huang, C.H.; Sadayappan, P. "Compiling array statements for efficient execution on distributed-memory machines: Two-level mappings." (1 1996).

1995

  • Krothapalli, V.P.; Sadayappan, P. "On reducing synchronization costs in nested DOACROSS loops." (1 1995).
  • Eswar, K.; Huang, C.H.; Sadayappan, P. "On mapping data and computation for parallel sparse Cholesky factorization." (1 1995).
  • Kaushik, S.D.; Huang, C.H.; Sadayappan, P. "Incremental generation of index sets for array statement execution on distributed-memory machines." (1 1995).
  • Kaushik, S.D.; Huang, C.H.; Ramanujam, J.; Sadayappan, P. "Multi-phase array redistribution: modeling and evaluation." (1 1995).

1994

  • KUMAR, B.; ESWAR, K.; SADAYAPPAN, P.; HUANG, C.H. et al. "A REORDERING AND MAPPING ALGORITHM FOR PARALLEL SPARSE CHOLESKY FACTORIZATION." (1 1994).
  • Gupta, H.; Sadayappan, P. "Communication efficient matrix multiplication on hypercubes." (8 1994).
  • Kumar, B.; Sadayappan, P.; Huang, C.H. "On sparse matrix reordering for parallel factorization." (7 1994).
  • CHOIGROGAN, Y.S.; LEE, R.; ESWAR, K.; SADAYAPPAN, P. "The performance of a partitioning finite element method on the touchstone delta." (1 1994).
  • KROTHAPALLI, V.P.; SADAYAPPAN, P. "ON REDUCING SYNCHRONIZATION COSTS IN NESTED DOACROSS LOOPS." (1 1994).
  • Gupta, S.K.S.; Huang, C.H.; Johnson, R.W.; Sadayappan, P. "Communication-efficient implementation of block recursive algorithms on distributed-memory machines." (12 1994).
  • Kumar, B.; Eswar, K.; Sadayappan, P.; Huang, C.H. "Reordering and mapping algorithm for parallel sparse Cholesky factorization." (12 1994).
  • ESWAR, K.; HUANG, C.H.; SADAYAPPAN, P. "MEMORY-ADAPTIVE PARALLEL SPARSE CHOLESKY FACTORIZATION." (1 1994).
  • Amin, A.; Sadayappan, P.; Gudavalli, M. "Clustered reduced communication element by element preconditioned conjugate gradient algorithm for finite element computations." (1 1994).
  • KROTHAPALLI, V.P.; SADAYAPPAN, P. "ON REDUCING SYNCHRONIZATION COSTS IN NESTED DOACROSS LOOPS." in 1994 Annual International Conference of the IEEE Region-10 - Frontiers of Computer Technology. (1 1994).
  • DAI, D.L.; GUPTA, S.K.S.; KAUSHIK, S.D.; LU, J.H. et al. "EXTENT - A PORTABLE PROGRAMMING ENVIRONMENT FOR DESIGNING AND IMPLEMENTING HIGH-PERFORMANCE BLOCK RECURSIVE ALGORITHMS." (1 1994).
  • Kaushik, S.D.; Huang, C.H.; Johnson, R.W.; Sadayappan, P. "An approach to communication-efficient data redistribution." (7 1994).
  • Eswar, K.; Huang, C.H.; Sadayappan, P. "Memory-adaptive parallel sparse Cholesky factorization." (12 1994).
  • DAI, D.L.; GUPTA, S.K.S.; KAUSHIK, S.D.; LU, J.H. et al. "EXTENT - A PORTABLE PROGRAMMING ENVIRONMENT FOR DESIGNING AND IMPLEMENTING HIGH-PERFORMANCE BLOCK RECURSIVE ALGORITHMS." in Supercomputing 94. (1 1994).
  • ESWAR, K.; HUANG, C.H.; SADAYAPPAN, P. "MEMORY-ADAPTIVE PARALLEL SPARSE CHOLESKY FACTORIZATION." in 1994 Scalable High Performance Computing Conference (SHPCC 94). (1 1994).
  • KUMAR, B.; ESWAR, K.; SADAYAPPAN, P.; HUANG, C.H. "A REORDERING AND MAPPING ALGORITHM FOR PARALLEL SPARSE CHOLESKY FACTORIZATION." in 1994 Scalable High Performance Computing Conference (SHPCC 94). (1 1994).
  • Dai, D.L.; Gupta, S.K.S.; Kaushik, S.D.; Lu, J.H. et al. "EXTENT: a portable programming environment for designing and implementing high-performance block recursive algorithms." (12 1994).
  • CHOIGROGAN, Y.S.; LEE, R.; ESWAR, K.; SADAYAPPAN, P. "The performance of a partitioning finite element method on the touchstone delta." in 10th Annual Review of Progress in Applied Computational Electromagnetics Conference. (1 1994).

1993

  • Eswar, K.; Sadayappan, P.; Huang, C.H.; Visvanathan, V. "Supernodal sparse cholesky factorization on distributed-memory multiprocessors." (1 1993).
  • Nandy, S.K.; Narayan, R.; Visvanathan, V.; Sadayappan, P. et al. "A parallel progressive refinement image rendering algorithm on a scalable multithreaded VLSI processor array." (1 1993).
  • NANDY, S.K.; NARAYAN, R.; VISVANATHAN, V.; SADAYAPPAN, P. et al. "A PARALLEL PROGRESSIVE REFINEMENT IMAGE RENDERING ALGORITHM ON A SCALABLE MULTITHREADED VLSI PROCESSOR ARRAY." (1 1993).
  • GUPTA, S.K.S.; KAUSHIK, S.D.; HUANG, C.H.; JOHNSON, J.R. et al. "ON THE AUTOMATIC-GENERATION OF DATA DISTRIBUTIONS." (1 1993).
  • ESWAR, K.; SADAYAPPAN, P.; HUANG, C.H.; VISVANATHAN, V. "SUPERNODAL SPARSE CHOLESKY FACTORIZATION ON DISTRIBUTED-MEMORY MULTIPROCESSORS." (1 1993).
  • KAUSHIK, S.D.; HUANG, C.H.; JOHNSON, J.R.; JOHNSON, R.W. et al. "EFFICIENT TRANSPOSITION ALGORITHMS FOR LARGE MATRICES." (1 1993).
  • Gupta, S.; Huang, C.H.; Sadayappan, P.; Johnson, R. "On the synthesis of parallel programs from tensor product formulas for block recursive algorithms." (1 1993).
  • GUPTA, S.K.S.; KAUSHIK, S.D.; MUFTI, S.; SHARMA, S. et al. "ON COMPILING ARRAY EXPRESSIONS FOR EFFICIENT EXECUTION ON DISTRIBUTED-MEMORY MACHINES." (1 1993).
  • ESWAR, K.; SADAYAPPAN, P.; HUANG, C.H. "COMPILE-TIME CHARACTERIZATION OF RECURRENT PATTERNS IN IRREGULAR COMPUTATIONS." (1 1993).
  • GHOSH, D.; NANDY, S.K.; SADAYAPPAN, P.; PARTHASARATHY, K. et al. "ARCHITECTURAL SYNTHESIS OF PERFORMANCE-DRIVEN MULTIPLIERS WITH ACCUMULATOR INTERLEAVING." (1 1993).
  • Kaushik, S.D.; Huang, C.H.; Johnson, J.R.; Johnson, R.W. et al. "Efficient transposition algorithms for large matrices." (12 1993).
  • KAUSHIK, S.D.; HUANG, C.H.; JOHNSON, J.R.; JOHNSON, R.W. et al. "EFFICIENT TRANSPOSITION ALGORITHMS FOR LARGE MATRICES." in Supercomputing 93 Conference. (1 1993).
  • Ghosh, D.; Nandy, S.K.; Sadayappan, P.; Parthasarathy, K. "Architectural synthesis of performance-driven multipliers with accumulator interleaving." (1 1993).
  • SHARMA, S.; HUANG, C.H.; SADAYAPPAN, P. "ON DATA DEPENDENCE ANALYSIS FOR COMPILING PROGRAMS ON DISTRIBUTED-MEMORY MACHINES." (1 1993).
  • Gupta, S.K.S.; Kaushik, S.D.; Huang, C.H.; Johnson, J.R. et al. "On the Automatic Generation of Data Distributions." (1 1993).
  • GUPTA, S.K.S.; KAUSHIK, S.D.; HUANG, C.H.; JOHNSON, J.R. et al. "ON THE AUTOMATIC-GENERATION OF DATA DISTRIBUTIONS." in 2ND WORKSHOP ON LANGUAGES, COMPILERS, AND RUN-TIME ENVIRONMENTS FOR DISTRIBUTED MEMORY MULTIPROCESSORS. (1 1993).
  • ESWAR, K.; SADAYAPPAN, P.; HUANG, C.H. "COMPILE-TIME CHARACTERIZATION OF RECURRENT PATTERNS IN IRREGULAR COMPUTATIONS." in 1993 International Conference on Parallel Processing. (1 1993).
  • GUPTA, S.K.S.; KAUSHIK, S.D.; MUFTI, S.; SHARMA, S. et al. "ON COMPILING ARRAY EXPRESSIONS FOR EFFICIENT EXECUTION ON DISTRIBUTED-MEMORY MACHINES." in 1993 International Conference on Parallel Processing. (1 1993).
  • SHARMA, S.; HUANG, C.H.; SADAYAPPAN, P. "ON DATA DEPENDENCE ANALYSIS FOR COMPILING PROGRAMS ON DISTRIBUTED-MEMORY MACHINES." in 2ND WORKSHOP ON LANGUAGES, COMPILERS, AND RUN-TIME ENVIRONMENTS FOR DISTRIBUTED MEMORY MULTIPROCESSORS. (1 1993).
  • NANDY, S.K.; NARAYAN, R.; VISVANATHAN, V.; SADAYAPPAN, P. et al. "A PARALLEL PROGRESSIVE REFINEMENT IMAGE RENDERING ALGORITHM ON A SCALABLE MULTITHREADED VLSI PROCESSOR ARRAY." in 1993 International Conference on Parallel Processing. (1 1993).
  • GHOSH, D.; NANDY, S.K.; SADAYAPPAN, P.; PARTHASARATHY, K. "ARCHITECTURAL SYNTHESIS OF PERFORMANCE-DRIVEN MULTIPLIERS WITH ACCUMULATOR INTERLEAVING." in 30TH DESIGN AUTOMATION CONF. (1 1993).
  • ESWAR, K.; SADAYAPPAN, P.; HUANG, C.H.; VISVANATHAN, V. "SUPERNODAL SPARSE CHOLESKY FACTORIZATION ON DISTRIBUTED-MEMORY MULTIPROCESSORS." in 1993 International Conference on Parallel Processing. (1 1993).

1992

  • McMillan, S.; Sadayappan, P.; Orin, D.E. "Efficient dynamic simulation of multiple manipulator systems with singularities." (4 1992).
  • MCMILLAN, S.; SADAYAPPAN, P.; ORIN, D.E. "EFFICIENT DYNAMIC SIMULATION OF MULTIPLE MANIPULATOR SYSTEMS WITH SINGULARITIES." in 1992 IEEE INTERNATIONAL CONF ON ROBOTICS AND AUTOMATION. (1 1992).
  • MCMILLAN, S.; SADAYAPPAN, P.; ORIN, D.E. "EFFICIENT DYNAMIC SIMULATION OF MULTIPLE MANIPULATOR SYSTEMS WITH SINGULARITIES." (1 1992).
  • Amin, A.; Sadayappan, P.; Chaudhary, A. "Parallel ALPID-3D: A 3-D metal forming program for parallel computers." (6 1992).
  • Amin, A.; Sadayappan, P.; Chaudhary, A. "Parallel ALPID-3D: A 3-D metal forming program for parallel computers." (12 1992).

1991

  • MCMILLAN, S.; ORIN, D.E.; SADAYAPPAN, P. "REAL-TIME ROBOT DYNAMIC SIMULATION ON A VECTOR PARALLEL SUPERCOMPUTER." (1 1991).
  • MCMILLAN, S.; ORIN, D.E.; SADAYAPPAN, P. "REAL-TIME ROBOT DYNAMIC SIMULATION ON A VECTOR PARALLEL SUPERCOMPUTER." in 1991 INTERNATIONAL CONF ON ROBOTICS AND AUTOMATION. (1 1991).
  • McMillan, S.; Orin, D.E.; Sadayappan, P. "Real-time robot dynamic simulation on a vector/parallel supercomputer." (1 1991).
  • KROTHAPALLI, V.P.; SADAYAPPAN, P. "REMOVAL OF REDUNDANT DEPENDENCES IN DOACROSS LOOPS WITH CONSTANT DEPENDENCES." in SYMP ON PRINCIPLES AND PRACTICES PARALLEL PROGRAMMING. (7 1991).
  • Ramanujam, J.; Sadayappan, P. "Tiling multidimensional iteration spaces for nonshared memory machines." (12 1991).
  • WHITMAN, S.; SADAYAPPAN, P. "COMPUTER-GRAPHICS RENDERING ON A SHARED MEMORY MULTIPROCESSOR." (1 1991).
  • ESWAR, K.; SADAYAPPAN, P.; VISVANATHAN, V. "MULTIFRONTAL FACTORIZATION OF SPARSE MATRICES ON SHARED-MEMORY MULTIPROCESSORS." (1 1991).
  • Krothapalli, V.P.; Sadayappan, P. "Removal of Redundant Dependences in DOACROSS Loops with Constant Dependences." (1 1991).
  • ESWAR, K.; SADAYAPPAN, P.; VISVANATHAN, V. "MULTIFRONTAL FACTORIZATION OF SPARSE MATRICES ON SHARED-MEMORY MULTIPROCESSORS." in INTERNATIONAL CONF ON PARALLEL PROCESSING. (1 1991).
  • RAMANUJAM, J.; SADAYAPPAN, P. "TILING MULTIDIMENSIONAL ITERATION SPACES FOR NONSHARED MEMORY MACHINES." in 4TH ANNUAL CONF ON HIGH PERFORMANCE COMPUTING ( SUPERCOMPUTING 91 ). (1 1991).
  • McMillan, S.; Orin, D.E.; Sadayappan, P. "Parallel real-time dynamic simulation of robots." (1 1991).
  • RAMANUJAM, J.; SADAYAPPAN, P. "TILING MULTIDIMENSIONAL ITERATION SPACES FOR NONSHARED MEMORY MACHINES." (1 1991).
  • WHITMAN, S.; SADAYAPPAN, P. "COMPUTER-GRAPHICS RENDERING ON A SHARED MEMORY MULTIPROCESSOR." in INTERNATIONAL CONF ON PARALLEL PROCESSING. (1 1991).
  • KROTHAPALLI, V.P.; SADAYAPPAN, P. "REMOVAL OF REDUNDANT DEPENDENCES IN DOACROSS LOOPS WITH CONSTANT DEPENDENCES." (7 1991).

1990

  • Krothapalli, V.P.; Sadayappan, P. "Dynamic scheduling of DOACROSS loops for multiprocessors." (1 1990).
  • Krothapalli, V.P.; Sadayappan, P. "Exploiting parallelism through run-time analysis on a vector processor." (1 1990).

1989

  • SADAYAPPAN, P.; RAO, S.K. "COMMUNICATION REDUCTION FOR DISTRIBUTED SPARSE-MATRIX FACTORIZATION ON A PROCESSOR MESH." (1 1989).
  • Ramanujam, J.; Sadayappan, P. "Methodology for parallelizing programs for multicomputers and complex memory multiprocessors." (12 1989).
  • Zaky, A.; Sadayappan, P. "Optimal static scheduling of sequential loops on multiprocessors." (12 1989).
  • RAMANUJAM, J.; SADAYAPPAN, P. "A METHODOLOGY FOR PARALLELIZING PROGRAMS FOR MULTICOMPUTERS AND COMPLEX MEMORY MULTIPROCESSORS." in CONF ON SUPERCOMPUTING ( SUPERCOMPUTING 89 ). (1 1989).
  • Sadayappan, P.; Rao, S.K. "Communication reduction for distributed sparse matrix factorization on a processor mesh." (12 1989).
  • ZAKY, A.; SADAYAPPAN, P. "OPTIMAL STATIC SCHEDULING OF SEQUENTIAL LOOPS ON MULTIPROCESSORS." in 1989 INTERNATIONAL CONF ON PARALLEL PROCESSING. (1 1989).
  • ZAKY, A.; SADAYAPPAN, P. "OPTIMAL STATIC SCHEDULING OF SEQUENTIAL LOOPS ON MULTIPROCESSORS." (1 1989).
  • SADAYAPPAN, P.; RAO, S.K. "COMMUNICATION REDUCTION FOR DISTRIBUTED SPARSE-MATRIX FACTORIZATION ON A PROCESSOR MESH." in CONF ON SUPERCOMPUTING ( SUPERCOMPUTING 89 ). (1 1989).
  • RAMANUJAM, J.; SADAYAPPAN, P. "A METHODOLOGY FOR PARALLELIZING PROGRAMS FOR MULTICOMPUTERS AND COMPLEX MEMORY MULTIPROCESSORS." (1 1989).

1988

  • Sadayappan, P.; Visvanathan, V. "Modeling and optimal scheduling of parallel sparse Gaussian elimination." (12 1988).
  • Ramanujam, J.; Sadayappan, P. "Optimization by neural networks." (12 1988).
  • Krothapalli, V.P.; Sadayappan, P. "An approach to synchronization for parallel computing." (6 1988).
  • Sadayappan, P.; Visvanathan, V. "Parallelization and performance evaluation of circuit simulation on a shared-memory multiprocessor." (6 1988).
  • Sadayappan, P.; Visvanathan, V. "Comparative analysis of approaches to hardware acceleration for sparse-matrix factorization." (12 1988).
  • Goel, A.; Ramanujam, J.; Sadayappan, P. "Towards a 'neural' architecture for abductive reasoning." (12 1988).
  • Ercal, F.; Ramanujam, J.; Sadayappan, P. "Task allocation onto a hypercube by recursive mincut bipartitioning." (1 1988).
  • Goel, A.; Sadayappan, P.; Josephson, J.R. "Concurrent synthesis of composite explanatory hypotheses." (12 1988).
  • Ling, Y.L.C.; Sadayappan, P.; Olson, K.W.; Orin, D.E. "VLSI ROBOTICS VECTOR PROCESSOR FOR REAL-TIME CONTROL." (1 1988).

1987

  • Ling, Y.L.C.; Olson, K.W.; Orin, D.E.; Sadayappan, P. "LAYERED RESTRUCTURABLE VLSI ARCHITECTURE FOR ROBOTIC CONTROL." (12 1987).
  • Sadayappan, P.; Visvanathan, V. "CIRCUIT SIMULATION ON A MULTIPROCESSOR." (1 1987).
  • Sadayappan, P.; Ercal, F.; Martin, S. "MAPPING FINITE ELEMENT GRAPHS ONTO PROCESSOR MESHES." (12 1987).

1986

  • Ercal, F.; Sadayappan, P.; Schwan, K.; Weide, B. et al. "PARALLEL COMPUTERS FOR FINITE ELEMENT ANALYSIS." (1 1986).

1985

  • Ashok, V.; Costello, V.A.; Sadayappan, P. "MODELING SWITCH-LEVEL SIMULATION USING DATA FLOW." (12 1985).
  • Ashok, V.; Costello, R.; Sadayappan, P. "DISTRIBUTED DISCRETE EVENT SIMULATION USING DATAFLOW." (12 1985).

1984

  • Sadayappan, P.; Smith, D.R. "TASK DISTRIBUTION ON A HIERARCHICAL MULTICOMPUTER." (12 1984).

Unknown

  • Martin Kong, Louis-Noel Pouchet, and P. Sadayappan "A Roofline-based Performance Estimator for Distributed Matrix-multiply on Intel CnC." in International Workshop on Automatic Performance Tuning (IWAPT’2015), in conjunction with IPDPS 2015.
  • Tobias Grosser, Sebastian Pop, Louis-Noel Pouchet, J. Ramanujam, and P. Sadayappan "Optimistic Delinearization of Parametrically Sized Arrays." in ACM SIGARCH 29th International Conference on Supercomputing (ICS’15).