Barton Miller
Computer Sciences Department
University of Wisconsin
http://www.cs.wisc.edu/~bart/


A Path to Performance Diagnosis on 1000+ Nodes

Current performance profiling tools are limited in the size of the system on which they can measure application programs. These limitations stem from (1) the amount of data that must be transferred between the components of the tool and (2) the centralized control and monitoring that is done at the tool front-end process.
The Paradyn project is developing four new techniques that, when used together, will allow Paradyn to profile applications effectively on systems with 1000+ nodes. The four major components of this effort are:

  • A software multicast/reduction network layer (MCRNL) that allows for efficient distribution of control operations and gathering of results from the nodes. MCRNL incorporates innovative reduction operations and will support a fault-tolerant recovery facility.


  • A scalable start-up protocol that reduces the amount of front-end-to-daemon communication needed. In systems with thousands of nodes, the initial interaction with the corresponding thousands of daemons can be overwhelming.


  • A distributed Performance Consultant that uses MCRNL to efficiently evaluate global (all-node) bottlenecks and distributes evaluation of local (node-specific) bottlenecks to local Performance Consultant agents. As part of this effort, we have developed a new, more detailed model of instrumentation overhead and feedback.


  • Sub-Graph Folding, a scalable visualization technique for displaying the results of the Performance Consultant's bottleneck search. This technique exploits the regular structure of SPMD programs to combine results into equivalence classes and present only an exemplar of each class.
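
The folding idea in the last item can be illustrated with a minimal sketch. It is not Paradyn's implementation: it assumes, purely for illustration, that each node's search result is a small tree of `(label, is_bottleneck, children)` tuples, and the names `signature` and `fold` are hypothetical. Nodes whose result trees are structurally identical fall into one equivalence class, and only the first member is kept as the exemplar.

```python
from collections import defaultdict


def signature(tree):
    """Return a canonical, hashable form of a result tree.

    A tree is assumed to be (label, is_bottleneck, children), where
    children is a list of trees. Children are sorted so that sibling
    order does not affect equality.
    """
    label, is_bottleneck, children = tree
    return (label, is_bottleneck, tuple(sorted(signature(c) for c in children)))


def fold(results):
    """Group node ids whose result trees are structurally identical.

    results maps node id -> result tree. Returns a list of
    (exemplar_node_id, member_node_ids), one entry per equivalence class.
    """
    classes = defaultdict(list)
    for node_id in sorted(results):
        classes[signature(results[node_id])].append(node_id)
    return [(members[0], members) for members in classes.values()]


# Hypothetical results from a 4-node SPMD run: node 0 waits on messages,
# the other three nodes are CPU bound in the same function.
cpu = ("CPUBound", True, [("main", True, [])])
sync = ("SyncWait", True, [("MPI_Recv", True, [])])
folded = fold({0: sync, 1: cpu, 2: cpu, 3: cpu})
print(folded)
```

In this toy run the four per-node results collapse into two classes, so a display would need to draw two sub-graphs rather than four. With the regular structure typical of SPMD programs, the number of classes may stay small even as the node count grows.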


I will describe the design based on these features an present some initial results as to their effectiveness.
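
To make the motivation for a reduction network concrete, the sketch below simulates a tree-structured reduction in plain Python. It is an illustration under stated assumptions, not the MCRNL interface: the function name `tree_reduce`, the `fanout` parameter, and the choice of 1024 nodes with a fan-out of 32 are all hypothetical. Each internal process combines the values of at most `fanout` children before forwarding a single value upward, so no process, including the front end, has to receive one message per node.

```python
def tree_reduce(values, fanout, combine):
    """Reduce leaf values level by level through a tree of the given fan-out.

    values  : non-empty sequence of per-node values (the leaves)
    fanout  : maximum number of children per internal process (>= 2)
    combine : function that reduces a list of values to one value
    Returns (result, levels), where levels is the number of reduction
    stages between the leaves and the front end.
    """
    level = list(values)
    if not level:
        raise ValueError("need at least one value")
    if fanout < 2:
        raise ValueError("fanout must be at least 2")
    levels = 0
    while len(level) > 1:
        # Each chunk stands for one internal process reducing its children.
        level = [combine(level[i:i + fanout]) for i in range(0, len(level), fanout)]
        levels += 1
    return level[0], levels


# Hypothetical example: one sample per node on a 1024-node system.
samples = list(range(1024))
total, levels = tree_reduce(samples, fanout=32, combine=sum)
peak, _ = tree_reduce(samples, fanout=32, combine=max)
print(total, levels, peak)
```

With these assumed numbers the reduction completes in two stages, and every process handles at most 32 incoming values, compared with 1024 for a front end that collects from each node directly.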
   
   
 
  The San Diego Supercomputer Center (SDSC) is a research unit of the University of California, San Diego, and the leading-edge site of the National Partnership for Advanced Computational Infrastructure. SDSC researchers conduct studies in computational science, develop high-performance computing and networking technologies, and participate in NPACI activities.

SDSC -- UC San Diego, MC 0505 -- 9500 Gilman Drive -- La Jolla, CA 92093-0505 -- 858-534-5000 -- 858-534-5152 (fax)
info@sdsc.edu © 2001, The Regents of the University of California