Timings for the T3E: NAVO's MSRC & SDSC

Early Experiences with the Cray T3E at the SDSC and the NAVOCEANO MSRC

Joel E. Tohline and John Cazes

Department of Physics & Astronomy
Louisiana State University

[e-mail: tohline@rouge.phys.lsu.edu or cazes@rouge.phys.lsu.edu]

In May and June of 1997, in response to a request from Jay Boisseau at the SDSC (boisseau@SDSC.EDU), we produced a four-part HTML document describing our early experiences porting a computational fluid dynamics code to the Cray T3E at the San Diego Supercomputer Center (SDSC). A brief summary of these experiences -- pinpointing our early successes as well as areas in which there was still noticeable room for improvement -- can be found in Part IV of that report.

In August, 1997, through the NAVOCEANO DoD Programming Environment and Training (PET) program, we were invited to use the high-performance-computing and visualization facilities at the NAVO MSRC (Stennis, MS). Luke Lonergan (luke@navo.hpc.mil), in particular, encouraged us to follow up on our first report and to document the extent to which we have been able to improve our code's performance on the T3E using various performance tools and optimization tactics. This document is our first formal report of such activities at the NAVO MSRC; it is also intended to update the SDSC on our efforts to improve our code's overall execution performance on the T3E.

Because the NAVO MSRC did not acquire an HPF compiler for its T3E until late October, 1997, the results reported here represent work performed only over the past two months. In quantifying improvements in our code's performance, we have used "pat" (a performance analysis tool that accesses the hardware performance monitors on the T3E), which became available at the SDSC only in September, 1997. We also have benefited significantly from a recent workshop at the NAVO MSRC conducted by John Levesque (levesque@apri.com), entitled "Optimization of Fortran for Vector and RISC Based Architectures".

Because this report is a direct extension of our first one, we include here a table of contents that contains links to each section (Parts I - IV) of our first report. Our new results, and the thrust of this second report, begin with Part V, below.

Part I:	Background	[May 8, 1997]
Part II:	Scalability Using the Portland Group HPF Compiler (The Good News)	[May 8, 1997]
Part III:	Performance of DEC Alpha processors (The Bad News)	[June 6, 1997]
Part IV:	Summary Points (from Parts I - III)	[June 22, 1997]
Part V:	Straightforward F90 Compiler Optimizations	[January 18, 1998]
	Impact of Power-of-2 Arrays. The Benefit of STREAMS.

This work has been supported in part through an NSF-funded project entitled "Star Formation in Galaxies." It also has been supported through significant allocations of supercomputing time on hardware facilities at the SDSC and at the NAVOCEANO MSRC.

Preface