A REQUEST FOR NSF PACI RESOURCES
Joel E. Tohline, Juhan Frank, & Patrick Motl
Department of Physics & Astronomy
Louisiana State University
tohline@physics.lsu.edu
This entire document, along with several illustrative animation sequences is available online at the address http://www.phys.lsu.edu/faculty/tohline/NRAC_2000/ 
1. Summary of Proposed Research, Including Objectives and Goals
Many classes of interesting astronomical phenomena are known for which the proposed astrophysical scenarios involve dynamical or thermal masstransfer in a close binary star system, sometimes leading to a common envelope phase. These models are invoked to explain either their origin, or a particular phase of their evolution, or their current state, or even their ultimate fate. We shall focus on relatively rapid phases of masstransfer, investigate the conditions required for stable/unstable masstransfer, and follow numerically their evolution in unstable cases. In many cases, the presentday systems are known to be undergoing longterm, stable masstransfer driven by angular momentum losses or nuclear evolution. Although such longterm, secular phases of evolution may be easier to study observationally, they are too slow to be amenable to present numerical simulations and are therefore beyond the scope of the present proposal. Wellknown examples of the above are cataclysmic variables (CVs) [Warner 1995], lowmass Xray binaries (LMXBs) [White et al. 1995; van Paradijs & McClintock 1995; Verbunt & van den Heuvel 1995], millisecond pulsars (Bhattacharya 1995), soft Xray transients (SXTs) or Xray novae [Chen, Shrader & Livio 1997], Algols (Batten 1989) and W Serpentis stars (Wilson 1989), contact binaries (W Ursae Majoris stars) [Rucinski 1989], some central stars of planetary nebulae (Iben & Livio 1993), double degenerate white dwarf (WD) binaries, perhaps leading to supernovae (SNe) of type Ia through a merger (Iben & Tutukov 1984), or subdwarf sdO, sdB stars (Iben 1990), and double neutron star (NS) or NSblack hole binaries, perhaps yielding gammaray bursts in a fireball when the system coalesces (Mészáros & Rees 1992; Ruffert et al. 1997; Kluzniak & Lee 1997).
The dynamical masstransfer that ensues in unstable binaries rapidly reaches nonlinear amplitudes and may lead to the formation of a massive disklike object, maybe a common envelope phase (Iben & Livio 1993), some mass loss from the system and, in some cases, end in a merger. These phenomena are truly three dimensional in character and largescale numerical simulations are required to further our theoretical understanding of them. Accurate simulations of these rapid phases of binary interaction and masstransfer are crucial to our understanding of wide classes of objects that are under current observational investigation.
We plan to study the masstransfer stability of close binaries in which the component stars obey a polytropic equation of state (EOS) but, in general, possess nonuniform entropy. As we argue below, these simple structural approximations enable us to treat, with some limitations, a very diverse set of objects including main sequence (MS) stars, WDs and NSs. Even in the case of a double NS binary, our simulations will adopt a Newtonian treatment of the potential but with some ad hoc relativistic corrections.
We distinguish two types of dynamical instability which can occur in close binaries:
The classical tidal instability, first studied by Darwin (1887), deals with the secular evolution away from synchronous rotation due to dissipation. Recent treatments of this secular instability include: an analysis of tidal stability of coplanar binaries (spin axes perpendicular to orbital plane) by Counselman (1973), who shows that in unstable systems evolution away from synchronism is accompanied by a secular decrease or increase in the orbital separation; and a more general treatment by Hut (1980) allowing for arbitrary orientations of spin and orbital angular momenta.
In a series of papers Lai, Rasio & Shapiro (1993a; 1993b; 1994a; 1994b) studied the equilibrium and stability of binaries with polytropic EOS, making use of analytic techniques and a variational energy method. They were able to extend their work to unequal masses and allow for internal motions. They showed that in addition to the classical secular instability, a new dynamical instability exists not far from the point of secular instability, which leads to rapid binary inspiral under certain conditions. They showed that the instability occurred before contact was reached for stiff EOS (see also New & Tohline 1997). More recently, Uryu & Eriguchi (1999) have pointed out that for compact binaries the system may reach the masstransfer limit (contact) before the tidal instability limit. This drastically alters the proposed evolutionary scenarios based on the results of Lai, Rasio & Shapiro, and highlights the need to examine the fully threedimensional evolution of compact binaries with mass ratios different from one.
In binaries with unequal components, in general one of the stars will reach contact with its Roche lobe first (the donor) and masstransfer to its companion star (the accretor) will commence either because the donor star is evolving or because the binary separation is reduced by systemic angular momentum losses. What happens next, i.e. the stability of binaries to masstransfer, depends crucially on the reaction of the donor star to massloss, and on the changes in Roche geometry that ensue from the masstransfer. Important factors determining the evolution of the Roche geometry are: 1) whether all the mass transferred by the donor is accreted (conservative masstransfer); and 2) if some of the mass transferred is lost from the system, what are the consequential angular momentum losses (Webbink 1985; King & Kolb 1995). If upon contact and onset of masstransfer, the radius of the donor and the radius of the Roche lobe vary in such a way that contact becomes deeper, the masstransfer will proceed at an ever increasing rate and the situation is unstable. If, however, the contact becomes shallower, the donor will detach and masstransfer will cease. In the latter case, if some physical effect other than masstransfer insists on driving the system back into contact, stable masstransfer may be possible for as long as the driving is present. In CVs and LMXBs, systemic angular momentum losses due to gravitational radiation and magnetic braking, and in some cases the nuclear evolution of the donor, drive stable secular masstransfer from a less massive donor onto a compact object.
It has become customary to characterize the behavior of the donor and the Roche lobe upon masstransfer by introducing the following parameters (Webbink 1985, Ritter 1996): z_{R}, z_{e}, and z_{s}, which are respectively defined as the logarithmic derivatives of the donor's Roche radius, its thermal equilibrium radius and its adiabatic radius, with respect to the mass of the donor. So, for example,

It can be shown (Webbink 1985, Ritter 1996) that masstransfer is dynamically (or adiabatically) stable if z_{s} > z_{R}, whereas thermal stability requires z_{e} > z_{R}. Stars with deep convective envelopes, such as lower MS stars and giants have z_{s} < z_{e}, indicating that the dynamical instability sets in first, whereas stars with radiative envelopes, such as upper main sequence stars, have z_{e} < z_{s}. In the latter case, if z_{R} < z_{e} < z_{s}, it is in principle possible for the star to shrink rapidly first, avoiding a dynamical instability but then expand slowly by thermal relaxation yielding a stable masstransfer at a rate controlled by thermal relaxation.
The above stability considerations are 1dimensional in the sense that the state of the donor is represented by its instantaneous radius and the Roche geometry is reduced to an effective lobe radius. We plan to check the stability boundary with our numerical simulations, starting from equilibrium stellar models with both uniform and nonuniform entropy, in order to evaluate the effect of using selfconsistent potential, density and angular momentum distributions. 
It is worth mentioning that in stable masstransfer situations, the depth of contact is relatively shallow, on the order of a few pressure scale heights, and does not change much with time. In those situations it is correct to adopt an isothermal atmosphere approximation in order to model the masstransfer and the structure of the stream. Recent simulations by Blondin (1998) confirm that such a model can reproduce the behavior of the stream predicted by Lubow & Shu (1975; 1976). In our case, however, we are interested in dynamically unstable situations which lead to deeper degrees of contact and therefore a polytropic approximation for the envelope is better. It is wellknown that in the absence of nonlinear effects which we expect to limit the growth (see below), a polytropic atmosphere of uniform entropy would yield a divergent masstransfer rate in a finite time (e.g. Webbink 1985).
Before outlining what evolutionary scenarios are likely to develop as we follow the nonlinear evolution of systems that are found to be unstable toward either a tidal or masstransfer instability, it is important to point out that ...
throughout the execution of this project we also will make a concerted effort to demonstrate the degree to which our nonlinear simulation techniques are able to represent accurately the equilibrium properties of close binary systems that are dynamically stable. 
The construction of such models, in itself, is a nontrivial task. In an effort to illustrate our early successes with this aspect of our investigation, we offer two movies which may be accessed via the web by clicking on the "Movie1" and "Movie2" icons shown here (if you're reading this proposal online) or by accessing the following URLs: [ www.phys.lsu.edu/ astro/ cazes/ movies/ stable.db.mov ] and [ www.phys.lsu.edu/ faculty/ tohline/ NRAC_2000/ stable_binary_top.mov ].
Movie1 [taken from New & Tohline 1997] shows that, for a sufficiently soft equation of state, an equalmass (q º M_{donor}/M_{accretor} = 1) binary system can be stable against the tidal instability even when the two stars are so close to one another that they are sharing a common envelope. Movie1 follows the evolution of one such system through 3.5 full orbits. Although the system is being viewed with its orbital plane faceon, orbital motion is not apparent because the evolution is being shown in a rotating reference frame. Movie2 demonstrates how well we are able to follow the orbital motion of an unequalmass (q = 0.81) system in which the less massive component very nearly fills its Roche lobe. In this movie the system is also being viewed faceon, but from the inertial frame of reference.
In those cases where the masstransfer is initially unstable, we plan to follow the evolution in order to investigate how initial conditions such as the mass ratio, the entropy gradients, and the polytropic index influence the nonlinear growth of the masstransfer and the final outcome. 
We expect that, depending on the assumed conditions, some binaries may become stable after a certain mass excess is transferred, some may exchange mass, and still others may evolve through a common envelope phase. This schematic division is adopted for simplicity since there may well exist cases where a combination of the above processes operate sequentially.
We anticipate that if the mass ratio is not extreme and the polytropic indexes are similar, it should be possible for the binary to transfer some mass and regain stability. In systems that have a mass ratio q < 1, for example, the binary separation should increase with time as material flows from the donor to the accretor. But, based on the 1dimensional linear stability theory, if q is not too far from unity the Roche lobe should nevertheless shrink around the donor and thereby promote a continuous mild transfer event. Once q drops to a sufficiently small value, however, shrinkage of the Roche lobe will be discontinued and the mass transfer event terminated. If the donor has a polytropic index n = 3/2, linear theory predicts that mass transfer should stop when q £ 2/3.
Movie3 shows results from our first extended run of an n = 3/2 polytropic system that was expected to be unstable toward mild mass transfer. The illustrated system has an initial mass ratio q = 0.88 and, at the beginning of the simulation, the donor (shown on the left) is just marginally filling its Roche lobe. Clicking on the Movie3 icon ( http://www.phys.lsu.edu/ faculty/ tohline/ NRAC_2000/ q0.88_equ_slice.mov ) will bring up a Quicktime movie that follows the evolution of this system through 7.7 orbits. The movie displays equatorialplane mass densities on a logarithmic color scale, as viewed from a frame of reference rotating with the binary system's initial orbital frequency. Additional animation sequences from this evolution can be accessed by clicking on the "other related movies" caption below the Movie3 icon.
When the donor is more massive than the accretor (q > 1) and has a deep convective envelope, the masstransfer almost certainly will be dynamically unstable. This is because, as mass is transferred from the more massive star to the less massive one, the orbital separation should actually decrease with time while the radius of the donor is increasing. It is generally thought that once the rate of transfer is high enough, the accretor is swamped and driven out of thermal equilibrium, and the matter transferred forms a common envelope. Most calculations of the subsequent evolution have dealt with a relatively compact star (the accretor) orbiting or plunging into the convective (polytropic) atmosphere of a giant star (Rasio & Livio 1996; Terman, Taam & Hernquist 1994; 1995 and references therein). Some calculations specifically focus on the ejection of the envelope (Taam et al. 1997; Sandquist et al. 1998).
MOVIE3 q = 0.88 
UNSTABLE SYSTEMS 
MOVIE4 q = 1.2 




Quicktime (15.6M) [Other Related Movies] 
Quicktime (8.9M) [Other Related Movies] 
The reasons why it is important to investigate the onset and early stages of a common envelope evolution (CEE) are discussed by Rasio & Livio (1996): 1) the dynamical phase is probably the only phase that can be calculated accurately within the capabilities of current computers; 2) the results of this early phase can be used as an initial state for followup calculations using ad hoc approximations; and 3) in some cases the fate of the system is resolved during the dynamical phase.
We plan to study the early dynamical phases of selected common envelope evolutions and investigate the conditions for the formation of a common envelope and, if resources permit, calculate the efficiency with which orbital energy is used to eject the envelope. 
Movie4, above, shows results from our first extended run of an n = 3/2 polytropic system in which the more massive star is the donor. Specifically, the illustrated system has an initial mass ratio q = 1.32 and, at the beginning of the simulation, the donor (shown on the left) is just marginally filling its Roche lobe. It is interesting to note that, in this case, the center of mass of the system (and, hence, the orbital axis for the system) is positioned to the left of and L1 Lagrange point and is actually inside the donor envelope. Clicking on the Movie4 icon ( http://www.phys.lsu.edu/ faculty/ tohline/ NRAC_2000/ q1.32_eq_slice.mov ) will bring up a Quicktime movie that follows the evolution of this system through approximately 5.8 orbits. The movie displays equatorialplane mass densities on a logarithmic color scale, as viewed from a frame of reference rotating with the binary system's initial orbital frequency. Additional animation sequences from this evolution can be accessed by clicking on the "other related movies" caption below the Movie4 icon.
While it would be interesting to compare the results obtained for common envelope evolutions using Smooth Particle Hydrodynamics (SPH; the technique used by Rasio & Livio) and FiniteDifference Hydrodynamics (FDH; our selected technique) our intent is to explore some basic questions in a parameter regime quite different from what most CEE calculations assume. Therefore we are  at least initially  not aiming to reproduce the calculations of Rasio and Livio using a FDH code. In future developments (beyond the immediate scope of this proposed project) we would expect to incorporate more physics in such calculations (e.g., turbulent viscosity, radiation transport and convection). While this is relatively straightforward in FDH, it is not so in SPH.
Ultimately, the incorporation of additional physics  in particular, radiative transport or an ad hoc method of cooling the envelope material  is expected to be crucial to accurately model the CEE. Nevertheless, our proposed adiabatic simulation of the CEE represents, to our knowledge, the first attempt to simulate the dynamical evoluton of two wellresolved and selfconsistent binary components into a common envelope phase. As such, it will serve as an important basis for future research in this area.
MOVIE5 Binary Merger 



mpeg (701K) 
Merging compact binaries are also expected to be relatively strong sources of gravitational radiation. The gravitational radiation emitted during the inspiral phase of double neutron star binary evolution (i.e., before tidal effects become sizeable) is likely to be detected by terrestrial interferometric detectors such as LIGO and VIRGO, which will be sensitive to frequencies in the range of 1010^{3} Hz (Abramovici et al. 1992; Cutler et al. 1993; Thorne 1995). Proposed "dualrecycled" interferometers and spherical "TIGA" type resonant detectors will be more sensitive than LIGO to the higher frequency radiation, ~ 10^{3} Hz, emitted during the brief final merger stage of the coalescence (Meers 1988; Strain & Meers 1991; Cutler et al. 1993; Merkowitz & Johnson 1994; Thorne 1995). The gravitational wave radiation emitted during the merger phase in double white dwarf binary evolution is unlikely to be detected in the near future because the expected frequency of the radiation falls in or just beyond the upper end of the frequency range (10^{4}10^{1} Hz) of proposed spacebased laser interferometric detectors (Faller et al. 1989; Hough et al. 1995) and the duration of the phase will be too short to produce a significant signal in this range of the detectors' sensitivity.
The initial separation at which a particular binary model becomes unstable to merger, if one exists, can be estimated via a stability analysis of a set of binary models constructed in hydrostatic equilibrium, along a constant mass sequence of decreasing orbital separation. This sequence serves as an approximate representation of the evolution of the binary as its components are brought closer together by the effects of gravitational radiation. Analyses along these lines have recently been done by Lai, Rasio, & Shapiro (hereafter LRS), by Rasio & Shapiro (hereafter RS); and by Uyru & Eriguchi for binaries with polytropic EOSs. (For earlier related work see Gingold & Monaghan 1979; Hachisu & Eriguchi 1984a, b; Hachisu 1986.)
New & Tohline (1997) recently have performed stability analyses of equilibrium sequences of double white dwarf binaries constructed with the zerotemperature white dwarf equation of state, double neutron star binaries constructed with realistic neutron star equations of state and, for the sake of comparison with the work of LRS and RS, polytropic binaries with n = 0.5, 1.0, and 1.5. For simplicity, all binary models along these sequences were constructed as synchronously rotating systems having equalmass (q = 1.0) components. Movie1 (shown earlier) shows the behavior of one system that was found to be dynamically stable against merger. Systems that were found to be dynamically unstable via the tidal instability, such as the one illustrated here in "Movie5", were followed through a nonlinear phase of evolution to merger. The properties of these merged component systems have been detailed by New (1996). In many respects the stability analyses and dynamical simulations being proposed here can be viewed as an extension of the New (1996) and New & Tohline (1997) work. This published work on q = 1 systems, as well as more recent test runs on q ¹ 1 systems, demonstrates that we have the tools in hand to carry out the proposed simulations, although the entire project will require a number of years to complete.
2. Computational Methodology/Algorithms
A primary focus of this research project is to carefully map out boundaries of stability for close binary systems. In order to accomplish this goal in a quantifiably meaningful fashion, it is important that we have the capability to construct accurate equilibrium structures as initial input into each dynamical simulation so that the simulations are not dominated by dynamical transients that depend heavily on initial conditions. We have used Hachisu's SelfConsistent Field technique (hereafter HSCF) to construct threedimensional equilibrium binary models. This iterative method was initially used by Hachisu to generate models of synchronously rotating white dwarf binaries with unequal masses (Hachisu 1986) to compare the total energies of white dwarf binaries and systems containing a white dwarf surrounded by a massive disk of white dwarf material.
To date, we have used the HSCF technique to generate synchronous, polytropic binaries with mass ratios 0.1 £ q £ 4 in which both stars have uniform (but different) entropy distributions (homentropic). To uniquely specify a model we must specify the maximum density of each component and set three boundary points along the line of centers where the density distribution vanishes. These points correspond to the inner boundary points of each star and the outer boundary point for one of the stars. The cylindrical computational lattice contains 130´ 49 ´ 256 grid zones in the R, Z, and q directions, respectively. Due to the symmetry of the tidal and centrifugal potentials about the equatorial plane, these initial models need only be calculated in the hemisphere above the equatorial plane. (In our hydrodynamical simulations, however, this symmetry restriction is removed in order to permit motions to develop that cross the equatorial plane within the stars or in the masstransfer stream.)
In addition to polytropic binaries, we are able to construct selfconsistent, Newtonian models of compact binaries by choosing an appropriate equation of state such as the zero temperature WD equation of state (Hachisu 1986) or any of several realistic NS equations of state as described by New & Tohline (1997). Binary models using homentropic equations of state are, however, inadequate for examining the stability of mass transfer in main sequence stars without deep convective envelopes (stars with masses greater than but on the order of one solar mass). For an homentropic polytrope of index n, z_{s} = (1  n)/(3  n) so that z_{s} = 1/3 when n = 3/2 for example. On the other hand, the proper mainsequence massradius relationship demands z_{s} » 1. By allowing the specific entropy to be a function of the density, we are able to construct equilibrium configurations that are both convectively stable and that reproduce the desired massradius relationship.
Our primary dynamical simulation tool is an mpi (message passing interface) based algorithm designed to perform multidimensional, gravitational CFD (computational fluid dynamic) simulations efficiently on parallel computing platforms. The evolution of a system is governed through a finitedifference representation of the multidimensional equations that describe the dynamics of inviscid, selfgravitating, compressible gas flows. More specifically, the employed finitedifference algorithm is based on the secondorder accurate, van Leer monotonic interpolation scheme described by Stone and Norman (1992), but extended to three dimensions on a uniform, cylindrical coordinate mesh. Most simulations are performed on a computational lattice containing 130´98´256 grid zones in the R, Z, and q directions, respectively, with no assumed geometric symmetries. The fluid equations are advanced in time via an explicit integration scheme (often referred to as "direct numerical simulation") so an upper limit to the size of each integration time step is set by the familiar Courant condition. At each time step, the Poisson equation is solved in order to determine, in a selfconsistent fashion, how fluid elements are to be accelerated in response to the fluid's own selfgravity. Complete descriptions of our algorithm to solve the coupled CFD equations can be found in New (1996) and in Cazes (1999). The Poisson equation is solved implicitly using a combined Fourier transformation and iterative ADI (alternating direction, implicit) scheme (Cohl, Sun & Tohline 1997) in conjunction with boundary values determined via a compact cylindrical Green's function technique (Cohl & Tohline 1999).
When performing multidimensional gravitational CFD simulations, it is always important to develop an efficient technique by which the numerical results can be readily analyzed for physical content. Because threedimensional CFD simulations in particular generate enormous amounts of data, and the computational efficiency of modern highperformance computers is quickly degraded by any significant I/O demands, generally it is preferable to analyze the data "on the fly" rather than trying to save the data to disk for future, postprocessing. As described more fully below, we have constructed a heterogeneous computing environment through which we are able to routinely create volumerendered images and 3D animation sequences of our evolving fluid configurations. (The 3D "Movie" sequences referred to in §1 of this proposal have been produced in this fashion.) Relatively small amounts of data are regularly passed across the network from the parallel machine that is performing the gravitational CFD simulation to a machine that is more suited to visualization tasks; images that will become part of an extended animation sequence are created from this data; then the 3D data sets are immediately destroyed in order to minimize the impact on disk storage. At the SDSC, we have utilized the scriptable attributes of AliasWavefront (TM) software on middle and highend SGI systems in order to perform these critical visualization tasks. This visualization tool and the heterogeneous computing environment in which it functions (see § 3b for a more detailed description) will continue to provide a critical component of our ongoing gravitational CFD simulations at the SDSC.
This is the computationally intensive segment of a multiyear research project that is designed to address many of the scientific objectives that have been outlined here in §1, and thereby contribute to a more complete understanding of the tidal and masstransfer instabilities in close binary systems. In this context, we propose to carry out the following welldefined simulation tasks.
We have laid the groundwork for these investigations by (a) constructing numerous equilibrium binary sequences with q close to but not equal to unity; (b) following through more than four orbits the dynamical evolution of one system that was expected to be stable because neither star was completely filling its Roche lobe (Movie2); and (c) following through 7.7 orbits the evolution of one n = 3/2, q = 0.88 model that we expected to be unstable toward a mild masstransfer event (Movie3). A careful analysis of the "Movie3" evolution (see further discussion, below) shows that we are able to model these complex, dynamically evolving systems with a high degree of quantitative accuracy.
We have laid the groundwork for these investigations by following through more than 5 orbits the evolution of one n = 3/2, q = 1.32 model that we expected to be rather violently unstable toward mass transfer (Movie4). This simulation is discussed in more detail in §3c, below.
3. Preliminary Results and Progress to Date
Taking advantage of our acquisition of an 8Knode MasPar at LSU, in December 1991 we rewrote our gravitational CFD code in mpf (MasPar Fortran), which is a language patterned after Fortran 90 but includes certain extensions along the lines of highperformance Fortran (HPF). Utilizing the MP1, we also gained valuable experience in the development of parallel algorithms to efficiently solve the Poisson equation in curvilinear coordinate systems (Cohl, Sun & Tohline 1997). It was at LSU as well that we initially developed a heterogeneous computing environment through which our gravitational CFD data can be visualized "onthefly" (Cazes, Tohline, Cohl, & Motl 1999; http://www.phys.lsu.edu/ astro/ cazes/ dodconf/ hce.html). Over the past six or seven years, our parallel CFD algorithm has been ported to the TMC CM5 at NCSA; to the IBM SP2 at the Cornell Theory Center and here at LSU; and to the Cray T3E at the SDSC and the NAVOCEANO MSRC in Stennis, MS. Attachment C2 to this proposal details our "early experiences with the Cray T3E at the SDSC" and, in particular, explains why we selected the T3E as our preferred simulation platform several years ago.
i) Porting the Gravitational CFD Code to the T3E
Over the past three and a half years, we have been awarded a total of approximately 180,000 SUs at the SDSC to study dynamically evolving astrophysical systems in connection with our NSF and NASAsponsored research into the birth and destruction of binary star systems. Our present allocation of 65,000 SUs on the SDSC T3E runs through October 31, 2000. Our current request for PACI resources will build directly upon this successful, ongoing NPACI project.
During the first couple of months of this project in 1997, we focused our efforts on porting our gravitational CFD code to the T3E and documenting its performance. Using the available Cray HPF compiler (PGHPF, developed by the Portland Group), we found that relatively few unexpected code modifications were required in order to successfully compile and execute our code on the SDSC system. (This was because we already had had experience successfully modifying the code to execute on the CM5 using cmf and on the Cray C90 using the Cray Fortran90 compiler.) In June, 1997, we submitted a technical report entitled, "Early Experiences with the Cray T3E at the SDSC," to Jay Boisseau at the SDSC in which we fully documented how well our code was performing on the T3E. This technical document is included as Attachment C2 to this proposal. In summary, we found that:
On the down side, we also discovered that the PGHPF compiler was producing executable code that performed computational instructions on each T3E processor with extraordinarily low efficiency. That is, as a parallel application, the code was scaling well, but its overall efficiency measured relative to the published peakperformance capabilities of the machine was poor.
A year and a half ago, we successfully rewrote our entire gravitational CFD algorithm, incorporating explicit message passing instructions via mpi. Exhaustive tests have convinced us that this new version of our code is producing physical results identical to the ones generated with our welltested HPF algorithm. Indeed, the simulations illustrated by movies numbered 3 and 4, above, have been performed with this new mpibased algorithm. As illustrated by the numbers shown in Attachment C3:
The following table (adapted from Table 5 in Attachment C3) illustrates the present level of performance and degree of scalability of our mpibased, gravitationalCFD code on the SDSC T3E_600. The first column lists the number of processors used in each test; and the numbers at the top of every other column specifies each test's grid size.
Gravitational CFD
Code Timings
on the SDSC T3E_600
Using mpi (seconds per integration timestep) 

66^{2}´128  130´66´128  130^{2}´128  130^{2}´256  258´130´256  
4  7.016         
8  4.122         
16  2.237  4.008       
32  1.311  2.269  4.394     
64    1.280  2.381  4.850  8.638 
128      1.398  2.832  4.848 
(iii) A Significantly Improved Poisson Solver
It has been clear to us for quite some time that one of the most computationally demanding components of our current gravitational CFD code is the algorithm used to calculate boundary values for the gravitational potential that must be fed into the Poisson solver in order to be able to solve for the gravitational potential throughout the volume of the fluid. Historically, we have used an algorithm based on a multipole expansion in terms of spherical harmonics (Tohline 1980), but spherical harmonics are not a natural basis set when simulations are being conducted on a cylindrical coordinate mesh. Recently we have discovered that the traditional Green's function expansion in cylindrical coordinates which involves an infinite integral over products of Bessel Functions can be written in a much more compact form (Cohl & Tohline 1999) that is extremely wellsuited for implementation on cylindrical coordinate meshes. We have implemented this "compact cylindrical Green's function" technique in parallel on the T3E. It will be used in all of our future gravitational CFD simulations because it is both faster and more accurate than the one based on a multipole expansion in terms of spherical harmonics. Most importantly, though, its implementation has permitted us to bring the top (and bottom) cylindrical grid boundary down closer to the rotationally flattened mass distribution and reduce the computational lattice size in the vertical direction.
(iv) Establishing a Heterogeneous Computing Environment
Working with several staff members in the SDSC Vislab during the summer of 1997, we successfully established a heterogeneous computing environment (HCE) at the SDSC through which the results of our gravitational CFD simulations can be routinely visualized (Cazes, Tohline, Cohl, & Motl 1999). Extending the ideas that we first developed at LSU, we have established a procedure by which, at predetermined time intervals during a simulation, the CFD code running on the T3E writes out a 3D density file and sends a signal to the Vislab's SGI Power Onyx/Infinite Reality Engine indicating that a new data file is available for processing. Having received the signal from the T3E, the graphics engine executes a previously written shell script which (i) searches through the data to locate four separate isodensity surfaces and generates a wireframe structure (i.e., vertices and polygons) that corresponds to each of these 3D surfaces; (ii) executes an AliasWavefront script which renders the set of wireframe structures as a multiply nested, reflective and colored transparent set of isosurfaces; (iii) stores on disk this rendered image of our evolving model at one particular time step in the simulation; then (iv) deletes the original 3D data set and wire frame models to conserve disk space. This entire task is continually reinitiated as data is generated during an extended CFD simulation on the T3E. The net result of this process is a stored sequence of images illustrating the 3D timeevolution of our physical system. The series of images is transferred via the internet to LSU where it can be displayed as a movie, allowing us to rapidly interpret the dynamics of the systems that we are examining. All of the 3D movie sequences referenced in §1 of this proposal have been generated in this manner. We consider the successful establishment of this environment which smoothly couples the highend, SGI visualization tools at the SDSC with our simulation efforts on the T3E to have been a significant accomplishment.
Our first two years of research utilizing NPACI facilities were focused on the problem of binary star formation. As detailed in the accompanying attachment entitled, "Publications Resulting from NPACI Research," the results of this work have been summarized in a number of different publications, including one feature article in the AprilJune, 1999 issue of NPACI's EnVision Magazine.
Over the past year, we have turned our attention to problems associated with masstransfer in binary star systems during the late stages of their evolution, as described in this proposal. In an effort to guarantee that we are following the evolution of these sytems in the most accurate manner possible at present grid resolutions, we invested a great deal of time "tuning" (a) our initial equilibrium models, (b) our finitedifference representation of forces that accelerate each fluid element in a rotating frame of reference, and (c) the manner in which the low density "vacuum" surrounding both stars is modeled, in order to perfect the manner in which stable and marginally unstable systems are represented. We then carefully followed the extended evolution of two "standard" unstable systems: one with q < 1 (Movie3), and one with q > 1 (Movie4). These simulations have clearly demonstrated to us to what degree we will be able to describe, with quantitative accuracy, dynamical masstransferring events in such systems.
For example, after 7.7 orbits (approximately 185,000 integration time steps) of the "Movie3" evolution, the orbital separation remained constant to better than half a percent and the total angular momentum was conserved to better than 0.1 percent. By the end of this simulation, 2.0´10^{4} of the donor mass had been transferred to the accretor; assuming both components are MS stars of roughly one solar mass, this translates into a masstransfer rate of approximately 0.1 solar masses per year. (This rate is very high compared to normal cataclysmic variables, but this is understandable because CVs are driven into contact by secular, rather than dynamical processes.) We are extremely pleased that we are able to properly simulate masstransfer rates at this level. It means that our initial models are excellent representations of marginally unstable systems, and it indicates that we should be able to answer many of the physically relevant questions about these types of masstransfer systems that have been posed in the introduction of this proposal.
However, we are now faced with the situation that these masstransfer rates are too low to be useful in our studies. Because of the computing demands (1500  3000 SU per orbit; see Table 2, below), and the fact that the centerofmass of each system eventually begins to wander away from the center of the coordinate mesh, it will be impractical for us to follow any single evolution for more than about 10 orbits. We will therefore need to generate higher masstransfer rates. Our plan is to slowly (and artificially) drain angular momentum from each system's orbit until the masstransfer rate climbs to a level that is roughly an order of magnitude higher than the rate just quoted, at which point the natural tendancy for the masstransfer rate to increase should take over in each unstable system, and we should be able to turn off the artificial driving. We are confident that this approach will permit us to derive physically interesting results from each of the simulation runs that is itemized in Table 2, below.
4. Justification of Resources Requested
i) Required Platforms
For the bulk of our production simulations, we are requesting an allocation of 60,000 SUs on the Cray T3E at the SDSC as well as continued access to the SGI workstations in the SDSC Vislab. From our past work at SDSC and elsewhere we feel these platforms are most suitable for our simulation work for the following reasons. First, through our past experience we have found that our gravitational CFD algorithm scales very well on the T3E architecture (for details, see Attachments C2 and C3). Second, it is on the T3E platform that our mpi implementation has been developed and thoroughly tested, showing a performance improvement of roughly a factor of 4 over our HPF code. In addition, our familiarity with the T3E architecture suggests that further improvement in our code's performance can be realized by closely scrutinizing and optimizing the use of the cache system in critical subroutines within the overall program. Finally, we have developed a heterogeneous computing environment that smoothly integrates the capabilities of the T3E with those of the SGI platforms in the SDSC Vislab.
As Table 2 shows, we are also requesting 24,000 SUs on SDSC's IBM SP (Blue Horizon). As is explained more fully, below, this will permit us to test the relative accuracy of our simulations when the grid resolution is doubled in all three spatial dimensions. It will also permit us to thoroughly test our simulation tools on the hardware platform that is most likely to be wellsupported in the near future at the SDSC.
ii) Code Requirements and Optimization Issues
Our proposed simulations at a resolution of 130´98´256 will require at least 32 nodes on the SDSC T3E  that is, the problem requires at least 4 GBytes of RAM. In practice, though, it has proven to be more efficient for us to handle turnaround times and data management issues when we utilize 64 processor nodes at a time, so that is what we plan to do routinely over the coming year. Since we have discussed code optimization issues fairly thoroughly in earlier sections of this proposal, we will not comment further on them here.
In order to verify that the present accuracy of our simulations is primarily limited by our grid resolution and, at the same time, to illustrate to what degree we should expect this accuracy to be improved in the future as the capacity of available computing hardware increases, we would like to follow one of our "standard" models through one full orbit on a computational mesh that has twice the resolution in all three spatial dimensions. (Doubling the grid resolution in only one or two dimensions will unevenly distort these accuracy measurements.) This higher resolution simulation will require approximately 8 times the amount of memory and, if run on the Cray T3E, roughly 16 times as many SU per orbit (a factor of 8 due to the increased number of grid cells and an extra factor of 2 because of the Courant limitation on our integration time steps). This single simulation would therefore require approximately 32 GBytes of RAM (256 nodes) and 48,000 SU/orbit on the Cray T3E  a task that cannot be undertaken on the Cray T3E, as it is presently configured. But it appears to be feasible on the SDSC's IBM SP (Blue Horizon) because each node has 4 GBytes of RAM, and our timings on a 32processor (200 MHz) SP here at LSU indicate that, relative to the T3E, our mpibased code should gain a factor of two in execution speed per processor on Blue Horizon. (Note that we encountered no significant problems or new bugs when we migrated our mpibased code from the T3E to LSU's SP system.) Hence, for this final task we are requesting 24,000 SU on SDSC's IBM SP.
Proposed Simulations on SDSC's Cray T3E  N_{models}  N_{orbits}  SU  

A.  Tasks I and II [Grid resolution 130´98´256 ==> 1500 SU/orbit]  
  Examine stability of models with different values of n and q;  3  5  22,500  
  Extend one earlier evolution (from 5) up to 10 total orbits.  1  5  7,500  
B.  Tasks III and IV [Grid resolution 130´98´256 ==> 3000 SU/orbit]  
  Examine stability of a model with q › 1;  1  5  15,000  
  Examine stability of a nonhomentropic model.  1  5  15,000  
Total requirement.  60,000 
Proposed Simulations on SDSC's IBM SP  N_{models}  N_{orbits}  SU  

C.  [Grid resolution 260´196´512]  
  Repeat evolution of one model with q › 1.  1  1  24,000  
Total requirement.  24,000 
iii) Proposed Production Runs
In §2d of this proposal we described the simulation tasks that we would like to complete over the coming year utilizing PACI resources. Following up on that discussion, our specific request for T3E resources at the SDSC is detailed in Table 2. The Task numbers that have been identified in connection with each proposed simulation in this table follow the outline used in our earlier discussion of the respective problems in §2d. The column labeled N_{models} specifies the number of different initial model configurations that will be examined and the column labeled N_{orbits} lists the number of orbits through which we estimate we will need to follow each binary system in order to complete the specified task. Along with each Task we have specified the (R´Z´q) grid resolution that will be used and the approximate number of SUs that will be required to simulate one complete orbit at that grid resolution. (Because our integration time step varies with time according to the limit imposed instantaneously by the Courant condition, the number quoted for SU/orbit is only an estimate based on the average cost that we have realized in carrying the "Movie3" and "Movie4" simulations through multiple orbits. Roughly speaking, the quoted number for Tasks I and II can be understood as follows: The singleprocessor execution rating of 36 MFlops, as reported in Table 4 of Attachment C3 leads to a singleprocessor execution speed of approximately 70 microseconds per zone per time step as reported in Table 5 of Attachment C3. Also, approximately 24,000 Courantlimited time steps are required to complete one binary orbit during a masstransferring event when q is less than unity. Hence, on a grid containing 3 Megazones, each integration time step will require approximately 0.061 SUs, and one binary orbit will require approximately 1,500 SUs.) Finally, the number of SUs that we estimate will be required to complete each proposed simulation task (last column of Table 2) is obtained by multiplying the required number of SUs/orbit by N_{orbits} and by N_{models}.
iv) Performance Monitor Results
Utilizing "pat"  a performance analysis tool that accesses the hardware performance monitors on the T3E  we have been able to document in detail the performance of our gravitational CFD code on the T3E. Numbers drawn directly from "pat" are provided in Attachment C1 to this proposal. Numbers from "pat" also have been used throughout our "code efficiency" discussion in Attachments C2 and C3.
v) Special Requirements
As explained in detail above, we rely heavily on the SDSC Vislab facilities to routinely analyze our simulation results. In order for this project to be a success, therefore, it is essential that we continue to have access to those facilities. It is extremely important that the AliasWavefront software continue to be made available on the SGI Onyx as well.
5. Local Computing Environment
In 1991, through an equipment grant from the Louisiana State Board of Regents, the LSU Department of Physics and Astronomy acquired an 8,192node, SIMDarchitecture MasPar MP1 to support the development of parallel algorithms for a variety of computational physics and chemistry projects. LSU recently has also acquired a 32processor (16node; 200 MHz) IBM SP2 for use by the broader campus research community. As described in §4, above, Both of these platforms have proven to be useful for parallel algorithm development and for training students how to perform astrophysical simulations at moderate spatial resolutions. However, in order to meet our most challenging scientific goals, we have had to migrate the majority of our simulation tasks to much larger parallel platforms off campus. Our department also supports a large cluster of Sun Microsystems workstations. Sparcstations that are integrated into this cluster, plus an SGI Octane provide the PI and his students with graphical unix interfaces into LSU's campuswide ethernet system and the internet. Working with several institutions in SURA (the Southeastern University Research Association) and with NSF funding, roughly one year ago LSU became connected directly into the vBNS network. This has significantly improved our ability to perform simulation tasks at the SDSC of the type outlined in this proposal.
6. Other Supercomputer Support
The PI has begun a collaboration with Professors Thorne and Lindblom (Caltech), where the initial objectives are to model the nonlinear growth of an unstable "rmode" in isolated, rotating neutron stars, and to simulate the tidal disruption of a neutron star that is in orbit about a more massive, Kerr black hole. Through this collaboration, the PI has gained access to the new HP Exemplar at CACR. However, the present allocation of time at CACR is only 10,000 SU, and the intent of this allocation (to Professor Thorne) is to provide a mechanism by which students in Thorne's group can begin to develop the numerical tools necessary to study these general relativistic processes. The request being made here for NSF PACI resources is intended to support our own primary investigation into masstransferring, nonrelativistic binary systems; none of the simulations being proposed herein will be performed on CACR machines.
7. Project Team Qualifications
A major component of the PI's research work over the past twenty years has involved the largescale numerical simulation of multidimensional gas flows in astrophysical systems utilizing a variety of different high performance computing platforms. In the early 1980s, the PI held a postdoctoral research position in group T6 at Los Alamos National Laboratories at which time he broadened his exposure to stateoftheart CFD techniques and learned to develop efficient algorithms for vector supercomputers. From the mid1980s through the early 1990s, Tohline was PI on a large account at the Cornell Theory Center during which time he and his students experimented with the parallel implementation of CFD algorithms on the IBM 3090 and various accompanying FPS Array Processors. When the 8Knode MasPar MP1 system arrived at LSU (see §5, above), the PI abandoned the CTC facilities because his group was having considerable success utilizing the MasPar system for its largest simulation efforts. As has been detailed elsewhere in this proposal, in recent years the PI and his students have gained experience developing gravitational CFD algorithms for execution on a variety of parallel platforms including MasPar's MP2, the CM5, the SP2, and the Cray T3E. The group also has developed and implemented a heterogeneous computing environment (at LSU and at the SDSC) that tightly couples its parallel simulation efforts to sophisticated visualization tools.
CoPI J. Frank has broad experience in studies of the astrophysics of interacting binary systems, particularly when such interactions involve fluid flow and accretion events. He is, for example, lead author of a widely referenced technical CUP publication entitled, "Accretion Power in Astrophysics." His expertise within this project helps ensure that the results of the proposed complex multidimensional and nonlinear simulations are physically correct and quantitatively meaningful when placed in the context of the extensive literature on this subject. Frank also is lead investigator on the NASAfunded, Astrophysics Theory Program project in partial support of which this request for NPAC resources is being made. (Tohline is CoPI on this NASAfunded project.)
Attachment B entitled, "Curriculum Vitae and Selected Publications," documents the formal educational background and professional experience of both Tohline and Frank, and provides a list of significant publications that are particularly relevant to the project being proposed here.
Patrick Motl is a graduate student enrolled in the Ph.D. physics program at LSU. He is expected to be heavily involved in the simulation work outlined in this proposal. Indeed, the astrophysics problem on which this project proposal is based is the focus of Patrick's Ph.D. dissertation research. Patrick has perfected the 3D selfconsistentfield technique which permits us to construct unequalmass, equilibrium binary systems and then to test their stability with the dynamical simulation tools described herein. He also has been solely responsible for migrating the gravitational CFD code to an mpibased environment.