A REQUEST FOR NSF PACI RESOURCES
Joel E. Tohline & Juhan Frank
Department of Physics & Astronomy
Louisiana State University
This entire document, along with several illustrative animation sequences is available online at the address http://www.phys.lsu.edu/faculty/tohline/NRAC_2002/ 
1. Summary of Proposed Research, Including Objectives and Goals
Over half of all stars in the sky are multiple star systems and most of these are binary stars. We recently have completed a review of the mechanisms that have been proposed to explain the origin of binary star systems (Tohline 2002). Approximately one half of the binary stars are close enough to one another for mass to be exchanged between the components at some point in their evolution (Trimble 1983). The components of these interacting binaries influence one another in a variety of ways including tidal forces, irradiation, and in the most extreme manifestation of the interaction, through mass transfer and mass exchange. As a result, the evolution and observational properties of the binary components and the binary itself are dramatically different from the behavior expected in the absence of mass transfer, even when this transfer proceeds at a leisurely pace.
Interacting binaries of one type or another, especially those in which the accretor is a compact object (Frank, King & Raine 2002), play central roles in many of the astrophysical sources of high energy photons, and are involved in the generation of gravitational and highenergy particle radiation. Most of the interacting binary types known observationally are presently undergoing longterm, stable masstransfer on timescales dictated by secular angular momentum losses (e.g. gravitational radiation and magnetic braking) and/or nuclear and/or thermal evolution. Wellknown examples of the above are cataclysmic variables (CVs) [Warner 1995, Beuermann 2002], lowmass Xray binaries (LMXBs) [White et al. 1995; van Paradijs & McClintock 1995; Verbunt & van den Heuvel 1995], soft Xray transients (SXTs) or Xray novae [Chen, Shrader & Livio 1997], Algols (Batten 1989, Vesper Honeycutt & Hunt 2001; Zavala et al. 2002) and W Serpentis stars (Wilson 1989), contact binaries (W Ursae Majoris stars) [Rucinski 1989; Qian 2001], supersoft Xray sources [Greiner 2000, Rappaport, Di Stefano & Smith 1994], and the ultraluminous Xray sources in nearby galaxies [King et al. 2001]. In these systems, the fraction of the donor star's mass that is transferred during one orbit is typically » 10^{12} to 10^{9}, many orders of magnitude less than what current numerical 3D hydrocodes can resolve. All of the above systems, however, must have descended from binaries in which the accretor of today was initially the more massive component who evolved first off the mainsequence. The mass transfer in these progenitor systems was in many instances dynamically unstable, yielding masstransfer rates many orders of magnitude above the currently observed rates and leading in some cases to a common envelope phase (see e.g. Warner 1995; Verbunt & van den Heuvel 1995; Nelson & Eggleton 2001) or a thermal masstransfer phase (King & Schenker 2002).
In addition, there is a wide class of binary star systems not presently undergoing masstransfer for which the astrophysical scenarios that have been proposed to explain their origin or ultimate fate involve dynamical or thermal masstransfer in a close binary. Examples of such systems are millisecond pulsars (Bhattacharya 1995), some central stars of planetary nebulae (Iben & Livio 1993), double degenerate white dwarf binaries, perhaps leading to supernovae of type Ia through a merger (Iben & Tutukov 1984), or subdwarf sd0, sdB stars (Iben 1990), and double neutron star binaries, perhaps yielding gray bursts in a fireball when the neutron stars coalesce (Paczynski 1986, Mészáros & Rees 1992, Ruffert et al 1997, Mészáros 2001). The evolutionary scenarios that are drawn upon to explain the existence of some of these systems call for events in which 10% or more of the donor's mass can be transferred during a single orbit.
We shall focus on these relatively rapid phases of masstransfer, investigate the conditions required for stable/unstable masstransfer, and follow numerically their evolution in unstable cases. The dynamical masstransfer that ensues in unstable binaries rapidly reaches nonlinear amplitudes and may lead to the formation of a massive disklike object, maybe a common envelope phase (Iben & Livio 1993), some mass loss from the system, and in some cases, end in a merger. These phenomena are truly threedimensional in character and require largescale numerical simulations to further our theoretical understanding of them. While the direct observation of these phases – with the possible exception of gravitational wave emission from mergers – remains unlikely in the near future, these investigations will help us understand the origin of these binaries and their mutual relationships. We plan to study both the tidal and the masstransfer stability of close binaries in which the component stars obey a polytropic equation of state (EOS) but, in general, possess nonuniform entropy. As we argue below, these simple structural approximations enable us to treat, with some limitations, a very diverse set of objects including main sequence (MS) stars, WDs and NSs.
We distinguish two types of dynamical instability that can occur in close binaries: 1) A dynamical instability which arises in close proximity of extended binary components with sufficiently stiff EOS due to the presence of tidal forces, even before contact and onset of masstransfer; and 2) A dynamical instability which arises upon contact, even when total angular momentum and mass are conserved, as mass is transferred from one component to the other. We shall refer to these instabilities as the tidal and masstransfer instabilities, respectively.
The classical tidal instability, first studied by Darwin (1887), deals with the secular evolution away from synchronous rotation due to dissipation. More recent treatments of this secular instability include: an analysis of tidal stability of coplanar binaries (spin axes perpendicular to orbital plane) by Counselman (1973), who shows that in unstable systems evolution away from synchronism is accompanied by a secular decrease or increase in the orbital separation; and a more general treatment by Hut (1980) allowing for arbitrary orientations of spin and orbital angular momenta.
In a series of papers Lai, Rasio & Shapiro (1993a; 1993b; 1994a; 1994b) studied the equilibrium and stability of binaries with polytropic EOS, making use of analytic techniques and a variational energy method. They were able to extend their work to unequal masses and allow for internal motions. They showed that in addition to the classical secular instability, a new dynamical instability exists not far from the point of secular instability, which leads to rapid binary inspiral under certain conditions. They showed that the instability occurred before contact was reached for stiff EOS (see also New & Tohline 1997).
In binaries with unequal components, in general one of the stars will reach contact with its Roche lobe first (the donor) and masstransfer to its companion star (the accretor) will commence either because the donor star is evolving or because the binary separation is reduced by systemic angular momentum losses. What happens next, i.e. the stability of binaries to masstransfer, depends crucially on the reaction of the donor star to massloss, and on the changes in Roche geometry that ensue from the masstransfer. Important factors determining the evolution of the Roche geometry are: 1) whether all the mass transferred by the donor is accreted (conservative masstransfer); and 2) if some of the mass transferred is lost from the system, what are the consequential angular momentum losses (Webbink 1985; King & Kolb 1995). If upon contact and onset of masstransfer, the radius of the donor and the radius of the Roche lobe vary in such a way that contact becomes deeper, masstransfer will proceed at an ever increasing rate and the situation is unstable. If, however, the contact becomes shallower, the donor will detach and masstransfer will cease. In the latter case, if some physical effect other than masstransfer insists on driving the system back into contact, stable masstransfer may be possible for as long as the driving is present. As mentioned above, in CVs and LMXBs, systemic angular momentum losses drive stable secular masstransfer from a less massive donor onto a compact object.
It has become customary to characterize the behavior of the donor and the Roche lobe upon masstransfer by introducing the following parameters (Webbink 1985, Ritter 1996): z_{R}, z_{e}, and z_{s}, which are respectively defined as the logarithmic derivatives of the donor's Roche radius, its thermal equilibrium radius and its adiabatic radius, with respect to the mass of the donor. So, for example,

It can be shown (Webbink 1985, Ritter 1996) that masstransfer is dynamically (or adiabatically) stable if z_{s} > z_{R}, whereas thermal stability requires z_{e} > z_{R}. Stars with deep convective envelopes, such as lower MS stars and giants have z_{s} < z_{e}, indicating that the dynamical instability sets in first, whereas stars with radiative envelopes, such as upper main sequence stars, have z_{e} < z_{s}. In the latter case, if z_{R} < z_{e} < z_{s}, it is in principle possible for the star to shrink rapidly first, avoiding a dynamical instability but then expand slowly by thermal relaxation yielding a stable masstransfer at a rate controlled by thermal relaxation.
The above stability considerations are 1dimensional in the sense that the state of the donor is represented by its instantaneous radius and the Roche geometry is reduced to an effective lobe radius. We plan to check the stability boundary with our numerical simulations, starting from equilibrium stellar models with both uniform and nonuniform entropy, in order to evaluate the effect of using selfconsistent potential, density and angular momentum distributions. 
MOVIE1 Stable, EqualMass Binary 



Quicktime (1,380K) 
Before outlining what evolutionary scenarios are likely to develop as we follow the nonlinear evolution of systems that are found to be unstable toward either a tidal or masstransfer instability, it is important to point out that ...
throughout the execution of this project we also will make a concerted effort to demonstrate the degree to which our nonlinear simulation techniques are able to represent accurately the equilibrium properties of close binary systems that are dynamically stable. 
In an effort to illustrate our early successes with this aspect of our investigation, we offer two movies which may be accessed via the web (if you're reading this proposal online) by clicking on the "Movie1" icon shown here [URL: www.phys.lsu.edu/ astro/ cazes/ movies/ stable.db.mov], or the "Movie4" icon as discussed in §3.b. Movie1 [taken from New & Tohline 1997] shows that, for a sufficiently soft equation of state, an equalmass (q º M_{donor}/M_{accretor} = 1) binary system can be stable against the tidal instability even when the two stars are so close to one another that they are sharing a common envelope. Movie1 follows the evolution of one such system through 3.5 orbits. Although the system is being viewed with its orbital plane faceon, orbital motion is not apparent because the evolution is being shown in a rotating reference frame. Movie4 demonstrates how well we are able to follow through more than five orbits the evolution of an unequalmass (q = 0.84) system in which the less massive component (on the right in the Movie4 icon) very nearly fills its Roche lobe. The movie displays equatorialplane mass densities on a logarithmic color scale, as viewed from a frame of reference rotating with the binary system's initial orbital frequency. Additional animation sequences from this evolution can be accessed through URL: casa.colorado.edu/ ~motl/ research/ binary/ stable_binary_visc.html.
In those cases where the masstransfer is initially unstable, we plan to follow the evolution in order to investigate how initial conditions such as the mass ratio, the entropy gradients, and the polytropic index influence the nonlinear growth of the masstransfer and the final outcome. 
We expect that, depending on the assumed conditions, some binaries may become stable after a certain mass excess is transferred, some may exchange mass, and still others may evolve through a common envelope phase. This schematic division is adopted for simplicity since there may well exist cases where a combination of the above processes operate sequentially.
We anticipate that if the mass ratio is not extreme and the polytropic indexes are similar, it should be possible for the binary to transfer some mass and regain stability. In systems that have a mass ratio q < 1, for example, the binary separation should increase with time as material flows from the donor to the accretor. But, based on the 1dimensional linear stability theory, if q is not too far from unity the Roche lobe should nevertheless shrink around the donor and thereby promote a continuous mild transfer event. Once q drops to a sufficiently small value, however, shrinkage of the Roche lobe will be discontinued and the mass transfer event terminated. If the donor has a polytropic index n = 3/2, linear theory predicts that mass transfer should stop when q £ 2/3.
When the donor is more massive than the accretor (q > 1) and has a deep convective envelope, the masstransfer almost certainly will be dynamically unstable. This is because, as mass is transferred from the more massive star to the less massive one, the orbital separation should actually decrease with time while the radius of the donor is increasing. It is generally thought that once the rate of transfer is high enough, the accretor is swamped and driven out of thermal equilibrium, and the matter transferred forms a common envelope. Most calculations of the subsequent evolution have dealt with a relatively compact star (the accretor) orbiting or plunging into the convective (polytropic) atmosphere of a giant star (Rasio & Livio 1996; Terman, Taam & Hernquist 1994; 1995 and references therein). Some calculations specifically focus on the ejection of the envelope (Taam et al. 1997; Sandquist et al. 1998).
The reasons why it is important to investigate the onset and early stages of a common envelope evolution (CEE) are discussed by Rasio & Livio (1996): 1) the dynamical phase is probably the only phase that can be calculated accurately within the capabilities of current computers; 2) the results of this early phase can be used as an initial state for followup calculations using ad hoc approximations; and 3) in some cases the fate of the system is resolved during the dynamical phase.
We plan to study the early dynamical phases of selected common envelope evolutions and investigate the conditions for the formation of a common envelope and, if resources permit, calculate the efficiency with which orbital energy is used to eject the envelope. 
Movie2 shows results from our first partial run of an n = 3/2 polytropic system in which the more massive star is the donor. Specifically, the illustrated system has an initial mass ratio q = 1.32 and, at the beginning of the simulation, the donor (shown on the left in the Movie2 icon) is just marginally filling its Roche lobe, but eventually it develops a welldefined masstransfer stream. It is interesting to note that, in this case, the center of mass of the system (and, hence, the orbital axis for the system) is positioned to the left of and L1 Lagrange point and is actually inside the donor envelope. Clicking on the Movie2 icon [URL: casa.colorado.edu/ ~motl/ reseach/ binary/ movies/ driven/ streamq.mov] will bring up a Quicktime movie that follows the evolution of this system through approximately three orbits. The evolution is shown in a rotating frame of reference, however, so the primary orbital motion is not apparent. Additional movie sequences and accompanying diagnostic plots from this simulation can be obtained from the URL: casa.colorado.edu/ ~motl/ research/ binary/ driven.html.
MOVIE2 Mass Transferring Binary 



Quicktime
(4,805K) 
Ultimately, the incorporation of additional physics – in particular, radiative transport or an ad hoc method of cooling the envelope material – is expected to be crucial to accurately model the CEE. Nevertheless, our proposed adiabatic simulation of the CEE represents, to our knowledge, the first attempt to simulate the dynamical evoluton of two wellresolved and selfconsistent binary components into a common envelope phase. As such, it will serve as an important basis for future research in this area.
The coalescence of double white dwarf (WD) and double neutron star (NS) binaries is important to examine since this process may produce a number of astrophysically interesting objects and events. Double white dwarf binary mergers have been suggested as precursors to some Type Ia Supernovae (Iben & Tutukov 1984; Iben 1988, 1991; Iben & Webbink 1989; Branch et al. 1995) and to long gray bursts (Katz & Canel 1996). WDWD mergers may also lead to the formation of massive single white dwarfs or neutron stars (Colgate & Petschek 1982; Saio & Nomoto 1985; Iben & Tutukov 1986; Kawai, Saio, & Nomoto 1987; Chen & Leonard 1993); to the formation of subdwarf stars; or to the formation of hydrogen deficient, highly luminous stars (Iben 1990 and references therein; Webbink 1984). NSNS mergers may result in bursts of grays and neutrinos, in the production of rprocess elements, and in the formation of black holes (BH) (Eichler et al. 1989; Meyer 1989; Narayan, Paczynski, & Piran 1992; Rasio & Shapiro 1992; Davies, Benz, Piran, & Thielemann 1994; Katz & Canel 1996; Lipunov et al. 1995; Ruffert, Janka, & Schäfer 1996; New & Tohline 1997; Swesty, Wang & Calder 2000; Rosswog, Davies, Thielemann & Piran 2000). Merging compact binaries are also expected to be relatively strong sources of gravitational radiation (Abramovici et al. 1992; Cutler et al. 1993; Thorne 1995).
MOVIE3 Binary Merger 



mpeg
(701K) 
2. Computational Methodology/Algorithms
In order to accomplish the stated objectives of this project, we plan, first, to use a 3D selfconsistent field (SCF) technique to generate accurate equilibrium structures of synchronously rotating binaries in circular orbits. Each rotationally flattened and tidally distorted binary model will then serve as input into (initial conditions for) a dynamical simulation. Each dynamical simulation will be performed using a 3D, finitedifference hydrodynamic (FDH) technique that follows in a selfconsistent fashion the motion of every fluid element in both stars. Our present SCF and FDH codes, as well as forerunners of these codes, have been effectively used to study gravitationally driven instabilities in protostellar gas clouds (Durisen et al. 1986; Woodward, Tohline & Hachisu 1994; Cazes & Tohline 2000); rapidly rotating, compact stellar objects (New, Centrella & Tohline 2000; Lindblom, Tohline & Vallisneri 2001, 2002); and, as mentioned above, close binaries with equal mass (New & Tohline 1997). Over the past few years, we have made key modifications in both of these algorithms in order to ensure that they perform with a sufficiently high level of numerical accuracy to permit us to confidently model masstransfer instabilities in unequalmass binaries. A detailed description of both codes and their capabilities is given in Motl, Tohline, & Frank (2002). Here we briefly summarize the principal elements of both.
It is important that we have the capability to construct accurate equilibrium structures as initial input into each dynamical simulation so that the simulations are not dominated by dynamical transients that depend heavily on initial conditions. We use the iterative SCF technique described by Hachisu (1986a,b) in its most general form in order to construct unequalmass polytropic binaries with the internal structure of both stellar components fully resolved. Models are constructed on a uniformly zoned, cylindrical mesh that conforms exactly to the mesh that is used in the FDH code (see below) so that no interpolation is needed when the converged initial state is introduced into the dynamical code. To date, we have used this SCF technique to generate a wide variety of synchronous, polytropic binaries with mass ratios 0.1 < q < 1.5 in which both stars have uniform (but different) entropy distributions (homentropic). We can readily build detached, semidetached or, in the limit of identical components, contact systems. These models have been constructed on computational grids ranging in size form 128^{3} to 256^{3} as necessary to conform to the chosen resolution of the FDH code. As measured by the global virial error, these models typically converge to better than one part in 10^{5}. As a further test, we have compared parameters from our equilibrium binaries with those calculated from the point mass Roche model by Mochnacki (1984). As expected, our results differ from Mochnacki's because we use a selfconsistent gravitational potential as opposed to the point mass potential as is appropriate in the Roche model. Our results converge to Mochnacki's in cases where the mass ratio becomes small and/or the separation increases.
For unequalmass binaries, we have discovered that it is not possible to construct an initial model in which the centerofmass of the system lies precisely on the coordinate axis of the cylindrical grid. Since the FDH simulations are performed in a frame of reference that rotates with the orbital frequency of the binary and the coordinate axis of the grid serves as the axis of rotation of the grid, it is then necessary to impart a very small velocity "kick" to every fluid element in the initial model in order to compensate for this discrepancy and minimize motion of the centerofmass through the grid (for details, see Motl et al. 2002).
In addition to polytropic binaries, we are able to construct selfconsistent, Newtonian models of compact binaries by choosing an appropriate equation of state such as the zero temperature WD equation of state (Hachisu 1986) or any of several realistic NS equations of state as described by New & Tohline (1997). Binary models using homentropic equations of state are, however, inadequate for examining the stability of mass transfer in main sequence stars without deep convective envelopes (stars with masses greater than but on the order of one solar mass). For an homentropic polytrope of index n, z_{s} = (1  n)/(3  n) so that z_{s} = 1/3 when n = 3/2 for example. On the other hand, the proper mainsequence massradius relationship demands z_{s} » 1. By allowing the specific entropy to be a function of the density, we are able to construct equilibrium configurations that are both convectively stable and that reproduce the desired massradius relationship.
Our primary dynamical simulation tool is an mpi (message passing interface) based algorithm designed to perform multidimensional, gravitational CFD (computational fluid dynamic) simulations efficiently on parallel computing platforms. As has been described in detail by Motl et al. (2002), the evolution of a system is governed through a finitedifference representation of the multidimensional equations that describe the dynamics of inviscid, selfgravitating, compressible gas flows. More specifically, the employed finitedifference hydrodynamics (FDH) algorithm is based on the secondorder accurate, van Leer monotonic interpolation scheme described by Stone and Norman (1992), but extended to three dimensions on a uniform, cylindrical coordinate mesh. Most of our simulations, to date, have been performed on a computational lattice containing 130´98´256 grid zones in the R, Z, and q directions, respectively, with no assumed geometric symmetries. But over the past year, utilizing the capabilities and capacity of Blue Horizon (the SP3) at the SDSC, we have begun to utilize a computational lattice with twice this resolution in each dimension: 258´194´512. The fluid equations are advanced in time via an explicit integration scheme so an upper limit to the size of each integration time step is set by the familiar Courant condition. At each time step, the Poisson equation is solved implicitly using a combined Fourier transformation and iterative ADI scheme (Cohl, Sun & Tohline 1997) in order to determine, in a selfconsistent fashion, how fluid elements are to be accelerated in response to the fluid's own selfgravity. As is described more fully, below, boundary values for the Poisson equation are determined via a compact cylindrical Green's function technique (Cohl & Tohline 1999). By performing our simulations in a frame of reference that is rotating with an angular velocity W_{0} equal to the initial angular velocity of the binary orbit, we are able to minimize the effects of numerical viscosity (see New & Tohline 1997; Swesty, Wang and Calder 2000), which invariably plays a role in any FDH algorithm.
We have constructed a heterogeneous computing environment through which we are able to routinely create volumerendered images and 3D animation sequences of our evolving fluid configurations. Relatively small amounts of data are regularly passed across the network from the parallel machine that is performing the FDH simulation to a machine that is more suited to visualization tasks; images that will become part of an extended animation sequence are created from this data; then the 3D data sets are immediately destroyed in order to minimize the impact on disk storage. We are utilizing the scriptable attributes of Maya (formerly AliasWavefront) software on SGI systems in order to perform these critical visualization tasks. (The 3D "Movie" sequences referred to throughout this proposal have been produced in this fashion. A variety of other animation sequences that have been produced with this tool can be accessed online via the URL: www.phys.lsu.edu/ faculty/ tohline/ image.ensemble/ movies.html.) This visualization tool and the heterogeneous computing environment in which it functions will provide a critical component of the simulations being proposed here.
This is the computationally intensive segment of a multiyear research project that is designed to address many of the scientific objectives that have been outlined here in §1, and thereby contribute to a more complete understanding of the tidal and masstransfer instabilities in close binary systems. In this context, we propose to carry out the following welldefined simulation tasks.
We have laid the groundwork for these investigations by (a) constructing numerous equilibrium binary sequences with q close to but not equal to unity; and (b) following through more than five orbits the dynamical evolution of two detached systems and showing that these systems are perfectly stable against mass transfer (see Motl et al. 2002 and the discussion here associated with Movie4).
We have laid the groundwork for these investigations by following through several orbits the evolution of one n = 3/2, q = 1.32 model that we expected to be rather violently unstable toward mass transfer (see the discussion associated with Movie2).
3. Preliminary Results and Progress to Date
Over the past five and a half years, we have been awarded a total of approximately 390,000 SUs at the SDSC and UTAustin to study dynamically evolving astrophysical systems in connection with our NSF and NASAsponsored research into the birth and destruction of binary star systems. Our present allocation of 130,000 SUs on SDSC's SP3 (Blue Horizon) runs through October 31, 2002. Our current request for PACI resources will build directly upon this successful, ongoing NPACI project.
i) Parallel CFD Algorithm
Our initial simulations on the T3E at the SDSC were carried out with a version of our code that had been developed on other parallel platforms and written in highperformancefortran (HPF). Using the available Cray HPF compiler (PGHPF, developed by the Portland Group), we found that relatively few unexpected code modifications were required in order to successfully compile and execute our code on the SDSC system. We discovered, however, that the PGHPF compiler was producing executable code that performed computational instructions on each T3E processor with extraordinarily low efficiency. In 1998, we successfully rewrote our entire FDH algorithm, incorporating explicit message passing instructions via mpi. We found that this new, mpibased code executed four to five times faster than the old HPF version of the code! According to numbers drawn from "pat," a performance monitoring tool that ran on the T3E, we determined that the code's singleprocessor execution rating was 36 MFlops. We also found that the code scaled very well as the number of processors was increased from 4 to 128. (Details of the code's performance on the Cray T3E can be found in several of our previous NRAC proposals, which are available online: 1999, 2000, 2001.)
FDH
Code Timings
on the SDSC SP3
(seconds per integration timestep) 

66^{2}´128  130´66´128  130^{2}´128  130^{2}´256  258´130´256  258^{2}´256  258^{2}´512  
8  0.873  1.858  4.516         
16  0.506  0.956  2.030  3.203       
32  0.339  0.604  1.117  2.464  4.991     
64  0.272  0.430  0.682  1.332  2.675  5.215   
128      0.468  0.831  1.447  2.833  6.377 
256        0.648  1.066  1.725  3.419 
512          0.808  1.154  2.119 
Two years ago, we migrated our mpibased FDH code to the SP3 at SDSC. Table 1 illustrates the present level of performance and degree of scalability of our mpibased FDH code on the SDSC SP3. The first column lists the number of processors used in each test; and the numbers at the top of every other column specifies each test's grid size. Reading the numbers along the diagonal in Table 1, we see that each time we double the problem size and simultaneously double the number of processor elements, our FDH algorithm slows down by 10 to 20%. This scaling gets closer to linear as we take fuller advantage of the machine's available memory.
If we utilize 64 processors on the SP3 to simulate a problem with a grid resolution of 130^{2}´256, it requires 1.332 wallclock seconds per integration time step. Since 50,000 time steps are typically required to evolve the system through one binary orbit at this lattice resolution, each orbit typically requires 1200 SUs of computing resources. This is 3.6 times faster than the code's earlier performance on 64 nodes of the Cray T3E. However, since each processor on the SP3 has access to more memory, we're actually able to handle this size problem on 16, rather than 64 SP3 processors. As Table 1 shows, this same size problem requires 3.2 seconds per integration time step on 16 SP3 processors, that is, about 700 SUs per binary orbit. Hence, by moving our simulations from the T3E to the SP3, we gained a factor of six in efficiency over the T3E. Note also that although only one fourth as many processors are required on the SP3, the total wallclock time is reduced by 33%.
On the SP3 we are now able to simulate a problem that has twice the resolution in all three dimensions, that is, we are able to use a grid of size 258^{2}´512 and resolve structure in both stars and the masstransferring stream that heretofore has not been possible. As mentioned in §2.b, above, over the past year we have begun to utilize this higher resolution in recent simulations. Based on the timings shown in Table 1, such an evolution requires about 6.38 wallclock seconds per integration time step if we utilize 128 SP3 processors. Since the Courantlimited time step drops by roughly a factor of two (because each grid zone is half its original size), one orbital period requires about 100,000 integration steps, that is, about 177 wallclock hours (7.4 days) and about 22,700 SUs.
(ii) A Significantly Improved Poisson Solver
It has been clear to us for quite some time that one of the most computationally demanding components of our FDH code is the algorithm used to calculate boundary values for the gravitational potential that must be fed into the Poisson solver in order to be able to solve for the gravitational potential throughout the volume of the fluid. Historically, we have used an algorithm based on a multipole expansion in terms of spherical harmonics (Tohline 1980; see also Stone & Norman 1992), but spherical harmonics are not a natural basis set when simulations are being conducted on a cylindrical coordinate mesh. Recently we have discovered that the traditional Green's function expansion in cylindrical coordinates which involves an infinite integral over products of Bessel Functions can be written in a much more compact form (Cohl & Tohline 1999) that is extremely wellsuited for implementation on cylindrical coordinate meshes. A parallel implementation of this "compact cylindrical Green's function" technique is now part of our FDH code. It will be used in all of our future simulations because it is both faster and more accurate than the one based on a multipole expansion in terms of spherical harmonics. In addition, its implementation has permitted us to bring the top (and bottom) cylindrical grid boundary down closer to the rotationally flattened mass distribution and reduce the computational lattice size in the vertical direction.
(iii) Establishing a Heterogeneous Computing Environment
As we have described above in §2.c, at the SDSC we have constructed a heterogeneous computing environment through which we are able to routinely create volumerendered images and 3D animation sequences of our evolving fluid configurations. We consider the successful establishment of this environment which smoothly couples highend, SGI visualization tools with our simulation efforts on NPACI machines to have been a significant accomplishment.
Our first two years of research utilizing NPACI facilities were focused on the problem of binary star formation. As detailed in the accompanying attachment entitled, "Publications Resulting from NPACI Research," the results of this work have been summarized in a number of different publications, including one feature article in the AprilJune, 1999 issue of NPACI's EnVision Magazine.
Over the past three years, we have turned our attention to problems associated with masstransfer in binary star systems during the late stages of stellar evolution, as described in this proposal. Over the past year and a half in particular, we have invested considerable effort finetuning the capabilities of our SCF and FDH algorithms so that they will be able to perform the required modeling tasks with a high degree of quantitative accuracy. Motl et al. (2002) describe in detail how well our code presently performs on a variety of benchmark simulations. In one simulation, for example, a detached binary system with mass ratio q = 0.84 was evolved through just over five orbital periods on a grid having 130 × 98 × 256 zones in R, Z, q respectively. This evolution is illustrated by Movie4 [URL: casa.colorado.edu/~motl/research/binary/movies/stable_binary_visc/eq.mov ], which shows how the equatorialplane density distribution of the system changes with time. Other aspects of this system's evolution as modelled by the current version of our FDH code are illustrated in the middle two frames of Fig. 1 (see proposal p. 14). The solid curves in the lefthand frame of Fig. 1 show as a function of time the volumes occupied by material inside three different isodensity layers in the secondary star (the star closest to filling its Roche lobe and shown on the right in the Movie4 icon), with the topmost solid curve tracing the stellar surface; the dashed curve shows as a function of time the volume enclosed by the secondary's selfconsistently determined Roche lobe. All of the curves are essentially straight, horizontal lines, which means that the star (and its companion, which influences the Roche lobe volume) remains in hydrostatic equilibrium to a high degree of accuracy.
MOVIE4 Stable, UnequalMass Binary 



Quicktime (8,451K) 
The results shown in the top two frames of Fig. 1 display information that is directly analogous to the middle two frames, but the displayed results come from an earlier version of our SCF and FDH codes; a version only slightly different from the one used by New & Tohline (1997) to examine the stability of equalmass binaries against tidal instabilities. As the two figures from this evolution illustrate, with this earlier version of our simulation tools, we observed a slow expansion of the secondary star; the orbit itself developed a significant epicyclic amplitude; and after about three orbits, the Roche lobe was encroaching on the surface of the secondary. A comparison of the top two frames of Fig. 1 with the middle two indicates that significant improvements have been made over the past few years in our ability to accurately model the evolution of binary stars. These include: Taking advantage of the messagepassing interface (MPI) libraries on parallel computers so that the number of azimuthal zones could be doubled – from 128 to 256 zones over the full 2p radians; SCF iterations are performed in double precision to obtain better initial states; as mentioned earlier, a small velocity is imparted to the stars as they are introduced into the FDH code in order to compensate for the slight offaxis initial location of the system center of mass; the FDH algorithm was modified to make the integration scheme more properly timecentered; a more accurate scheme has been developed to determine the gravitational potential on the boundary of the computational grid (see § 3.a.ii); and artificial viscosity was included to mediate a variety of weak shocks that appear at the surface of the stars.
These improvements have put us in a position to study masstransfer instabilities in unequalmass binaries with a degree of quantitative accuracy that heretofore has not been possible. We not only have improved on the simulation capabilities of New & Tohline (1997), but as Motl et al. (2002) detail, the epicyclic amplitude of our binary orbits are an order of magnitude smaller and angular momentum conservation is an order of magnitude better than in the recently published equalmass binary simulations of Swesty, Wang, and Calder (2000). As Motl et al. (2002) have concluded, with our current simulation tools we should be able to follow masstransfer events in which the fraction of the donor's mass that is transferred is greater than a few × 10^{5} per orbit for approximately 20 orbits. In particular, we are in a position to accurately determine masstransfer instability boundaries of the type defined elsewhere in this project proposal.
We also have followed the evolution of several semidetached systems in order to begin our investigation of both stable and unstable mass transfer. An uptodate listing of the simulations (with corresponding links to movies of each) can be found at URL casa.colorado.edu/~motl/research/binary/binary.html. For example, as described earlier in connection with Movie2 (see §1.c.ii) we have followed the evolution of a semidetached system in which the donor is the more massive component and therefore the mass transfer rate should rapidly grow with time. It is generally difficult to generate initial conditions for semidetached binaries in which the contact is deep enough so that the resultant mass transfer can be calculated correctly. To generate such initial conditions we introduced driving by artificially removing orbital angular momentum at the level of about 2% per orbit. We expected that once the system is driven into sufficiently deep contact and the mass transfer is growing, we might then be able to remove this artificial driving and continue the evolution without it. These simulations did eventually lead to a merger of the components of the binary but also revealed a subtle problem with the conservation of angular momentum which is being investigated.
4. Justification of Resources Requested
i) Required Platforms
To complete the tasks outlined for this project, we are requesting an allocation of 480,000 SUs on the IBM SP3 at the SDSC (see Table 2 for details). From our past work at the SDSC and elsewhere we feel that this platform is quite suitable for our simulation work for the following reasons. First, the mpi implementation of our FDH code has been developed and thoroughly tested on the SP3. As illustrated by the numbers in Table 1, our FDH code scales fairly well on the SP3 architecture. Second, the SP3 has sufficient capacity to permit us to simulate systems with significantly higher spatial resolution than previously has been possible (or at least practical). Third, we have a small SP3 system at LSU, so numerous tests can be conducted on this local machine in direct concert with production runs at SDSC. Finally, we have developed a heterogeneous computing environment that smoothly integrates the capabilities of the SP3 with our available visualization tools.
ii) Code Requirements and Optimization Issues
Since we have discussed code optimization issues fairly thoroughly in earlier sections of this proposal, we will not comment further on them here. From the information provided above  in §3.a.ii and Table 1, for example  we can estimate how much computing time will be required to make significant progress on this project. Typically, each binary simulation must be followed through 5 or more orbits before the physical outcome of the evolution can be ascertained. Utilizing 128 processors on the SP3, and a grid resolution of 258^{2}´512, each evolution will therefore demand 885 hours, that is, 1.25 months (!) of wallclock time to complete; and this assumes that we have uninterrupted access to 128 processors. This is not very practical. Instead, we propose to utilize 256 (or 512) processors which, according to the numbers shown in Table 1, will demand only 95 wallclock hours = 4 days (or 59 hours = 2.5 days) per orbit. Thus, each 5orbit evolution can be completed in as few as 20 (or 12) days of dedicated execution. In a single year, we can realistically expect to complete only a few evolutions of this magnitude. Quite a few 'trial' evolutions can be completed on lowerresolution grids, however.
Our proposed simulations at a resolution of 258^{2}´512 will require at least 32 GB of RAM, hence, at least 128 processors on the SP3. More specifically, according to the AIX "size" command, if we utilize 128, 256, or 512 processors, our simulations will require 0.25 GB/proc., 0.14 GB/proc., or 0.08 GB/proc., respectively. As just discussed, due to execution time constraints, we will most likely run these highresolution simulations on 256 or 512 processors. Our estimated SU requirements (calculated below) assume we will use 512 processors.
iii) Proposed Production Runs
In §2d of this proposal we described the simulation tasks that must be finished in order to satisfactorily complete this multiyear project. We will postpone Task IV to future years and will focus this year on Tasks I, II and III. As Table 2 indicates, in connection with each task we plan to carry out several runs at low spatial resolution (requiring 64 or fewer compute nodes) and for only a few orbits each in order to determine which initial state is most worthy of an extended, highresolution run. The number of service units required to complete each highresolution run has been estimated assuming we utilize 512 compute nodes. (If we use fewer nodes, somewhat fewer SUs would be required but, as explained earlier, the wallclock time required to complete each evolution becomes prohibitive.)
The Task numbers that have been identified in connection with each proposed simulation in Table 2 follow the outline used in our earlier discussion of the respective problems in §2d. The column labeled N_{models} specifies the number of different initial model configurations that will be examined and the column labeled N_{orbits} lists the number of orbits through which we estimate we will need to follow each binary system in order to complete the specified task. Along with each Task we have specified the (R´Z´q) grid resolution that will be used and the approximate number of SUs that will be required to simulate one complete orbit at that grid resolution (see §3.a.i for details). Finally, the number of SUs that we estimate will be required to complete each proposed simulation task (last column of Table 2) is obtained by multiplying the required number of SUs/orbit by N_{orbits} and by N_{models}. Our Table 2 total is approximately 480,000 SU. This is the amount of computing resources that we are requesting over the coming twelvemonth allocation period.
Proposed Simulations on SDSC's IBM SP3  N_{models}  N_{orbits}  SU  

A.  Tasks I and II  
  Grid resolution 130^{2}´256 on 64nodes ==> 1,200 SU/orbit]  5  3  18,000  
  Grid resolution 258^{2}´512 on 512nodes ==> 30,000 SU/orbit]  2  5  300,000  
B.  Task III  
  Grid resolution 130^{2}´256 on 64nodes ==> 1,200 SU/orbit]  3  3  10,800  
  Grid resolution 258^{2}´512 on 512nodes ==> 30,000 SU/orbit]  1  5  150,000  
Total requirement.  478,800 
v) Special Requirements
Our primary simulation tools (the SCF code and the FDH code) have been developed entirely by our group at LSU and require no special packages other than standard mathematics and FFT utilities on the SP3. In the past, we have relied heavily on visualization software running on the SDSC Vislab facilities, but this will no longer be a requirement because the necessary software (specifically, Maya) now resides on our local machines at LSU.
5. Local Computing Environment
Through LSU's Concurrent Computing Laboratory for Materials Simulation we have routine access to a dualprocessor SGI Octane and an SGI Onyx that drives an ImmersaDesk system. On this SGI equipment we have installed Maya (TM), the AliasWavefront software that will be used to routinely render 3D images and generate movie sequences from our dynamical simulations, as explained in §2.c. LSU operates a 32processor IBM SP3 that is dedicated to research computing. On this machine we are able to run dynamical evolutions of modest size and length. We expect to continue to utilize this machine to complete a variety of short test runs, with the idea that full production runs will be carried out on NPACI facilities.
This past year, the Louisiana legislature funded a new, statewide information technology initiative through which LSU received $7M in new, recurring funding. The PI (Tohline) has been serving as interim director of the Center for Applied Information Technology and Learning (LSU CAPITAL) that has been established at LSU through this initiative. A portion of this year's funding has been used to purchase a 1024processor, Beowulfclass supercomputer: 1.8 GHz Intel P4 Xeon processors; myrinet network; 1 GByte RAM per processor (see www.lsu.edu/ lsucapital/ beowulf.html for details). The hardware for this system arrived in May, 2002 and, within the past week (1^{st} week of July), the system has been installed and integrated successfully. Early linpack benchmarks indicate that the system as a whole clocks at 2 TFlops. In preparation for the arrival of this machine, two students from our research group (Shangli Ou and Mario D'Souza) attended a workshop at NCSA in October, 2001 that was designed to introduce users to the Pentium IIIbased Beowulf cluster that had just been installed at NCSA. During this workshop our gravitational CFD code was successfully ported to the NCSA cluster. This has positioned us well to take advantage of the new cluster that is being installed at LSU. Early, short test runs indicate that, on a given number of physical processors (16, for example) our code executes at least as fast on LSU's P4 Xeonbased system as it does on the SP3 at SDSC. We expect, therefore, that the system at LSU will ultimately be a tremendous resource for our group. Since the hardware system has not yet been thoroughly tested, however, we are not sure how robust it will be and, in particular, we are not sure how well our code will scale to large numbers of nodes or will perform on it in a production mode. It is therefore important that we continue to secure access to the SP3 at the SDSC to support our research program.
6. Other Supercomputer Support
Over the past 2 years, the PI has collaborated with Thorne and Lindblom (Caltech) to model the nonlinear growth of an unstable "rmode" in isolated, rotating neutron stars, and to simulate the tidal disruption of a neutron star that is in orbit about a more massive, Kerr black hole. Through this collaboration, the PI has gained access to the HP Exemplar at CACR. However, the intent of this allocation (to Thorne) has been to provide a mechanism by which students in Thorne's group can begin to develop the numerical tools necessary to study these general relativistic processes. The request being made here for NSF PACI resources is intended to support our own primary investigation into masstransferring, nonrelativistic binary systems; none of the simulations being proposed herein will be performed on CACR machines. We should note that our initial application of the PI's simulation tools (as described here in §2) to the "rmode" problem have proven to be very successful (Lindblom, Tohline, & Vallisneri 2001, 2002). The results of this work (and images produced with our visualization tools) were featured in the AprilJune 2001 issue of NPACI & SDSC's EnVision Magazine.
7. Project Team Qualifications
A major component of the PI's research work over the past twenty years has involved the largescale numerical simulation of multidimensional gas flows in astrophysical systems utilizing a variety of different high performance computing platforms. In the early 1980s, the PI held a postdoctoral research position in group T6 at Los Alamos National Laboratories at which time he broadened his exposure to stateoftheart CFD techniques and learned to develop efficient algorithms for vector supercomputers. From the mid1980s through the early 1990s, Tohline was PI on a large account at the Cornell Theory Center during which time he and his students experimented with the parallel implementation of CFD algorithms on the IBM 3090 and various accompanying FPS Array Processors. When an 8Knode MasPar MP1 system arrived at LSU in 1991, the PI abandoned the CTC facilities because his group was having considerable success utilizing the MasPar system for its largest simulation efforts. As has been detailed elsewhere in this proposal, in recent years the PI and his students have gained experience developing gravitational CFD algorithms for execution on a variety of parallel platforms including MasPar's MP2, the CM5, the Cray T3E, and the IBM SP2 & SP3. The group also has developed and implemented a heterogeneous computing environment (at LSU and at the SDSC) that tightly couples its parallel simulation efforts to sophisticated visualization tools.
CoPI J. Frank has broad experience in studies of the astrophysics of interacting binary systems, particularly when such interactions involve fluid flow and accretion events. He is, for example, lead author of a widely referenced technical CUP publication entitled, "Accretion Power in Astrophysics." His expertise within this project helps ensure that the results of the proposed complex multidimensional and nonlinear simulations are physically correct and quantitatively meaningful when placed in the context of the extensive literature on this subject. Frank also is lead investigator on the NASAfunded, Astrophysics Theory Program project in partial support of which this request for NPAC resources is being made. (Tohline is CoPI on this NASAfunded project.)
Shangli Ou and Mario D'Souza are graduate students enrolled in the Ph.D. physics program at LSU. They both will be heavily involved in the simulation work outlined in this proposal. Both students are entering the fourth year of their graduate studies and will be deciding the focus of their dissertation research based in large part on our success with this year's simulations of masstransferring binaries, as described in this proposal.