Dynamical Modeling of Star Formation in Galaxies

A REQUEST FOR NPACI RESOURCES
by

Joel E. Tohline
Department of Physics & Astronomy
Louisiana State University
tohline@rouge.phys.lsu.edu

This entire document, along with several illustrative animation sequences is available on-line at the address http://www.phys.lsu.edu/faculty/tohline/NPACI98/


Proposal

1. Summary of Proposed Research, Including Objectives

Recent observational investigations of the frequency of occurrence of pre-main-sequence (PMS) binary stars have reinforced earlier suspicions (Abt 1983; Duquennoy & Mayor 1991) that "binary formation is the primary branch of the star-formation process" (Mathieu 1994). That is to say, stars tend to form in pairs rather than as single objects. Every indication is that, even in the PMS phase of stellar evolution, binaries with total masses less than or equal to three solar masses exhibit a wide range of orbital periods (days to at least 105 yr) with a median period of a few hundred years corresponding to a median separation on the order of 50 AU (astronomical units). Why and how do stars preferentially form in pairs? This is the fundamental question that motivates the proposed research activities described here.

As Bodenheimer, Ruzmaikina, and Mathieu (1993) have reviewed, a number of different theoretical scenarios have been proposed to explain the preponderance of binary stars. The ones given most attention over the past two decades are: Capture; direct fragmentation; disk fragmentation; and fission. Because either three-body encounters or close encounters that increase the importance of tides are required to make the capture process work, in normal star formation environments capture alone does not appear to provide a satisfactory explanation. (Note, however, that the presence of disks may enhance capture rates in dense clusters as pointed out by Larson 1990 and Clarke & Pringle 1991.)

As Shu, Adams and Lizano (1987) have summarized, it seems likely that binary systems with orbital periods between a few hundred and a few million years result from the direct fragmentation of a protostellar cloud early in its phase of collapse (at cloud densities n ~ 103 - 1010 cm- 3). Hydrodynamic simulations performed, for example, by Bodenheimer, Tohline & Black (1980), Boss (1980, 1986), Bodenheimer & Boss (1983), Miyama, Hayashi, & Narita (1984), Monaghan & Lattanzio (1991), and Miyama (1989, 1992) illustrate how this might occur. But at higher densities, clouds are unable to cool efficiently upon contraction and, consequently, direct fragmentation becomes problematical. Because higher mean densities are directly associated with systems having shorter dynamical times, one is led to investigate mechanisms other than direct cloud fragmentation for forming binary systems with orbital periods less than a few hundred years.

Through his ongoing NSF-funded project (AST-9528424), the PI is carrying out a variety of different numerical simulation projects in an effort to understand whether disk fragmentation and/or the fission of rapidly rotating protostars produce short-period binary star systems under physical conditions that are expected to exist naturally in protostellar environments. For completeness, we review briefly here our recent work on disk fragmentation. Then our discussion will turn to the fission problem, which is the focus of the research proposed herein.

a. Fragmentation of Protostellar Disks

Because gas clouds have difficulty getting rid of excess angular momentum during a phase of dynamical collapse, there is reason to believe that all stars form with some sort of (accretion) disk surrounding them. Depending on the rate at which mass is lost to the central protostar via accretion and the rate at which mass is gained via infall from the surrounding interstellar cloud, a disk can grow to be substantial in size (e.g., Terebey, Shu & Cassen 1984; Cassen, Shu & Terebey 1985). From an observational perspective, disks certainly appear to be ubiquitous in the context of star formation. Not only did the solar system sport a disk at a young age, but it is now clear that most classical T Tauri stars are surrounded by disks (Rucinski 1985; Basri & Bertout 1993; Beckwith & Sargent 1993) and disks even appear to frequently accompany PMS binary systems (Mathieu 1994). Accumulating observational evidence suggests, in fact, that protostellar disks can contain a significant amount of mass (Adams et al. 1990; Mathieu et al. 1995). If a disk becomes sufficiently massive, compared to the central object that it surrounds, a gravitational instability in the system may cause the disk to accumulate into an off-axis, binary companion of the central object or to break into two or more pieces. In the latter case, one or more components would have to be ejected from the system before an isolated binary would result.

We recently have completed an exhaustive study of the onset of nonaxisymmetric instabilities in geometrically thick, self-gravitating protostellar disks (Woodward, Tohline & Hachisu 1994; hereafter WTH). Animation sequences illustrating two of the simulations from this study are available on-line (see the http address at the top of this document) in formats indicated in the captions to the two "Movie Icons" shown here. Movie-1 shows the nonlinear development of the "Papaloizou-Pringle" instability in a massless disk; through simulations of this type we have been able to demonstrate that our nonlinear simulations produce nonaxisymmetric growth rates that match the predictions of linear theory (Papaloizou and Pringle 1984; Frank and Robertson 1988; Kojima 1986) to very high precision. Movie-2 shows the development of an "m = 2" mode instability in a fully self-gravitating torus.

MOVIE-1
"Massless" Torus
MOVIE-2
Self-Gravitating Torus

Quicktime (834K)
24-bit mpeg (515K)
mpeg (190K)

Upon examining systems with a wide range of disk-to-central-object mass ratios, we have concluded that the mode that is most likely to go unstable in a disk of non- negligible mass is an "m = 1" mode very similar -- if not identical -- to the "eccentric mode" first discussed by Adams, Ruden & Shu (1989; hereafter ARS). As this long-wavelength, global mode amplifies in the disk, it resonates with a particular macroscopic motion of the central protostellar object, driving the central object into an orbit of increasing radius about the center of mass of the combined star-disk system. Apparently a variety of physical mechanisms in protostellar disks are likely to promote the growth of a global, m = 1 spiral mode if the central protostar's motion is taken properly into account (Lin & Papaloizou 1989; Shu et al. 1990; Papaloizou & Savonije 1991; Heemskerk, Papaloizou & Savonije 1992). ARS have suggested that the nonlinear growth of the "eccentric mode" is likely to lead in a very natural way to the formation of a binary star system.

To the extent that we have been able to follow the nonlinear growth of this mode in our own thick-disk simulations, we support the ARS conjecture. We have observed the development of a single, coherent, self-gravitating clump of material in the disk and, in concert with the breaking of the axisymmetry of the disk, we have watched the central star (treated as an idealized point mass in our simulations) spiral out away from the center-of-mass of the system. Unfortunately, we have not yet been able to determine whether a binary star system is created as a result of this instability. In every case, we have been forced to terminate our simulations before the nonlinear development of the disk's m = 1 mode has been completed because the spiraling trajectory of the central point mass invariably brings the central object in contact with the inner edge of the disk. Using an SPH simulation technique, Adams & Benz (1992) have provided tantalizing, but equally unconvincing evidence that this instability leads to the formation of a binary star system. As part of our ongoing NSF-sponsored research, we are conducting a series of more complete simulations that are designed to determine whether or not, in protostellar disk systems, the fully nonlinear development of the eccentric instability leads to the formation of a binary star system, but it is not for this work that we are requesting an allocation of NPACI resources.

b. Fission of Rapidly Rotating Protostars

Before discussing either the classical fission hypothesis for binary star formation, or what we consider to be a more reasonable, modified version of that hypothesis in view of recent simulations, it is useful to review what is known about allowable equilibrium configurations for rapidly rotating protostars. Consider, first, all rotating objects that are precisely ellipsoidal in shape, having principal axes a, b, and c. Furthermore, assign these labels such that a always refers to the object's longest axis and c is the axis about which the object rotates. Figure 1, with coordinate axes 0 b/a 1 and 0 c/a 1, then defines a two-dimensional parameter space that contains all such ellipsoidal figures. For example, the top-right corner of the box in Figure 1 (point S) defines a sphere ( b/a = 1; c/a = 1 ); the right-hand edge of the box ( b/a = 1; 0 c/a 1 ) identifies all possible (axisymmetric) oblate spheroids; and a diagonal line connecting point S with the bottom-left corner of the box identifies all (axisymmetric) prolate spheroids ( c/b = 1 ).

As Chandrasekhar (1969) has reviewed in detail (see also Durisen and Tohline 1985), if protostellar objects are assumed to be self-gravitating,
Figure 1
incompressible fluids with simple rotation (i.e., uniform rotation or uniform vorticity), one can show analytically that their allowed (and, most often, preferred) equilibrium configurations are defined precisely by ellipsoidal figures. The parameter space defined by Figure 1 therefore contains not only all geometrically allowed ellipsoidal figures but also a very wide range of physically allowed equilibrium structures of rotating protostellar objects.

Physically meaningful evolutionary sequences also can be diagrammed within the parameter space defined by Figure 1. For example, as an initially slowly rotating (nearly spherical) protostellar gas cloud gravitationally contracts toward a mean density and size that is typical of a hydrogen-burning (main-sequence) star, it will become more and more rotationally flattened if it conserves its mass and angular momentum. The sequence of oblate spheroidal models along the right-hand edge of the box in Figure 1 (between points S and E, for example) describes well this phase of the protostar's early evolution. The curve in Figure 1 that connects the right-hand edge (at point E) to the lower-left corner of the box illustrates another possible evolutionary path (for a protostar that is already substantially flattened due to rotation) as it traces out a sequence of equilibrium configurations having constant vorticity. This sequence of genuinely triaxial ellipsoids is said to "bifurcate" from the axisymmetric sequence at point E. The comprehensive, 19th-century analytical work of Jacobi, Dedekind, and particularly Riemann (again, see Chandrasekhar 1969 for a review) has demonstrated that a variety of constant vorticity sequences bifurcate from the axisymmetric sequence at different locations. For our purposes here, the curve drawn in Figure 1 will serve to illustrate any one of a number of these different sequences. If, at the point of bifurcation, the ellipsoidal sequence defines a configuration that has the same mass and angular momentum, but a lower total energy than the adjacent configuration on the axisymmetric sequence, then after reaching point E the energetically preferred evolutionary path will be one that takes the contracting protostar away from the axisymmetric sequence.

A careful examination of equilibrium models along sequences of progressively more elongated ellipsoids (such as the constant vorticity sequence illustrated in Figure 1) shows that these sequences exhibit additional points of bifurcation. Each point of bifurcation identifies the first model belonging to a new sequence of objects that have "higher order" surface distortions. For example, analytical work on incompressible objects (Chandrasekhar 1969) clearly has identified bifurcation points that lead to equilibrium sequences of pear-shaped and dumbbell-shaped configurations. Point D in Figure 1 shows where bifurcation to a dumbbell sequence occurs along one illustrative, constant-vorticity sequence of ellipsoids. Figure 2 (taken from the review by Durisen and Tohline 1985) illustrates furthermore how the surface distortion of equilibrium figures becomes more and more extreme as one moves along the dumbbell sequence away from the initial point of bifurcation: There is a smooth transition from the initially ellipsoidal figure, to figures with a pronounced dumbbell shape, then to fully detached binary star structures.

(i) Classical Fission Hypothesis

It is easy to see how a "fission hypothesis for binary star formation" emerges from this classical work on ellipsoidal figures of equilibrium. Imagine that as a rotating protostar undergoes slow contraction, it evolves from point S, through point E, to point D along the equilibrium sequences illustrated in Figure 1. Then, from point D it contracts along the dumbbell sequence through each of the geometrical configurations (a --> d) illustrated in Figure 2 until a well-separated binary star system has been formed. This is certainly an aesthetically pleasing picture of how (predominantly equal-mass) binary star systems might form. If the outlined evolutionary path proves to be the energetically preferred one under a variety of different protostellar cloud conditions, the fission hypothesis would also explain naturally why the majority of stars form as binary systems.

In reality, however, the picture is not this clear. Even when this classical fission hypothesis was discussed by mathematicians and astronomers a century ago, several problems were recognized:

(ii) Recent Problems Promoting the Fission Hypothesis

Because of its significantly higher level of mathematical complexity, the extension of this important study to more realistic, compressible configurations -- as well as to systems with nontrivial internal motions -- was not possible before the advent of modern computers. (Indeed, even the construction of incompressible figures with significantly nonellipsoidal geometries, such as models along the dumbbell/binary sequence displayed in Figure 2, was not possible until very recently; see Hachisu and Eriguchi 1984.) Even with the availability of high-performance computers, however, progress on this problem has been pathetically slow. More importantly, the progress that has been made toward understanding the equilibrium and stability properties of rotating, compressible structures has not been particularly supportive of the fission hypothesis. Consider the following:

MOVIE-3
Spiral-Mode Instability
24-bit Quicktime
(2,554K)
Figure 3 (Caption)

The top six frames (read, in sequence, from left to right) have been taken directly from "Movie-3," an animation sequence illustrating the nonlinear development of the two-armed, spiral-mode instability in a rapidly rotating protostar. More precisely, the initial model (1st frame) is an axisymmetric, n = 3/2 polytrope with a rotation profile given by an n' = 0 sequence (Durisen et al. 1986), and an initial ratio of rotational to gravitational energy T/|W| = 0.30. The separation in time between the first frame and the second is 7.66 cirps (central initial rotation periods); all other frame pairs are separated in time by approximately 2.75 cirps. Frame six shows the "final" steady-state configuration, 18.7 cirps into the evolution. This 47,500 time-step simulation was performed with a spatial resolution of 1283 primarily on the Cray T3E_600 at the SDSC, although a portion of it was on the Cray T3E_900 at the NAVOCEANO's MSRC. The last two frames illustrate how, in the future, tracer particles may be placed in the fluid in order to permit visualization of internal, differential motions.

With regard to this last point, the results appear to be particularly significant. Simulations conducted with several independently developed gravitational hydrodynamics codes (Durisen, et al. 1986; Houser, Centrella, and Smith 1994) have each shown that the dynamical instability that arises in rapidly rotating, axisymmetric models of compressible gas clouds does not lead to the formation of a binary star system, as envisioned classically. Hence, as has been reviewed by Bodenheimer, Ruzmaikina, and Mathieu (1993), among the various ideas that have been proposed to explain the preponderance of binary stars, the "binary fission hypothesis" currently is out of favor on theoretical grounds.

c. Insight from Recent Numerical Simulations

Evidence that binary star formation may yet result from the "fission" of a rapidly rotating, protostellar gas cloud comes from several numerical simulation projects that have been conducted by the PI and his students in the context of his ongoing NSF-sponsored research.

(i) The Existence of a Robust, Ellipsoidal Structure with Nontrivial Internal Motions

As stated (and illustrated) above, through a fairly violent process, the two-armed spiral-mode instability is able to transform an initially isolated, axisymmetric gas cloud into an entirely new steady-state configuration with a much different radial distribution of angular momentum. As described above, at the end of its dynamical phase of evolution the system contains a new centrally condensed object that is surrounded by a disk of high specific angular momentum material. It is also clear from the simulation results presented here, however, (see Figure 3) that the central object is not axisymmetric! Instead, as was emphasized a number of years ago by Williams and Tohline (1988) and is most easily appreciated by watching the animation sequence labeled "Movie-3", the final object has a decidely ellipsoidal (even slightly dumbbell) shape. An analysis of the simulation data also clearly shows that this nonaxisymmetric, steady-state configuration contains a nontrivial internal flow. That is, although the configuration as a whole appears to be an ellipsoidal-shaped figure that is spinning coherently as a solid body, in reality the steady-state configuration also contains strong differential fluid motions internally.

(ii) Sequences of Equilibrium, Nonaxisymmetric Disks with Nontrivial Internal Motions

As mentioned above, approximately a decade ago Hachisu (1986b) developed a self-consistent-field technique that permitted him to construct nonaxisymmetric, equilibrium configurations of self-gravitating objects with nonellipsoidal shapes and nontrivial internal motions if the structures were assumed to be incompressible fluids. Working with the PI, LSU graduate student Saied Andalib recently has developed a technique by which nonaxisymmetric, equilibrium models of self-gravitating, compressible configurations can be constructed with nontrivial internal motions (Tohline and Andalib 1997). In order to accomplish this task, Andalib has adopted the following two simplifications: (a) The models are two-dimensional, in the sense that they are either infinite in extent or infinitesimally thin in the vertical direction; and (b) each system is assumed to have uniform vortensity, where vortensity is defined at each point in the fluid to be the ratio of vorticity to density (see, for example, Lin and Papaloizou 1989).

Figure 4
a b
c d
Figure 4 displays the velocity flow-field (as viewed from a rotating reference frame) from four separate models of infinitesimally thin disks that Andalib has constructed along an equilibrium sequence analogous to the purely ellipsoidal sequence that has been displayed schematically in Figure 1 for incompressible figures with uniform vorticity. It is clear from Figure 4 that Andalib's compressible disk models with uniform vortensity are not simply elliptical in shape. Instead, frames a --> d from Figure 4 exhibit shapes that qualitatively resemble the equatorial cross-sections of the 3D figures that are displayed in frames a --> d, respectively, of Figure 2. That is to say, Andalib has constructed equilibrium models along a dumbbell-binary sequence that are compressible and have nontrivial internal flows. So, although we have yet to develop a numerical technique that will permit the construction of compressible 3D structures along a dumbbell-binary sequence, Andalib's results strongly suggest that such equilibrium structures do exist; but to maintain a steady-state structure, they almost certainly will contain nontrivial internal flows.

In constructing equilibrium sequences of this type, Andalib has discovered that he is able to construct compressible disks with a wide range of equatorial axis ratios b/a, except models having b/a close to 1. That is to say, unlike the incompressible models with simple rotation (see Figure 1), his nonaxisymmetric sequences cannot be connected to the axisymmetric sequence. This result has been illustrated schematically in Figure 5; a gap exists between point E on the axisymmetric sequence and the beginning of the sequence of nonaxisymmetric equilibrium objects. We conclude from this work that it is possible for rotating protostellar gas clouds to exist in equilibrium as nonaxisymmetric objects, but it is unlikely that they can make the transition from an axisymmetric structure to a nonaxisymmetric one gracefully.

Figure 5
Tieing this new work on nonaxisymmetric equilibrium structures in with the studies discussed above in connection with the "two-armed, spiral-mode instability" that arises in rapidly rotating axisymmetric structures we suspect the following:

With the NPACI resources being requested here, we propose to investigate thoroughly both of these conjectures.

(iii) The Merger (or Lack Thereof) of Close, Equal-Mass Binaries

Because an objective of the PI's ongoing research efforts is to understand how binary star systems form, it is important to demonstrate that the gravitational CFD simulation techniques being employed by the PI are capable of spatially resolving and accurately representing the equilibrium structure of binary stars. Recently, the PI has published the results of an extensive study of the stability of close, equal-mass binary stars systems that exhibit a variety of different degrees of gas compressibility (New and Tohline 1997). In essence, this was a numerical investigation designed to determine whether or not, in realistic astrophysical situations, evolution will ever occur backwards in Figure 2 (i.e., from frame d to frame a), resulting in the merger (or destruction) of a binary star system. In connection with this proposal, we stress two key results that have emerged from this published investigation:

The New and Tohline (1997) study has demonstrated that the PI's gravitational CFD techniques are capable of accurately modeling protostellar binary star systems with very small orbital separations (in terms of the protostellar radius). In demonstrating that binary systems with soft equations of state are stable against merger (see the discussion in connection with Movie-4), this study may also help explain why the binary state ultimately is the preferred physical state for protostellar gas clouds. This lends extra support to our suspicion, stated above, that upon further contraction the ellipsoidal-shaped object depicted in frame 6 of Figure 3 may evolve smoothly along a dumbbell sequence to a binary configuration as depicted schematically in Figure 2.

MOVIE-4
Stable Binary
MOVIE-5
Binary Merger

Quicktime (1,380K)
mpeg (701K)

With the NPACI resources being requested here, we propose to explore more completely the relationship between this binary merger problem and the fission problem.

d. Proposed Investigation

We propose to perform the following well-defined simulation tasks in order to examine to what extent the ideas outlined above contribute to a more complete understanding of the binary star formation process in rotating protostellar gas clouds. If our educated speculations based on recent simulations prove to be correct, we will be in a position to demonstrate conclusively why and how stars preferentially form in pairs.

Proposed simulations utilizing NPACI resources:

A. Extend the "Movie-3" evolution, testing the stability of the final ellipsoidal configuration.

  1. First, we propose to isolate the central object from its surrounding disk (not shown in Figure 3), and place it in a coordinate frame that is rotating with the measured pattern speed of the ellipsoidal configuration. In this rotating frame, the model will be exposed to a minimum amount of numerical viscosity, permitting us to examine carefully its dynamical stability and the detailed properties of its internal flow. Properties of the flow will be compared in detail with Andalib's steady-state disk models having uniform vortensity. We expect this phase of our study to require that we follow the model evolution through 1.5 complete pattern rotations, which is equivalent to 6 cirps (central initial rotation periods).

  2. In order to take maximum advantage of computational grid resolution, our simulation of the two-armed, spiral-mode instability (shown above in Movie-3 and Figure 3) was performed on a 1283 grid that imposed "p-symmetry." That is, in the azimuthal coordinate direction, a periodic boundary condition was imposed at both p and 2p radians (rather than just at 2p radians), providing effectively 256 zones in the azimuthal direction, but forcing the configuration to maintain an azimuthally two-fold symmetry. This assumption is permissible during the earliest phases of the evolution because linear stability analysis has indicated that the only unstable mode in the initially axisymmetric model is one with an ellipsoidal or two-armed spiral character. After the centrally condensed, ellipsoidal-shaped object has taken final form, however, this geometrical constraint may not be physically justifiable. As we pointed out above, for example, along the sequence of incompressible ellipsoidal configurations illustrated schematically in Figure 1, there exists a bifurcation point to a pear-mode sequence between points E and D in the figure. In order to test whether our "final" compressible ellipsoidal-shaped object is susceptible to such a pear-mode distortion, it will be necessary for us to remove the p-symmetry constraint. This will require a doubling of our grid size (in the azimuthal direction only).

  3. Next, we propose to "cool" the central object slowly to examine whether it will naturally evolve along a dumbbell-binary sequence, as suggested above. To accomplish this task, at each time step we will impose a very small reduction in the gas pressure uniformly across all fluid elements. The reduction will be at a rate such that, over a period of 15 cirps, a spherically symmetric object of the same size would be expected to contract to half its initial radius. This technique should permit the object to contract slowly conserving its total mass and angular momentum, and evolve quasistatically along an equilibrium dumbbell-binary sequence, if such a contraction sequence is available to it energetically.

B. Examine the stability against merger of unequal-mass binaries

  1. The recent study by New and Tohline (1997) has indicated that common-envelope binary systems with a soft equation of state are stable against merger. As pointed out above, this may help explain why the binary state ultimately is the preferred physical state for protostellar gas clouds. However, New and Tohline only examined equal-mass binary systems. The study of unequal-mass systems quickly becomes more complicated, not only because an additional parameter is introduced into the choice of models to be studied -- the initial system mass ratio q -- but also because one object can easily fill its "Roche lobe" before the other and a situation quickly arises in which asymmetric mass exchange occurs. In systems with uniform specific entropy, it is expected that the less massive component will rapidly transfer most of its mass to the more massive component (or into a large common-envelope) thereby driving the system away from (rather than toward) the value q = 1. If binary stars are formed via the modified fission scenario outlined above, then they will evolve through an early common-envelope (dumbbell-shaped configuration) phase and would be equally susceptible to a mass-transfer instability that drives the mass ratio away from unity. We therefore propose to extend the New and Tohline investigation to binary configurations that have values of q close to, but not equal to 1, in order to determine whether the observed stability of close binary systems is sensitive to the initial system mass ratio. Evolutions covering as few as 4 initial orbital periods should be sufficient to answer the question of dynamical stability.

2. Computational Methodology/Algorithms

a. Current Implementation

Our primary dynamical simulation tool is an algorithm designed to perform multidimensional computational fluid dynamic (CFD) simulations in astrophysical settings. It is a High-Performance-Fortran (HPF) algorithm that is a finite-difference representation of the multidimensional equations governing the dynamics of inviscid, self-gravitating, compressible gas flows. More specifically, the employed finite-difference algorithm is based on the second-order accurate, van Leer monotonic interpolation scheme described by Stone and Norman (1992), but extended to three dimensions on a uniform, cylindrical coordinate mesh. The fluid equations are advanced in time via an explicit integration scheme (often referred to as "direct numerical simulation") so an upper limit to the size of each integration time step is set by the familiar Courant condition. At each time step, the Poisson equation is solved in order to determine, in a self-consistent fashion, how the fluid is to be accelerated in response to the fluid's own self-gravity. Complete descriptions of our algorithm to solve the coupled CFD + Poisson equations can be found in WTH and in New (1996). In many of our recent simulations, the Poisson equation has been solved implicitly using a combined Fourier transformation and iterative ADI (alternating- direction, implicit) scheme. However, in the future we will be replacing the iterative ADI component of this algorithm with a fully direct solution as described by Cohl, Sun and Tohline (1997).

As discussed in 1 of this proposal, our ongoing NSF-funded research is aimed at understanding to what extent various rapidly rotating, self-gravitating structures -- such as protostars, accretion disks, and close binary stars -- are dynamically stable. Utilizing the gravitational CFD code described above, our broad objective is to examine the development of nonaxisymmetric structures that arise spontaneously as "normal modes" in such systems. It is important in such studies that we have the capability to construct accurate equilibrium structures as initial input into each dynamical simulation so that the simulations are not dominated by dynamical transients that depend heavily on initial conditions. Drawing on the work of Hachisu (1986a,b), we have successfully developed a robust self-consistent-field code (hereafter, HSCF code) that readily generates nonaxisymemtric as well as axisymmetric equilibrium structures as input to our gravitational CFD code. We have made an axisymmetric version of this code available to others via the world wide web ( http://www.phys.lsu.edu/ astro/ H_Book.current/ Applications/ Structure/ HSCF_Code/ HSCF.outline.shtml). To date, most of our investigations have focused on structures that have relatively simple angular momentum distributions, but as discussed above in connection with the fission problem, we are extending the capabilities of the HSCF code to include the generation of nonaxisymmetric equilibrium structures with nontrivial internal motions.

When performing multidimensional CFD simulations, it is always important to develop an efficient technique by which the numerical results can be readily analyzed for physical content. Because three-dimensional CFD simulations in particular generate enormous amounts of data, and the computational efficiency of modern high-performance-computers is quickly degraded by any significant I/O demands, generally it is preferable to analyze the data "on the fly" rather than trying to save the data to disk for future, post-processing. As described more fully below, we have constructed a heterogeneous computing environment through which we are able to routinely create volume-rendered images and 3D animation sequences of our evolving protostellar cloud configurations. (All of the "Movie" sequences referred to in 1 of this proposal have been produced in this fashion.) Relatively small amounts of data are regularly passed across the network from the parallel machine that is performing the CFD simulation to a machine that is more suited to visualization tasks; images that will become part of an extended animation sequence are created from this data; then the 3D data sets are immediately destroyed in order to minimize the impact on disk storage. At the SDSC, we have utilized the scriptable attributes of Alias/Wavefront (TM) software on middle- and high-end SGI systems in order to perform these critical visualization tasks. (The images displayed in Fig. 3 as well as the images associated with the animation sequences labeled Movie-1 and Movie-3, above, are products of this visualization tool.) This visualization tool and the heterogeneous computing environment in which it functions will continue to provide a critical component of our ongoing gravitational CFD simulations at the SDSC.

b. Planned Developments

There is always room for improvement in connection with the numerical tools that are used to simulate and visualize the structure and evolution of complex physical systems. Recently we have invested a considerable amount of time developing two new tools that we expect to integrate into our program during the coming year. First, it is clear that one of the most computationally demanding components of our current gravitational CFD code is the algorithm used to calculate boundary values for the gravitational potential that must be fed into the Poisson solver in order to be able to solve for the gravitational potential throughout the volume of the fluid. Historically, we have used an algorithm based on a multipole expansion in terms of spherical harmonics (Tohline 1980). To improve the execution speed of this component of our code, we are developing an algorithm based on cylindrical Bessel functions because Bessel functions are much more naturally suited to the (cylindrical) geometry of our chosen computational grid.

Second, although we have been very pleased with the volume-rendering tool that we have developed using Alias/Wavefront software on the SGI platforms at the SDSC, this tool presently provides no mechanism for visualizing the differential fluid motions within the evolving protostellar cloud structure. Working with the visualization team at the NAVOCEANO MSRC (see 6, below), we are developing an algorithm through which Wavefront can follow the motion of a large number of "tracer particles" in the fluid at the same time as it renders various nested "surfaces" of the entire cloud. The last two frames shown in Figure 3 provide a simple example of how 10 tracer particles initially radially aligned with one another in the fluid can be followed to show, a short time later, a "spiral" pattern created by the underlying fluid's differential motion.

3. Local Computing Environment

In 1991, through an equipment grant from the Louisiana State Board of Regents, the LSU Department of Physics and Astronomy acquired an 8,192-node, SIMD-architecture MasPar MP-1 to support the development of parallel algorithms for a variety of computational physics and chemistry projects. The architectural topology of the MP-1 is that of a 2D torus characterized by a 128 x 64, two-dimensional grid of MasPar-produced processing elements. The MP-1 also incorporates a two-tier processor communication strategy which includes rapid nearest neighbor communication via X-Net communication and efficient global communication performance using a tree- nested global router. Each processing element contains 64 Kbytes of local dynamic RAM, that is, the total memory on LSU's MP-1 system is 0.5 GBytes. As described in 4, below, we have taken full advantage of this parallel hardware platform to perform a variety of astrophysical simulations at moderate spatial resolutions. Our computing strategy is to continue to use this machine to conduct "preliminary" calculations over a broad range of the available physical parameter space, then to study in detail the dynamical evolution of selected systems at higher spatial resolution on the Cray T3E at the SDSC.

The LSU Department of Physics and Astronomy also supports a large cluster of (over 45) Sun Microsystems workstations. Sparc 10 and Sparc 2 workstations that are integrated into this cluster, plus a NeXT Dimension workstation and a recently acquired SGI Octane provide the PI and his students with graphical unix interfaces into LSU's campuswide ethernet system and the internet. LSU is currently working through SURA (the Southeastern University Research Association) to upgrade its off-campus internet connections from a pair of T1 lines to a T3 bandwidth and Internet-2 status.

4. Preliminary Progress

a. On LSU Facilities

Taking advantage of our acquisition of an 8K-node MasPar at LSU (see 3, above), in December 1991, we rewrote our CFD code in mpf (MasPar Fortran), which is a language patterned after Fortran 90 but includes certain extensions along the lines of high-performance Fortran (HPF). Utilizing the MP-1, we also have gained valuable experience in the development of parallel algorithms to efficiently solve the Poisson equation in curvilinear coordinate systems (Cohl, Sun & Tohline 1997). Numerous gravitational CFD simulations have been performed successfully over the past six years on the LSU MP-1 system and on an MP-2 platform at the Scaleable Computing Laboratory of the Laboratory at Iowa State University. Our algorithm has achieved a very high degree of parallelism on the MasPar. Because our algorithm involves primarily nearest-neighbor communications, we have been able to take particular advantage of the hardware's X-Net communications network. As described in Cohl, Sun and Tohline (1997), our global solution of the Poisson equation incorporates a matrix transpose technique that utilizes both the X-Net and global router networks efficiently (Flanders 1982).

It was at LSU as well that we initially developed a heterogeneous computing environment through which our gravitational CFD data can be visualized "on-the-fly." Using communication strategies, such as unix sockets and PVM (Parallel Virtual Machine), at LSU the computational work load is distributed such that workstations perform the rendering of our data simultaneously with any ongoing CFD simulation on the MasPar. For example, the animation sequences labeled Movie-2, Movie-4, and Movie-5, above, were generated at LSU using the IDL software package running on Sun Microsystems workstations.

b. On Parallel Platforms Outside LSU

Over the past four years, our parallel CFD algorithm that was developed in mpf on the MasPar has been ported to the Thinking Machines CM-5 at NCSA and to the IBM SP2 at the Cornell Theory Center. We also have been able to implement an f77 serial version on the Cray Y/MP at NCSA and an F90 version on the Cray C90 at SDSC. Table 1 in Attachment C2 documents the performance of our CFD code on these five different machine architectures. To date, performance results on the CM-5 and the SP2 have not been sufficiently good to warrant migration of our production runs away from LSU's MP-1 system. As discussed below, however, the Cray T3E architecture appears to be well-suited to our existing parallel algorithm so we have moved our primary simulation work to the SDSC.

c. At the SDSC

Early in 1997, the PI applied for and was granted 19,200 SUs on the Cray T3E at the SDSC for a project entitled, "Dynamical Modeling of Star Formation in Galaxies." (The allocation was granted on April 1, 1997 and will expire on March 31, 1998. Our current request for NPACI resources is intended to be a direct extension of this successful, 1st-year SDSC project.) During the first couple of months of this project, we focused our efforts on porting our gravitational CFD code to the T3E and documenting its performance. Using the available Cray HPF compiler (PGHPF, developed by the Portland Group), we found that relatively few unexpected code modifications were required in order to successfully compile and execute our code on the SDSC system. (This was because we already had had experience successfully modifying the code to execute on the CM-5 using cmf and on the Cray C90 using the Cray Fortran90 compiler.) In June, 1997, we submitted a technical report entitled, "Early Experiences with the Cray T3E at the SDSC," to Jay Boisseau at the SDSC in which we fully documented how well our code was performing on the T3E. (This technical document is included as Attachment C2 to this proposal.) In summary, we found that:

On the down side, we also discovered that the PGHPF compiler was producing executable code that performed computational instructions on each T3E processor with extraordinarily low efficiency. That is, as a parallel application, the code was scaling well, but its overall efficiency measured relative to the published peak-performance capabilities of the machine was poor. By way of illustration, we show in Table 1a the measured MFlop rating of our CFD code on a single processor of the T3E_600 at the SDSC. For two different lattice sizes (one with and one without power-of-two arrays), the PGHPF-compiled code runs at about 13 MFlops, which is only 2% of the published peak rating of 600 MFlops. (The numbers in Table 1a were obtained using "pat," a performance monitoring tool that only became available to us at the SDSC in September. The report from which these numbers were drawn accompanies this proposal as Attachment C1.) Clearly there was plenty of room for improvement, but it was equally clear that our performance measures were not exceptional. Throughout its first year of general availability, users have been reporting "typical" T3E single-processor performance ratings of anywhere from 4 to 100 MFlops (John Levesque [APRI], private communication; see also "The Benchmarker's Guide to Single-processor Optimization for CRAY T3E Systems" published by the Benchmarking Group at Cray Research). Our recent progress on improving the code's performance is described below.

Table 1a: Single-Processor Test Results

SDSC: T3E_600
Test Compiler Streams Execution Speed (MFlops)
Grid size
64 × 32 × 32
Grid size
67 × 34 × 32
B F90 OFF 7.20 21.58
C PGHPF OFF 12.79 13.06

Working with several staff members in the SDSC Vislab, during the summer we successfully established a heterogeneous computing environment at the SDSC through which the results of our gravitational CFD simulations can be routinely visualized. Extending the ideas that we first developed at LSU, we have established a procedure by which, at predetermined time intervals during a simulation, the CFD code running on the T3E writes out a 3D density file and sends a signal to the Vislab's SGI Power Onyx/Infinite Reality Engine indicating that a new data file is available for processing. Having received the signal from the T3E, the graphics engine executes a previously written shell script which (i) transforms the data from its cylindrical coordinate lattice to a cartesian lattice; (ii) searches through the data to locate four separate isodensity surfaces and generates a wire-frame structure (i.e., vertices and polygons) that corresponds to each of these 3D surfaces; (iii) executes an Alias/Wavefront script which renders the set of wire-frame structures as a multiply nested, reflective and colored transparent set of isosurfaces; (iv) stores on disk this 2D, rendered image of our evolving model at one particular time step in the simulation; then (v) deletes the original 3D data set and wire frame models to conserve disk space. This entire task is continually re-initiated as data is generated during an extended CFD simulation on the T3E. The net result of this process is a stored sequence of 2D images illustrating the evolution of our physical system. The series of 2D images is transferred via the internet to LSU where it can be displayed as a movie, allowing us to rapidly interpret the dynamics of the systems that we are examining. The animation sequences labeled as Movie-1 and Movie-3 in 1 of this proposal (see also Fig. 3) were generated in this manner. We consider the successful establishment of this environment which smoothly couples the high-end, SGI visualization tools at the SDSC with our simulation efforts on the T3E to have been a significant accomplishment this past year.

Throughout the fall, we used the SDSC facilities to study carefully two unstable physical systems: (1) A "massless" accretion disk surrounding a point-like protostellar object (see Movie-1, above). As noted above, this particular system has been examined previously through linear stability analyses and was simulated at this time in order to provide us with a quantitative check of the CFD code's performance on the T3E. (2) A rapidly rotating protostellar object that was subject to the two-armed, spiral-mode instability described above (see Movie-3 and Fig. 3). By tracking the nonlinear development of this instability at a spatial resolution that heretofore has been computationally unachievable, we have confirmed the general results of earlier simulation efforts and have produced a spatially well-resolved, rapidly rotating ellipsoidal-shaped structure whose relative stability can now be examined in the context of the classical fission hypothesis. The results of this work are currently being prepared for presentation at the "Numerical Astrophysics 1998" conference to be held during March in Tokyo, Japan and will be published in the proceedings of that meeting.

As is explained in 6, below, the PI recently has gained access to some of the computing facilities at the NAVOCEANO MSRC in Stennis, Mississippi. Via access to the T3E_900 and to several onsite workshops at the NAVO MSRC, the PI and his students have gained a much better appreciation of the way a single processor on the T3E handles the loading and storing of data arrays and the streaming of data into floating-point registers. By making adjustments in the stride of our principal data arrays and by taking advantage of some compiler/environment options, but without making any changes in the implemented HPF algorithm, we are able to generate executable code with a substantially higher execution efficiency than originally reported. As the data in Table 1b indicates, for example, by switching the STREAMS environment variable "on" and avoiding power-of-two array dimensions, through the Cray F90 compiler our code's single-processor execution speed can be increased from 7.74 MFlops to 38.2 MFlops, that is, by a factor of 5!

At present it is less obvious how improvements of this magnitude can be gained when using the PGHPF compiler. Hence, it is unclear how such speedups can be realized on parallel HPF applications. Based on the fact that, with relatively little effort, our current CFD code can achieve 38 MFlops on a single node of the T3E_900, we are optimistic that we will be able to increase the execution efficiency of our parallel applications on the T3E_600 at the SDSC by a factor of 2.5 over the next few months. With this in mind, we feel confident that a new allocation of 20,000 service units on the T3E will permit us to complete a number of substantial and significant simulations over the coming year.

Table 1b: Single-Processor Test Results

NAVOCEANO MSRC: T3E_900
Test Compiler Streams Execution Speed (MFlops)
Grid size
64 × 32 × 32
Grid size
67 × 34 × 32
E1 F90 OFF 7.74 24.88
E2 ON 8.85 38.19
F1 PGHPF OFF 15.07 15.41
F2 ON 23.08 23.31

5. Justification of the Number of Service Units Requested

Required Platforms

We are requesting an allocation of 20,000 SUs on the Cray T3E at the SDSC as well as continued access to the SGI platforms in the SDSC Vislab. These platforms are suitable because, as has been detailed above in 4 and in Attachment C2, over the past year we have (i) demonstrated that our primary HPF simulation tool scales well on the T3E architecture; (ii) shown that we understand, in principal, how significant improvements may be made in the execution speed of our application code on the T3E; and (iii) developed a heterogeneous computing environment that smoothly integrates the unique capabilities of the T3E with those of the SGI platforms in the Vislab.

Code Requirements and Optimization Issues

Over the past year, we have been performing gravitational CFD simulations with lattice resolutions of 1283 and selecting the physical problems to be studied accordingly. At this resolution, our problem will not fit on fewer than 32-nodes of the SDSC T3E -- that is, the problem requires at least 4 GBytes of RAM. However, when the visualization tasks (being carried out simultaneously on the SGI Onyx) are considered as an integral component of each CFD simulation, a more ideal load-balancing is achieved if we run each simulation on 64-nodes of the T3E. In this configuration, typically 14 minutes of wall-clock time are required to evolve the system forward in time by 1/60 of a rotation period and slightly under 10 minutes of wall-clock time are required to complete the volume-rendering of one 3D data set. Hence, the SGI Onyx and the T3E are kept approximately equally busy generating 60 movie frames per rotation period and a comfortably smooth animation sequence is produced during each run.

When we double our current lattice size in order to handle the simulation tasks outlined in 1d of this proposal, we will not be able to run on fewer than 64-nodes of the T3E. Depending on how much the visualization tasks slow down as a result of this larger problem size, we may find it advisable to occasionally utilize 128-nodes; but it is unlikely that our planned simulations will require the full machine.

Proposed Production Runs

In 1 of this proposal we provided an extensive discussion of the particular scientific problem that we are investigating in connection with our ongoing, NSF-sponsored research. In 1d, in particular, we described in detail the simulation tasks that we would like to complete over the coming year in support of this research utilizing NPACI resources. Following up on that discussion -- as well as the discussion in 4 of our progress, to date -- our specific request for T3E resources at the SDSC is detailed in Table 2. The letter/number label that has been adopted for each proposed simulation in this table follows the outline used in our earlier discussion of the respective problems in 1d; the required number of system rotation periods (cirps) listed in the table also has been drawn directly from this earlier discussion. to

Table 2: Estimated, Default SU Requirements

Proposed Simulations "cirps"
required
Default SUs
required
A. Test the stability of the ellipsoidal configuration that was formed at the end of the "Movie-3" simulation.
1. Shift to rotating frame of reference. 6 10,800
3. Cool the object slowly. 15 27,000
B. Examine the stability against merger of unequal-mass binaries.
1. Follow the evolution of two systems, each with a mass ratio q 0.95, but with slightly different initial separations. One separation should be chosen so that both objects are inside their Roche lobe; this system is expected to be stable. The other separation should be chosen such that one object initially just fills its Roche lobe; this system is predicted to be unstable. 8 14,400

In estimating the SU requirements for each extended run (refer to the last column of Table 2) we have adopted a default single-processor execution rating of 13 MFlops (see Table 1a). At this rate, and doubling the lattice size (that is, moving to a lattice of 128 128 256) in order to remove the p-symmetry constraint discussed in 1d, each system rotation period will require approximately 1,800 SUs. By default, then, the SUs required to complete all of these simulations totals 52,200.

Now, we argued in 4, above, that over the next few months we should be able to improve the execution efficiency of our gravitational CFD code by a factor of 2.5. If we achieve this goal, the SUs required to complete all of the simulation tasks outlined in Table 2 will be approximately 20,880. Hence, our formal request is for 20,000 SUs on the T3E_600 at the SDSC and the burden is on us to achieve the factor of 2.5 improvement in code efficiency.

Performance Monitor Results

As discussed earlier, utilizing "pat" -- a performance analysis tool that accesses the hardware performance monitors on the T3E -- we have been able to document the single-processor performance of our gravitational CFD code on the T3E. Numbers drawn directly from "pat" are shown in Tables 1a and 1b; they are also listed in Attachment C1 to this proposal.

Special Requirements

As explained in detail above, we rely heavily on the SDSC Vislab facilities to routinely analyze our simulation results. In order for this project to be a success, therefore, it is essential that we continue to have access to those facilities. It is extremely important that the Alias/Wavefront software continue to be made available on the SGI Onyx as well.

We should point out that the manner in which disk storage is presently controlled and maintained at the SDSC is a problem for us. A single 3D data array in our gravitational CFD simulations is 16 MBytes and a "continuation file" which contains the minimum amount of information that is necessary to start or continue an evolution is approximately 100 MBytes. Therefore, in connection with the T3E, it is very difficult to work with a maximum personal storage allocation of only 10 MBytes. In the Vislab, data storage -- even on scratch disks -- appears to always be at a premium. We have worked very hard to develop a visualization tool (described in detail above) that preserves a bare minimum of data, but in order to make it work properly the tool must have access to a reasonable amount of scratch disk space on the SGI Onyx during each T3E simulation run. Frequently during an overnight run, the automatic visualization tool will fail to function simply because the scratch disk has been completely filled up by other users. We strongly encourage the SDSC staff to correct these problems with disk storage and maintenance.

6. Other Supercomputer Support

Having recently been designated a Major Shared Resource Center for the Department of Defense, during 1997 the Naval Oceanographic Offices (NAVOCEANO) at the Stennis Space Center in Mississippi began to purchase and install a variety of high- performance- computing platforms and visualization architectures. In particular, they have acquired a Cray T3E and an SGI Onyx with supporting compilers and software tools to create a supercomputing environment that resembles the one to which we have grown accustomed at the SDSC. In part because of LSU's close proximity to Stennis (the Stennis Space Center is only 2 hours away from LSU by car), in August, 1997, the PI and his students were invited to utilize the MSRC resources at a modest level of activity through the center's Programming Environment and Training (PET) program. Close interactions with the MSRC technical staff already are proving useful in connection with our efforts to gain better single-processor execution efficiencies on the T3E and to develop visualization tools that will permit us to study in detail the velocity flow-fields that develop during our dynamical simulations.

Because the PI does not have a DoD-funded research project, he has not been granted a priority allocation of time on the high-performance-computing hardware at the NAVOCEANO's MSRC. In connection with the PI's NSF-funded research activities, therefore, it is critically important that a guaranteed allocation of supercomputing time be secured through NPACI resources.

7. Qualifications of the Principal Investigator

A major component of the PI's research work over the past twenty years has involved the large-scale numerical simulation of multidimensional gas flows in astrophysical systems utilizing a variety of different supercomputing and high- performance- computing platforms. In the early 1980s, the PI held a postdoctoral research position in group T-6 at Los Alamos National Laboratories at which time he broadened his exposure to state-of-the-art CFD techniques and learned to develop efficient algorithms for vector supercomputers. From the mid-1980s through the early 1990s, Tohline was PI on a large account at the Cornell Theory Center during which time he and his students experimented with the parallel implementation of CFD algorithms on the IBM 3090 and various accompanying FPS Array Processors. When the 8K-node MasPar MP-1 system arrived at LSU (see 3, above), the PI abandoned the CTC facilities because his group was having considerable success utilizing the MasPar system for its largest simulation efforts.

As has been detailed elsewhere in this proposal, in recent years the PI and his students have gained experience developing gravitational hydrodynamic algorithms for execution on a variety of parallel platforms including MasPar's MP-2, the CM-5, the SP-2, and the Cray T3E. The group also has developed and implemented a heterogeneous computing environment (at LSU, at the SDSC, and at the NAVOCEANO's MSRC) that tightly couples their parallel simulation efforts to sophisticated visualization tools.

An accompanying attachment entitled, "Curriculum Vitae and Selected Publications," documents the PI's formal educational background and professional experience, and provides a list of ten significant publications that are particularly relevant to the project being proposed here.

8. Others Working on the Project

John E. Cazes (cazes@rouge.phys.lsu.edu): John Cazes is currently a 6th-year graduate student enrolled in the Ph.D. physics program at LSU. To date, John has been principally responsible for the migration and optimization of our CFD code, having successfully ported it to the CM-5, the SP-2, and the Cray T3E. He also has played the lead role in developing the heterogeneous computing environment through which our CFD simulations are routinely and automatically accompanied by 3D animation sequences. The astrophysics problem on which this project proposal is based also is the focus of John's Ph.D. dissertation research.

Howard S. Cohl (hcohl@rouge.phys.lsu.edu): Howard Cohl is currently a 6th-year graduate student enrolled in the Ph.D. physics program at LSU. He has been principally responsible for developing, implementing, and optimizing the Poisson solver that is tightly integrated into our CFD simulations. Howie has worked closely with John Cazes on the development of the heterogeneous computing environment through which animation sequences of our CFD simulations are routinely created. He is also playing the lead role in developing techniques by which the our 3D velocity flow-fields can be readily visualized.

Patrick M. Motl (motl@rouge.phys.lsu.edu): Patrick Motl is currently a 5th-year graduate student enrolled in the Ph.D. physics program at LSU. He recently has perfected the 3D self-consistent-field technique which will permit us to construct unequal-mass, equilibrium binary systems and then to test their dynamical stability with the dynamical simulation tools described herein.


Attachments

Bibliography
Curriculum Vitae and Selected Publications for the PI
Timing and Performance Data