The Maximum Entropy Method for Analytic Continuation of Quantum Monte Carlo Data.

Sponsored by NSF

Quantum Monte Carlo methods are typically constructed with a path integral in Matsubara (imaginary) time. This presents a problem for comparison to transport and spectroscopic experiments, which produce data as a function of real frequencies. For example, the Fermi real frequency spectral function A and the Matsubara imaginary time or frequency Green's function G are related by

This equation does not have an inverse. That is, given A, you can calculate G, but given G, you cannot uniquely calculate A. The problem is that there are many A which reproduce G within its error bars (an equivalent statement is that if we write the equation above in matrix form, G=KA, then det(K)=0). The difficulty is mainly at high frequencies; due the exponential falloff of the kernel K, A can have many values at high frequencies and produce the same G. Therefore, the question Given G what is A? has no solution.

Fortunately, the question Given G what is the most probable A? does have a sensible answer provided by methods of conditional probabilities and statistical inference (for a nice review, see this article by Devinder Sivia). Here, we must:

Maximize the probability of A given G, (Bayes theorem).
where the prior probability of A is given by ,
with S the Shannon entropy
and the likelihood function

You can learn more about the Maximum Entropy Method (MEM) for the analytic continuation of QMC data in our review article Bayesian Inference and the Analytic Continuation of Imaginary-Time Quantum Monte Carlo Data, M. Jarrell, and J.E. Gubernatis, Physics Reports Vol. 269 #3, pp133-195, (1996). See also this (unpublished) paper. You may download MEM codes from GitHub (https://github.com/msjarrell/MEM) or you may download a snapshot (present at the time of this writing) from http://www.phys.lsu.edu/~jarrell/CODES/MEM.

Below, I will discuss a few of the typical problems people encounter when applying MEM (for more details see the review article cited above) and briefly mention the codes.

The likelihood function. The first problem is that you must make sure that the likelihood function is justified. This is the most common problem that people have with MEM. The probability of G given A only has the chi-squared form mentioned above if the samples of G are statistically independent. This means both that

each sample must be independent of the others, and that
each G(tau) must be independent of G(tau').

In order to ensure the former, you will need to form the data into bins which are much larger than the autocorrelation time of the green function data. This rule is easy to satisfy. I separate each measurement of G, by at least one autocorrelation time of G, and then (mainly to save disk space) I concatenate the data into larger bins. To ensure the latter rule, you must calculate the full covariance matrix of G and rotate both K and G into the space where the covariance is diagonal. As Jim and I explain in our review, this latter is extremely important and tough to fulfill. The best way to ensure this is to calculate the eigenvalues of the covariance and make sure that they are all within the numerical precision of your computer (Fig. 1). If your eigenvalue spectrum looks qualitatively different from this one, particularly if it shows either a break for large index, or if it is incomplete (i.e. For a covariance matrix of size nl x nl, you have less than nl eigenvalues), then you will have to generate more bins of data, and recalculate.

The default model, m(w) also is important. We have found that it is important to start with a relatively featureless default model at high temperatures as pictured below, and anneal the spectra in temperature T always using the result from a higher T to as a default model at the next lower T. The benefit of starting at high T, is that you can generally calculate the exact spectra for use as the initial default model. It is important to choose a featureless model, since a model with sharp but incorrect features may be difficult if not impossible to overcome with a finite amount of data. Examples of the featureless models we use at high T are shown on the left below, and on the right we show one temperature step in the annealing process for the Density of states of the 2D Hubbard model away from half filling (the red curve is the spectrum N(w) from the next higher temperature which is used to calculate the spectra shown in black, the green lines are the integrated intensity and error bars—see the review).

Illustration 2: (LEFT) Featureless default models (for different DCA K values) used only to analytically continue the highest temperature data set. (RIGHT) One temperature step in the annealing procedure. Here the model (red) is the spectra from the next higher temperature.

After we perform an analytic continuation of a temperature series, we then go back and re-run the calculation with twice the precision (i.e. run four times longer) to see if the spectra changes. You may also find it advantageous to add datasets for temperatures where the spectra is changing quickly or developing important features (such as the pseudogap feature in the spectrum above on the right). Once the data is sufficiently good that the spectra no longer changes with such improvements, publish!

The MEM codes may be downloaded from from GitHub or an old snapshot from http://www.phys.lsu.edu/~jarrell/CODES/MEM . These codes are released under a modified GPL by M. Jarrell and J.E. Gubernatis (the modification being that you must cite our review article in Physics Reports and provide bug reports, etc. on GitHub see the readme file). There are two directories under the link above. In the subdirectory Bryan, you will find a version of the bryan MEM algorithm [R.K. Bryan, Eur. Biophys, J, 1990, 18, 165-174.]. In the subdirectory Readmc, you will find codes which read the QMC data, calculate the covariance and kernel K and rotate the Kernel and data into the space where the covariance is diagonal.

Modified 12/15/16