\section{The Structured Adaptive Mesh Refinement (SAMR) Method}
\label{sec.amr}
The backbone of the SAMR idea is the patch, or grid, which we take to
be rectilinear to simplify bookkeeping (in practice, the lower-level
routines such as the hydro solvers all assume Cartesian coordinates).
The single root grid covers the entire computational domain and plays
the role of the root node in the grid hierarchy. Subgrids are placed
such that they cover (hopefully small) sub-volumes of the root grid at
higher resolution. This structure forms a tree that may be extended
to arbitrary depths by adding additional grids to refine regions in
the subgrids. We use the following notation to describe grid
relationships. A grid's \textit{parent} is the coarse grid that
completely contains it. The \textit{root grid} is the most
coarsely-resolved grid, and it has no parent. A \textit{child grid}
is a grid completely contained by its parent grid (note that a grid
may have many children but only one parent), while \textit{sibling} grids are
those that have the same resolution, but not necessarily the same
parent. We also refer to levels of the hierarchy, where the root grid
is labeled level zero and all grids with the same resolution are on
the same level.
To simplify bookkeeping, a number of restrictions are imposed on the
subgrids (a simple consistency check is sketched after the list):
\begin{itemize}
\item The refinement factor $r$, or the ratio of the coarse cell
width to the fine cell width, must be an integer. In practice, we
generally adopt $r=2$, since tests have shown this typically
provides the best performance.
\item Each subgrid must begin and end on the boundary of a coarse
cell. This means that the number of cells in a patch will be a
multiple of the refinement factor.
\item A subgrid must be completely enclosed by its parent.
\end{itemize}
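
These restrictions are easy to check mechanically. The following
sketch (in Python, with hypothetical names and integer cell-index
conventions of our own choosing, not \enzo's actual data structures)
verifies a proposed subgrid against its parent:
\begin{verbatim}
# Illustrative check of the subgrid restrictions (not Enzo code).
# Extents are half-open integer cell ranges [start, end) per axis;
# the subgrid uses fine indices, the parent uses coarse indices.
def is_valid_subgrid(sub_start, sub_end, parent_start, parent_end, r=2):
    for ss, se, ps, pe in zip(sub_start, sub_end,
                              parent_start, parent_end):
        # Edges must coincide with coarse-cell boundaries, which also
        # makes the cell count a multiple of the refinement factor r.
        if ss % r != 0 or se % r != 0:
            return False
        # The subgrid must be completely enclosed by its parent.
        if ss < ps * r or se > pe * r:
            return False
    return True

# A 4x4 fine patch covering coarse cells [1,3) x [2,4) of an
# 8x8 parent, at refinement factor r = 2:
print(is_valid_subgrid((2, 4), (6, 8), (0, 0), (8, 8)))  # True
\end{verbatim}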
The AMR grid patches are the primary data structure in \enzo. Each
patch is treated as an individual object that can contain both field
variables and particle data. Individual grids are organized into a
dynamic, distributed hierarchy of mesh patches.
\begin{figure}
\begin{center}
\includegraphics[width=0.5\textwidth]{figures/amr_hierarchy.eps}
\end{center}
\caption{\emph{Left:} Example of a simple, distributed AMR hierarchy
showing real and ghost grids. \emph{Right:} Example 2D \enzo\ grid
showing real and ghost zones, as needed for the PPM hydro stencil. }
\label{fig.amr_hierarchy}
\end{figure}
Each grid patch in \enzo\ contains arrays of values for baryon and
particle quantities. Grids are partitioned into a core of
\emph{active zones} and a surrounding layer of \emph{ghost zones}, as
shown in Figure~\ref{fig.amr_hierarchy}. The active zones store field
values, while ghost zones are used to temporarily store values that have
been obtained directly from neighboring grids or interpolated from a
parent grid. These zones are necessary to accommodate the
computational stencil of the (magneto)hydrodynamics solvers
(Sections~\ref{sec.hydro.ppm} through~\ref{sec.hydro.zeus}) and the
gravity solver (Section~\ref{sec.gravity}). The PPM and \zeus\ hydro
solvers require 3 layers of ghost zones, and the gravity solver
requires 6 (although only for the density and potential fields). This
can lead to significant memory and computational overhead,
particularly for smaller grid patches at high levels of refinement.
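
The partition into active and ghost zones maps naturally onto array
slicing. The toy sketch below (hypothetical Python/NumPy, not \enzo's
actual grid class) shows how a field array is sized and how its
active region is addressed:
\begin{verbatim}
import numpy as np

NGHOST = 3   # e.g., the PPM and ZEUS solvers need 3 ghost layers

class GridPatch:
    """Toy grid patch holding one cell-centered field."""
    def __init__(self, active_dims):
        # Allocate active zones plus ghost layers on every face.
        full = tuple(n + 2 * NGHOST for n in active_dims)
        self.density = np.zeros(full)

    @property
    def active(self):
        # View of the active zones only (ghost zones excluded).
        return self.density[(slice(NGHOST, -NGHOST),) * self.density.ndim]

g = GridPatch((16, 16, 16))
g.active[...] = 1.0        # operate on the active region
print(g.density.shape)     # (22, 22, 22): 16 active + 2*3 ghost
\end{verbatim}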
From the point of view of the hydro solver, each grid is solved as an
independent computational fluid dynamics (CFD) problem, with Dirichlet
boundary conditions stored in the ghost zones. At the beginning of
each timestep, grids on a given level fill their ghost zones by
interpolating from the parent and copying from neighbors on the same
level. Grids then correct the flux at the interface boundary between
fine and coarse zones, and finally project their active zone data to
the parent grid.
\begin{figure}
\begin{center}
\includegraphics[width=0.6\textwidth]{figures/timestepping.eps}
\end{center}
\caption{\emph{Left:} Example of the timesteps on a 2-level AMR
hierarchy. \enzo\ does not restrict the timesteps on the finer levels
to be a factor of $1/2^n$ smaller. \emph{Right:} The order in which
the AMR grids are evaluated on each level.\vspace{1ex}}
\label{fig:wcycle}
\end{figure}
Timesteps are determined on each level as described in
Section~\ref{sec.timestepping}, and the hierarchy is advanced on a
level-by-level basis in a W-Cycle. Beginning with the coarsest level,
$l$, all grids on that level are advanced one timestep. Then, one
timestep is taken on all grids at the next level of refinement, $l+1$,
and so on until the finest level is reached. The finest level is then
advanced, using as many timesteps as necessary to reach the time of
the level immediately above it, and is then synchronized with that
level, which in turn proceeds forward in time one more step. This is
shown graphically in Figure~\ref{fig:wcycle}.
At the end of every timestep on every level, each grid updates its
ghost zones by exchanging information with its neighboring grid
patches (if any exist) and/or by interpolating from a parent grid. In
addition, cells are examined to see where refinement is required and
the entire grid hierarchy is rebuilt at that level (including all more
highly refined levels). The timestepping and hierarchy
advancement/rebuilding process described here is repeated recursively
on every level to the specified maximum level of refinement in the
simulation.
The basic control algorithm looks very much like that of any
single-resolution grid-based code, and can be written schematically
as:
\begin{verbatim}
InitializeHierarchy
While (Time < StopTime):
dt = ComputeTimeStep(0)
EvolveLevel(0, dt)
Time = Time + dt
CheckForOutput(Time)
\end{verbatim}
The AMR control algorithm is contained within the recursive
EvolveLevel algorithm, which takes the level upon which to operate as
an argument. This looks something like the following pseudo-code:
\begin{verbatim}
EvolveLevel(level, dt):
SetBoundaryValues
while (Time < ParentTime):
dt = ComputeTimeStep(level)
SolveHydroEquations(dt)
SolveOtherEquations(dt)
Time = Time + dt
SetBoundaryValues
EvolveLevel(level+1, dt)
FluxCorrection
Projection
RebuildHierarchy(level+1)
\end{verbatim}
Each function operates on all the grids on the given level. In the
following sections, we will address each of the major elements in
turn.
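
Before turning to the individual steps, the recursion itself can be
made concrete. The runnable sketch below (Python, with the solver
calls reduced to comments and an arbitrary timestep rule chosen
purely for illustration) reproduces the control flow of EvolveLevel:
\begin{verbatim}
MAX_LEVEL = 2
time_at = [0.0] * (MAX_LEVEL + 1)   # current time on each level

def compute_time_step(level):
    # Arbitrary rule: finer levels take smaller steps, but they are
    # not required to be exact factors of 1/2 smaller.
    return 0.9 / 2 ** level

def evolve_level(level, stop_time):
    while time_at[level] < stop_time:
        dt = min(compute_time_step(level), stop_time - time_at[level])
        # SetBoundaryValues; SolveHydroEquations(dt); ...
        time_at[level] += dt
        print("level %d advanced to t = %.3f" % (level, time_at[level]))
        if level < MAX_LEVEL:
            evolve_level(level + 1, time_at[level])
            # FluxCorrection; Projection; RebuildHierarchy(level + 1)

evolve_level(0, 1.0)   # advance the root grid to StopTime = 1.0
\end{verbatim}
Running this prints the coarse-to-fine ordering of
Figure~\ref{fig:wcycle}: each coarse step is followed by however many
finer steps are needed to synchronize the levels below it.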
\subsection{Setting the boundary values}
\label{sec:interpolation}
In order to solve the hydrodynamic equations on a patch, it is
necessary to specify the boundary conditions on that patch. There are
two cases. The first is that the boundary is external to the
computational domain. This occurs for all the boundaries of the root
grid. The second possibility is that the boundary is within the
computational domain, which is true for most, but not all, subgrids.
In the first case, we allow for four types of boundary conditions,
each illustrated in the sketch following the list (for simplicity,
they are represented here as being at $x=0$, with positive $x$
interior and negative $x$ exterior to the domain):
\begin{enumerate}
\item{\em Reflecting.} In this case, the boundary behaves like a
mirror, with the solution on the boundary reflecting the solution
in the computational domain ($q(-x) = q(+x)$) but with the
velocity normal to the boundary reversed: $v_x(-x) = -v_x(+x)$.
\item{\em Outflow.} We approximate outflow boundary conditions by
duplicating the solution at the edge of the computational domain:
$q(-x) = q(0)$.
\item{\em Inflow.} In this case, the boundary values are fixed by a
pre-determined function ($q(-x) = q_0(-x,t)$). The code provides a
simple way to set inflow values that are constant in both time and
space over an entire face, or hooks are provided for special cases
where the boundary must be time-varying.
\item{\em Periodic.} The boundary solution is obtained from the
other side of the grid: $q(-x) = q(x_{\rm{max}}-x)$.
\end{enumerate}
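
For a cell-centered field stored with ghost zones, these four
external conditions amount to different ghost-zone fills. A minimal
one-dimensional sketch (illustrative only; \enzo\ applies the
equivalent logic face-by-face in three dimensions):
\begin{verbatim}
def fill_left_ghosts(q, v, ng, kind, inflow_value=0.0):
    """Fill the ng left ghost zones of field q and normal velocity v.
    Layout: q[0:ng] ghost, q[ng:-ng] active, q[-ng:] ghost."""
    for i in range(ng):
        if kind == "reflecting":
            src = 2 * ng - 1 - i       # mirror image inside the domain
            q[i] = q[src]
            v[i] = -v[src]             # normal velocity is reversed
        elif kind == "outflow":
            q[i] = q[ng]               # duplicate the edge solution
            v[i] = v[ng]
        elif kind == "inflow":
            q[i] = inflow_value        # fixed, pre-determined value
        elif kind == "periodic":
            src = len(q) - 2 * ng + i  # wrap from the far active edge
            q[i] = q[src]
            v[i] = v[src]
\end{verbatim}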
In the second case, the value must be either (1) interpolated from a
parent or (2) copied from a sibling. In practice, we solve this
problem by first interpolating all boundaries from the grid's parent
and then performing the sibling copy wherever possible, since these
values will always be as good as, or superior to, interpolation.
We focus first on conserved, cell-centered quantities. The
interpolation function should be conservative in the sense that the
mean over the interpolated quantity, in a given cell, must return the
cell-centered value. It should also be monotonic, so that new extrema
are not introduced into a smoothly varying field.
In one dimension, such functions are well known; examples include
donor cell, Van Leer, and piecewise parabolic interpolation, which are
first-, second-, and third-order accurate respectively
\citep[e.g.,][]{Stone92a}. However, we require fully
multi-dimensional interpolation functions, and so focus on
second-order schemes since they are tractable and relatively
accurate\footnote{We could, in principle, build a multi-dimensional
interpolator out of a one-dimensional interpolation function; however,
at higher order, this would not necessarily satisfy our constraint
regarding not introducing new extrema in interpolated quantities.}.
In Appendix~\ref{app:interpolation}, we explicitly write down
functions in one, two and three dimensions, but here we discuss the
general design considerations.
In two dimensions, it is relatively easy to build a linear, monotonic,
conservative function by requiring, first, that the function at the
center of the cell return the cell-centered quantity (and since it is
linear, this implies that an average over the cell will also be equal
to this value), and second, by treating the cell-diagonals as
one-dimensional Van Leer-type functions. The reason to focus on the
cell diagonals is that for a linear function, the extrema will occur
at the cell corners, which are in turn linked by the diagonals. This
process can be done in two dimensions since there are three
constraints (one at the center and one along each diagonal), and three
parameters in a two-dimensional linear function.
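
The construction can be written out compactly. The sketch below is
one plausible realization for a single coarse cell at refinement
factor $r=2$, using the standard van Leer (harmonic-mean) limiter
along each diagonal; the formulas actually adopted are given in
Appendix~\ref{app:interpolation}:
\begin{verbatim}
def van_leer(dm, dp):
    # Monotonized slope from backward/forward differences.
    return 2.0 * dm * dp / (dm + dp) if dm * dp > 0.0 else 0.0

def prolong_2d(q, i, j):
    """Linear, conservative, monotonic interpolation of coarse
    cell (i, j) to its 2x2 children."""
    qc = q[i][j]
    # Limited slopes along the two cell diagonals (per coarse width).
    s1 = van_leer(qc - q[i-1][j-1], q[i+1][j+1] - qc)  # (+1,+1)
    s2 = van_leer(qc - q[i-1][j+1], q[i+1][j-1] - qc)  # (+1,-1)
    a, b = 0.5 * (s1 + s2), 0.5 * (s1 - s2)            # x, y slopes
    # Fine centers sit at +/- 1/4 coarse width from the cell center,
    # so the mean of the four children equals qc (conservation).
    return [[qc + a * dx + b * dy for dy in (-0.25, 0.25)]
            for dx in (-0.25, 0.25)]
\end{verbatim}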
In three dimensions this process breaks down since the number of
constraints (center + four diagonals = five) exceeds the number of
parameters (four). One way to solve this problem would be to perform
a least-squares fit under the given constraints; however, this is too
computationally expensive. Therefore, in
Appendix~\ref{app:interpolation} we describe a heuristic approach that
satisfies four constraints analytically and then adjusts the
interpolation parameters in such a way as to satisfy the fifth. While
the result is not necessarily optimal, it does appear to perform
adequately in our tests.
\subsection{Solving the hydrodynamic equations on a grid}
\label{sec:solve_hydro}
Given the grid and its boundary conditions, the process of solving
equations~(\ref{eq:mass})--(\ref{eq:total_energy}) can be detached
from the rest of the SAMR technique, so that it is possible to
experiment with a number of different methods. The primary
restriction is that the method should be flux-conservative, or at
least produce an accurate estimate of the flux of conserved
quantities at cell interfaces; this is required by the flux-correction
step (see Section~\ref{sec:flux_correction}). There is also the issue
of centering,
which refers to the position of physical quantities within a cell (the
primary types are cell-centered, face-centered, and node-centered).
The interpolation step, described in Section~\ref{sec:interpolation},
must correctly interpolate with the selected centering.
\subsection{Projection}
\label{sec:projection}
Since a refined region is simulated with (at least) two different
resolutions (one fine and one coarse), it has two or more
``solutions'' (i.e.~values of density, velocity, etc.) that represent
the same volume of space. This is one important aspect of the SAMR
method -- some physical regions are represented by grids at multiple
resolutions. To maintain consistency, the coarse grid quantities are
updated with the finer values, using a volume-weighted average of the
conserved quantities:
\begin{equation}
q^{\rm coarse} = r^{-d} \sum_{i,j,k} q^{\rm fine}_{i,j,k},
\end{equation}
where $r$ is the refinement factor, and the sum is over all fine cells
that ``cover" the coarse cell. The exponent, $d$, refers to the weight
of the average. For cell-centered quantities, this is the
dimensionality of the problem. For face-centered quantities, this is
one less than the dimensionality, and for edge-centered quantities
this is one.
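
For cell-centered data in three dimensions with $r=2$, this is simply
the mean over the $2^3$ child cells, as in the hypothetical NumPy
sketch below:
\begin{verbatim}
import numpy as np

def project_to_coarse(fine, r=2):
    """Volume-weighted restriction of a cell-centered 3D field:
    each coarse value is r**-3 times the sum of the r*r*r fine
    cells that cover it."""
    nx, ny, nz = (n // r for n in fine.shape)
    return fine.reshape(nx, r, ny, r, nz, r).mean(axis=(1, 3, 5))

fine = np.random.rand(8, 8, 8)
coarse = project_to_coarse(fine)               # shape (4, 4, 4)
assert np.isclose(fine.mean(), coarse.mean())  # the mean is conserved
\end{verbatim}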
\subsection{Flux correction}
\label{sec:flux_correction}
One of the significant advantages of flux-conservative methods is, of
course, their ability to maintain conserved quantities to machine
precision. Unfortunately, the projection step upsets this flux
conservation since, at the boundary of a refined region, the coarse
and fine cells on either side of the border are updated with different
estimates of the flux across the boundary: on the coarse side, this
estimate comes from the coarse grid computation, and on the fine
side, from the fine grid solution. Conservation can be restored by
correcting all
coarse cells that lie outside the boundary of a fine region with the
difference between the coarse and fine estimates of the flux across
the boundary:
\begin{equation}
q^{\rm coarse} = \tilde{q}^{\rm coarse} - \Delta t \left( F^{\rm
coarse} - \sum_{j,k} F^{\rm fine}_{j,k} \right)
\end{equation}
where $F$ is the flux of the quantity $q$ through the cell interface,
taken here along the $i$-direction. We use $\tilde{q}$ to indicate the
uncorrected
quantity on the coarse grid and the sum is over the $r^{d-1}$ fine
cell interfaces in the perpendicular dimensions that share a face with
the coarse cell being corrected. This correction is carried out for
all coarse cells that share a face with a refined cell (and are not
themselves covered with more refined cells).
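
In one dimension the correction reduces to swapping one flux estimate
for another. In the sketch below (illustrative only), the transverse
sum over the $r^{d-1}$ fine faces and the average over fine timesteps
are assumed to have been accumulated already, and the $1/\Delta x$ of
the finite-volume update is written explicitly:
\begin{verbatim}
def flux_correct(q_tilde, F_coarse, F_fine, dt, dx):
    """Correct an uncorrected coarse value q_tilde adjacent to a
    fine region: remove the coarse estimate of the interface flux
    and apply the fine-grid one. The sign shown assumes the flux
    leaves the coarse cell through the shared face; the opposite
    face orientation flips the sign."""
    return q_tilde - (dt / dx) * (F_coarse - F_fine)
\end{verbatim}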
\subsection{Hierarchy reconstruction}
\label{sec:hierarchy_reconstruction}
Since the addition of more highly refined grids is adaptive, the
conditions for refinement must be specified. Many different criteria
can be simultaneously specified for a given simulation, and these
refinement criteria are discussed in more detail in the following
section (Section~\ref{sec:refinement_criteria}). Once all cells in a
given grid that require additional refinement have been flagged,
rectangular boundaries are determined that minimally encompass the
flagged cells. This is done using an iterative machine-vision
technique, as described in \cite{Berger91}.
We first find the smallest region (or ``proto-subgrid") that covers
all of the refined cells on a grid. We compute an efficiency
$\epsilon$ for this grid, defined as the ratio of flagged cells to
total cells. If this efficiency is sufficiently large (typically over
30\%), or if the grid is small enough, then we adopt the region as a
new grid and move on to the next region (if any). If the region is
not acceptable, then we split it into two using the following
algorithm. First, we determine the longest dimension and compute the
signature along that dimension. The signature is found by summing the
number of refined cells with the same index along the longest
dimension. Edges are typically indicated by inflection points in this
signature, so we calculate the second derivative using a simple
three-point finite difference estimator and look for the maximum in
the second derivative, with ties going to the index closest to the
center. If no inflection points are found along the longest axis,
the next-longest axis is searched. If no inflection points are found
on any axis, the longest axis is simply cut in half. The original
region is then split into
two new regions, or proto-subgrids. This process is then re-applied
to each of the new regions.
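
One bisection step of this procedure is easy to sketch (Python/NumPy;
an illustration of the idea rather than \enzo's implementation):
\begin{verbatim}
import numpy as np

def split_index(flags):
    """Compute the signature of the flagged cells along the longest
    axis and cut at the zero crossing of its second derivative with
    the largest jump (ties broken toward the center); halve the
    axis if no crossing exists."""
    axis = int(np.argmax(flags.shape))
    other = tuple(i for i in range(flags.ndim) if i != axis)
    sig = flags.sum(axis=other)                # the "signature"
    d2 = sig[2:] - 2 * sig[1:-1] + sig[:-2]    # 2nd derivative
    n = flags.shape[axis]
    best, best_key = n // 2, None              # fallback: halve
    for k in range(len(d2) - 1):
        if d2[k] * d2[k + 1] < 0:              # inflection point
            key = (abs(d2[k + 1] - d2[k]), -abs(k + 2 - n / 2))
            if best_key is None or key > best_key:
                best, best_key = k + 2, key    # cut index along axis
    return axis, best

flags = np.zeros((8, 16), dtype=bool)
flags[2:4, 2:6] = True                         # two separate clumps
flags[5:7, 10:14] = True
print(split_index(flags))                      # (1, 6): cut x at 6
\end{verbatim}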
This algorithm is not necessarily optimal, but experience shows it
does a reasonable job of generating an efficient covering of the
refined cells. Note that some cells are unnecessarily refined with
the SAMR approach, resulting in wasted resources (unlike in
cell-splitting AMR schemes). However, the approach has a number of
advantages. First, traditional grid-based hydro methods can be
incorporated with only minor modifications. Second, it is easier to
accommodate larger stencils (as in PPM, which requires three boundary
zones). Third, it can be more efficient on some machines, since
memory accesses are more predictable and linear, and therefore make
better use of the CPU's cache.
\subsection{Refinement criteria}
\label{sec:refinement_criteria}
In this section, we turn to the refinement criteria themselves.
Although generalized refinement criteria are available based on
estimators for the error, such as the Richardson method
\citep{AtkinsonHan2004}, we have focused generally on methods that use
physical estimators based on the assumption that the simulator has a
specific quantity or quantities that indicate the need for refinement.
The flagging methods currently built into \enzo\ are as follows (a
sketch combining two of them appears after the list):
\begin{enumerate}
\item{\em Slope.} If the normalized slope $(q_{i+1} - q_{i-1})/ (2
q_i)$ of baryon quantities between adjacent cells is larger than a
specified value (typically 0.3), then the cell is flagged.
\item{\em Baryon mass.} This refinement criterion is designed to
mimic a Lagrangian method in that it tries to keep a fixed mass
resolution. In this method, a cell is refined if the baryonic mass
in the cell ($M_g = \rho\, (\Delta x)^d$) is larger than a specified value:
\begin{equation}
M_g > \rho_{\rm flag}\, (\Delta x_{\rm root})^d r^{\epsilon_l l},
\end{equation}
where $\rho_{\rm flag}$ is the equivalent density on the root grid
required for refinement, $\Delta x_{\rm root}$ is the root grid cell
spacing, $r$ is the refinement factor, $l$ is the level and
$\epsilon_l$ is a parameter that can be used to make the refinement
more aggressive ($\epsilon_l < 0$) or less aggressive ($\epsilon_l >
0$) than pure Lagrangian refinement --- in other words,
super-Lagrangian or sub-Lagrangian refinement.
\item{\em Shocks.} In this method, we identify and refine cells that
contain shocks using the following criteria:
\begin{eqnarray}
(p_{i+1} - p_{i-1})/\min(p_{i+1}, p_{i-1}) & > & 0.33, \nonumber \\
u_{i-1} - u_{i+1} & > & 0, \quad \text{and} \nonumber \\
e_i / E_i & > & 0.1. \nonumber
\end{eqnarray}
We note that the parameters $0.33$ and $0.1$ in the equations above
were determined empirically, and can be modified by the user to refine
more or less aggressively, as the situation warrants.
\item{\em Particle mass.} This refinement criterion is precisely the
same as the baryon mass criterion except that it uses the gridded
(cloud-in-cell, or CIC) particle density. In general, this criterion
is used in cosmological simulations, and thus refines on dark matter
and stellar density.
\item{\em Jeans length.} Following \cite{Truelove98}, the code can be
forced to resolve the Jeans length with a fixed number of cells;
more specifically, a cell is refined whenever the following criterion
is not met:
\begin{equation}
\Delta x < \left( \frac{\gamma \pi k_B T}{N_J^2 G \rho \mu \mh} \right)^{1/2},
\end{equation}
where $N_J$ is the required number of cells per Jeans length (4 by default).
\item{\em Cooling time.} Often in cosmological simulations, cooling
is a rapid process and we do not necessarily want to be limited by
the cooling time. However, in some cases it is desirable to resolve
the cooling process, in which case we flag cells that have a cooling
time shorter than their sound crossing time. The cooling time is
given by $t_{\rm cool} = 1.5 k_BT / (n\, \Lambda(T))$ and the sound
crossing time is $t_{\rm cross} = \Delta x / c_s$, where $c_s$ is
the sound speed.
\item{\em Must-refine particles.} In order to allow for refinement in
moving regions, it is possible to generate ``must-refine" particles
that ensure that the 8 cells nearest each particle are refined to a
specified level. The particles feel gravitational forces and their
trajectories are followed as for any other collisionless particle.
This feature can be used to ensure that a specific Lagrangian region
stays refined to a specified level of resolution.
\item{\em Shear.} This criterion was originally used in
\citet{Kritsuk06} and was later modified into a more general form.
Turbulence calculations and Kelvin-Helmholtz instabilities can be
successfully modeled with AMR if the shear is used as a refinement
parameter; in particular, refinement occurs when the finite
difference version of the following inequality holds:
\begin{equation}
\left( \frac{\partial v_x}{\partial y} + \frac{\partial v_y}{\partial x} \right)^2 +
\left( \frac{\partial v_z}{\partial y} + \frac{\partial v_y}{\partial z} \right)^2 +
\left( \frac{\partial v_x}{\partial z} + \frac{\partial v_z}{\partial x} \right)^2
> \left(\epsilon_s \frac{c_s}{\Delta x}\right)^2
\end{equation}
where $\epsilon_s$ is a dimensionless user-defined parameter and $\Delta x$
is the local cell size at the level where refinement is occurring.
\item{\em Optical depth.} This refinement criterion, typically used
for simulations using radiation transport, flags cells for refinement
if they have an \ion{H}{1} optical depth greater than unity. The
optical depth of a cell is estimated to be:
\begin{equation}
\tau = \sigma_{\rm HI} n_{\rm HI} \Delta x
\end{equation}
where $\sigma_{\rm HI}$ is the neutral atomic hydrogen absorption
cross section, $n_{\rm HI}$ is the proper \ion{H}{1} number density,
and $\Delta x$ is the cell width (which is assumed to be the same
along each axis). If $\tau > 1$, the cell is refined further, up to
the maximum level of refinement.
\item{\em Resistive length.} In a way similar to the Jeans length
criterion described above, the code can be forced to locally resolve
the resistive length scale by a minimum of a fixed number of cells.
This is done by ensuring that the resistive length scale, defined as
\begin{equation}
L_R = |B| / |\curl B|
\end{equation}
is always resolved by at least $N_R$ cells locally (i.e.~we refine a
given cell further if $L_R \leq N_R \Delta x$). Note that \enzo\ does
not actually solve the resistive MHD equations -- this is the
``characteristic'' local resistive length, rather than one calculated
within the MHD solver.
\item{\em Must refine region.} This refinement criterion is used to
ensure that all cells in a given subvolume of the simulation are
refined to {\it at least} some minimum level of refinement. The
subvolume is user-specified, and may evolve over time (e.g., to
follow a structure of interest that is traveling through the
simulation volume).
\item{\em Metallicity.} This refinement criterion ensures that all
cells with a metallicity (defined as $Z \equiv \rho_{Z} / (0.022\,
\rho_{b})$, where $\rho_{Z}$ is the density of metals in the cell,
$\rho_{b}$ is the total baryon density including metals, and 0.022 is
the solar mass fraction of metals) above a user-defined value are
refined to {\it at least} some minimum level of refinement.
\item{\em Second derivative.} This is similar to the refinement
criterion used in Paramesh and was originally developed by
\cite{1987nasa.reptQ....L}. This method calculates the second
derivative of a user-defined field, normalized by the first
derivative, and sets a threshold for refinement. More background
information can be found in the FLASH4 user
guide\footnote{\url{http://flash.uchicago.edu/site/flashcode/user\_support/flash4\_ug/}},
but we reproduce the relevant equation here for completeness. The
multi-dimensional version of the criterion takes into account all
cross-derivatives, and is expressed as
\begin{equation}
S = \left\{ \frac{\sum_{ij}\left( \frac{\partial^2f}{\partial x_i \partial x_j} \right)^2}{\sum_{ij}\left[ \frac{1}{2\Delta x_j}\left( \abs{\frac{\partial f}{\partial x_i}}_{x_i+\Delta x_i} + \abs{\frac{\partial f}{\partial x_i}}_{x_i-\Delta x_i} \right) + \epsilon \frac{\abs{\bar{f_{ij}}}}{\Delta x_i \Delta x_j} \right]^2} \right\}^{1/2},
\label{eq:second-deriv_analytical}
\end{equation}
where $f$ is the user-defined field, $\abs{\bar{f_{ij}}}$ is an average of
$f$ over cells in the $i$ and $j$ dimensions, and $\epsilon$ is
a user-controlled value that provides a method of ignoring small
fluctuations, set by default to $10^{-2}$. If $S$ is larger than a
specified threshold (bounded by $[0,1]$), the cell is marked for
refinement. This threshold
is chosen at runtime and can vary for each of the fields under consideration.
\end{enumerate}
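
As a concrete illustration, a flagging pass that combines the slope
and baryon mass criteria might look like the following (hypothetical
Python on a one-dimensional strip of cells; the parameter names are
ours, not \enzo's):
\begin{verbatim}
import numpy as np

def flag_cells(rho, q, dx_root, level, r=2, d=3,
               slope_limit=0.3, rho_flag=1.0, eps_l=0.0):
    """Union of two of the refinement criteria described above."""
    dx = dx_root / r ** level
    flags = np.zeros(rho.shape, dtype=bool)
    # Slope: |q[i+1] - q[i-1]| / (2 q[i]) above the limit.
    slope = np.abs(q[2:] - q[:-2]) / (2.0 * q[1:-1])
    flags[1:-1] |= slope > slope_limit
    # Baryon mass: rho * dx**d above the level-scaled threshold
    # rho_flag * dx_root**d * r**(eps_l * level).
    flags |= rho * dx**d > rho_flag * dx_root**d * r**(eps_l * level)
    return flags
\end{verbatim}
Cells flagged by any criterion are then handed to the grid-splitting
machinery described in Section~\ref{sec:hierarchy_reconstruction}.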
Finally, it is also possible to specify a rectangular region that is
statically refined to a certain level. These static refined regions
are in place throughout the simulation regardless of the grid
quantities and can be used for testing purposes, or for
multiply-zoomed initial conditions (in cosmology simulations, for
example) that would not otherwise be refined.