A new adiabatic kernel for the MC2 model

Atmosphere-Ocean ISSN: 0705-5900 (Print) 1480-9214 (Online) Journal homepage: http://www.tandfonline.com/loi/tato20 A new adiabatic kernel for the M...
Author: Molly Curtis
9 downloads 0 Views 527KB Size
Atmosphere-Ocean

ISSN: 0705-5900 (Print) 1480-9214 (Online) Journal homepage: http://www.tandfonline.com/loi/tato20

A new adiabatic kernel for the MC2 model S.J. Thomas , C. Girard , R. Benoit , M. Desgagné & P. Pellerin To cite this article: S.J. Thomas , C. Girard , R. Benoit , M. Desgagné & P. Pellerin (1998) A new adiabatic kernel for the MC2 model, Atmosphere-Ocean, 36:3, 241-270, DOI: 10.1080/07055900.1998.9649613 To link to this article: http://dx.doi.org/10.1080/07055900.1998.9649613

Published online: 19 Nov 2010.

Submit your article to this journal

Article views: 58

View related articles

Citing articles: 9 View citing articles

Full Terms & Conditions of access and use can be found at http://www.tandfonline.com/action/journalInformation?journalCode=tato20 Download by: [37.44.207.111]

Date: 26 January 2017, At: 08:04

A New Adiabatic Kernel for the MC2 Model S.J. Thomas1 , C. Girard, R. Benoit, M. Desgagne and P. Pellerin Recherche en prevision numerique (RPN), Environnement Canada, 2121, route Transcanadienne, Dorval, Quebec, Canada, H9P 1J3

[Original manuscript received 7 January 1997; in revised form 20 April 1998]

ABSTRACT Traditional semi-implicit formulations of nonhydrostatic compressible models may not be stable in the presence of steep terrain when pressure gradient terms are split and lagged in time. If all pressure gradient terms and the divergence are treated implicitly, the resulting wave equation for the pressure contains off-diagonal cross-derivative terms leading to a highly nonsymmetric linear system of equations. In this paper we present a more implicit formulation of the Mesoscale Compressible Community (MC2) model employing a Generalized Minimal Residual (GMRES) Krylov iterative solver and a more efþcient semiLagrangian advection scheme. Open boundaries now permit exact upwind interpolation and the ability to reproduce simulations to machine precision is illustrated for one-way nesting at equivalent resolution. Numerical simulations of hydrostatic and nonhydrostatic mountain waves demonstrate the stability and accuracy of the new adiabatic kernel. The computational efþciency of the model is reported for 1D Jacobi and 3D Alternating Direction Implicit (ADI) line relaxation preconditioners implemented with a parallel data transposition strategy.  RESUM E Les formulations semi-implicites traditionnelles des modeles compressibles non hydrostatiques ne sont pas toujours stables en presence d'une topographie accentuee lorsque les termes du gradient de pression sont fractionnes et dephases. Si tous les termes du gradient de pression et la divergence sont traites de fa¾con implicite, l'equation d'onde resultante pour la pression comprend des termes de derivees croisees hors diagonale menant a un systeme d'equations lineaires fortement non symetrique. Dans cet article, nous presentons une formulation plus implicite du modele de mesoechelle compressible communautaire (MC2) utilisant un schema d'advection semi-lagrangien plus efþcace et un solutionneur iteratif de type Krylov cherchant a minimiser le residu de l'equation elliptique ( «Generalized Minimal Residual - GMRES »). Les frontieres ouvertes permettent maintenant une interpolation exacte a la position amont et nous donnent l'occasion ainsi de reproduire des simulations a la precision de la machine dans le cas du pilotage unidirectionnel a resolution e quivalente. Les simulations d'ondes orographiques hydrostatiques et non hydrostatiques demontrent la stabilite et la precision du nouveau coeur adiabatique. On rapporte l'efþcacite computationnelle du modele lorsqu'on utilise un preconditionnement de type Jacobi unidimensionnel

||||| 1 Corresponding author: Scientiþc Computing Division, National Center for Atmospheric Research (NCAR), P.O. Box 3000, Boulder, CO 80307, U.S.A.

ATMOSPHERE-OCEAN 36 (3) 1998, 241–270 0705-5900/98/0000-0241$1.25/0 © Canadian Meteorological and Oceanographic Society

242 / S.J. Thomas et al. ainsi qu'une technique de relaxation en ligne de type ADI tridimensionnel introduits avec une strategie de transposition de donnees en parallele.

1 Introduction The early development of numerical models for weather prediction was concerned with large-scale synoptic dynamics. These ows are characterized by slow-moving Rossby waves and quasi-geostrophic enstrophy-cascade energy spectra. Traditionally, atmospheric models were based on incompressible governing equations and the hydrostatic approximation. Atmospheric models generally avoid using the fully compressible governing equations since high-frequency acoustic waves impose severe time step restrictions. Acoustic and gravity waves carry very little energy in comparison with the rotational Rossby modes associated with large-scale atmospheric motion. In fact, gravity waves are regarded as noise in synoptic-scale global weather prediction models. However, rotational-wave mode interactions are important in meso and microscale models. In the atmosphere, this occurs at scales below 300{400 km where phenomena such as gravity wave breaking become important. In order to study these effects a new generation of nonhydrostatic limited area atmospheric models based on the anelastic or fully compressible Euler/Navier-Stokes equations has been developed by the atmospheric modelling community. The Mesoscale Compressible Community (MC2) model is a fully compressible nonhydrostatic limited area atmospheric model used in Canadian Universities and at Environment Canada for mesoscale and microscale atmospheric research (Benoit et al., 1997). A detailed description of the original numerical formulation is given by Laprise et al. (1997). In particular, the model employs a fully 3D semi-implicit semi-Lagrangian time discretization scheme and a generalized terrain-following coordinate system. The origins of MC2 can be traced to mesoscale and microscale models developed over the past two decades. In particular, Clark (1977) introduced the terrain-following vertical coordinate transformation of Gal-Chen and Sommerville (1975) into an anelastic model for small-scale studies. This transformation is now widely used in nonhydrostatic mesoscale models. It was also implemented in a compressible model for the study of moist mountain waves described by Durran and Klemp (1983) who use the Klemp-Wilhelmson (1978) split-explicit time integration scheme. A semi-implicit scheme was þrst applied in a compressible limited area model by Tapp and White (1976) to treat acoustic waves and was later extended by Cullen (1990) to retard the fastest-moving gravity waves. A two-timelevel semi-Lagrangian time integration scheme was introduced into this model by Golding (1992). The work of Ikawa (1988) illustrates that stability problems may arise with traditional semi-implicit formulations in the presence of steep terrain and Skamarock et al. (1997) were thus motivated to treat all horizontal pressure gradient terms implicitly. The resulting elliptic problem for the pressure contains cross-derivative terms with variable coefþcients and is solved with the nonsymmetric generalized conjugate residual GCR(k) Krylov iterative method described

A New Adiabatic Kernel for the MC2 Model / 243 in Smolarkiewicz and Margolin (1994, 1997) and Smolarkiewicz et al. (1997). The new formulation of the MC2 model described herein now employs the mathematically equivalent Generalized Minimal Residual (GMRES) Krylov solver of Saad and Schultz (1986). 2 Model formulation The MC2 model is an extension of the fully compressible limited-area model developed by Tanguay et al. (1990). The model employs a non-orthogonal coordinate system with a generalized terrain-following height coordinate based on the transformation   z h(XÙ Y ) ζ(XÙ Y Ù z) › H H h(XÙ Y ) introduced by Gal-Chen and Sommerville (1975), where h(XÙ Y ) is the height of the topography. A generalized vertical coordinate is obtained by applying a second mapping to ζ, thus allowing for an arbitrary spacing of vertical levels. The mapping Z(ζ) is strictly a function of ζ and is monotonic, increasing from the bottom ζ › 0 to the top ζ › H of the atmosphere. Following standard conventions, the Jacobian G0 and metric coefþcients GIJ of the transformation are denoted

 G0 ›

∂z ∂ζ



 ∂ζ Ù ∂Z

 G13 ›

∂Z ∂ζ



 ∂ζ Ù ∂X

 G23 ›

∂Z ∂ζ



∂ζ ∂Y



By introducing a polar stereographic projection at reference latitude φ0 with map factor m › (1 + sin φ0 )Û(1 + sin φ), S › m2 , and Coriolis parameter f › 2Ω sin φ, the compressible governing equations in projected X › (X Ù Y Ù z) coordinates then become

DU › fV Dt

DT DT Dt Dt

DU DV Dt DV Dw Dt Dt Dw Dt RT Dq RT Dq cp Dt cp Dt cv Dq cp Dt

K

∂S ∂X

RT

∂q + FU ∂X

∂q ∂q ∂S∂S RTRT + F+U FV › fVfU K K ∂X∂ Y ∂X∂ Y ∂q ∂S › fU K ∂q + FV RT Y Y ∂ ∂ › g RT + Fw ∂z ∂q › g RT + Fw ∂z Q › Q › cp cp   Q ∂U ∂V ∂w › S Ø + + X ∂Y cp T ∂∂X ∂∂zz

(1) (1)

244 / S.J. Thomas et al. Partial derivatives appearing above are to be interpreted for terrain-following coordinates X › (X Ù Y Ù Z) as follows       ∂ ∂ ∂ ∂ 13 › › +G ∂X ∂X z ∂X Z ∂Z X       ∂ ∂ ∂ ∂ › › + G23 ∂Y ∂Y z ∂Y Z ∂Z Y   1 ∂ ∂ ∂ › › G 0 ∂Z ∂z ∂z z and the Lagrangian derivative in transformed coordinates is

D ∂ › + SU Dt ∂t



∂ ∂X



 + SV

Z

∂ ∂Y

 +W Z

∂ Ø ∂Z

(2)

The contravariant vertical velocity W is related to the covariant velocity components by the equation, W › G0 1 w + S(G13 U + G23 V ). Potential temperature is Θ › Te κq , and Π › ( p Ûp00 )κ is the Exner function, where T › ΠΘ, q › ln( pÛp00 ), p00 › 1000 mb. R and cp are the gas constant and heat capacity for dry air at constant pressure, κ › RÛcp . U, V and w are the wind images in projected (X Ù Y Ù z) coordinates and g is the gravitational acceleration. K › (U 2 + V 2 )Û2 is the pseudo kinetic energy per unit mass. Momentum ( FU Ù FV Ù Fw ) and heat Q sources or sinks are also included. Lateral boundary conditions are obtained either from a global model or as part of a one-way nesting strategy within a limited area domain. The new formulation of MC2 now employs in ow/out ow or open lateral boundary conditions as opposed to free-slip or solid walls to close the system of partial differential equations (PDEs) and thus obtain a well-posed problem according to Oliger and Sundstrom (1978). The numerical implementation of lateral as well as consistent top and bottom boundary conditions is presented in the appendices. Over-speciþcation of boundary data is still a potential source of errors. A Davies type boundary relaxation scheme is applied at the end of a time step, where model variables computed on the interior of the computational domain are relaxed to environmental or driving model values such as Ue near the lateral boundaries (Davies, 1976). Such a scheme is sometimes referred to as a gravity wave `absorber' since the overall effect is to damp spurious wave re ections off the lateral boundaries due to a change in grid resolution or over-speciþcation of boundary data.

3 Semi-implicit semi-lagrangian scheme To overcome the severe numerical stability constraints imposed by gravity and acoustic waves in explicit time-stepping schemes, a fully 3D semi-implicit semiLagrangian time discretization is employed. A semi-Lagrangian time integration

A New Adiabatic Kernel for the MC2 Model / 245

scheme approximates the material derivative (2) of a scalar þeld ψ(XÙ t), along trajectories deþned by dX › SU (XÙ t)Ù dt

dY › SV (XÙ t)Ù dt

dZ › W (XÙ t) dt

(3)

A centred-in-time approximation of D ψÛDt on a trajectory originating at the upwind departure point (X Ù tn 1 ) and terminating at a grid point (X+ Ù tn+1 ) is Dψ ψ(X+ Ù tn+1 ) ψ(X Ù tn 1 )  Dt ψ › Ø Dt 2∆t

(4)

Interpolation is used to compute ψ(X Ù tn 1 ) at the upwind point. In essence, the method depends on the accurate backward integration of (3) to obtain displacements a › (αÙ βÙ γ) where X › X+ 2a and X0 › X+ a are the departure point and midpoint. To integrate (3), Robert (1981) combined a þrst-order estimate of the displacements with two iterations of a second-order Runge-Kutta or midpoint integration method. Since the velocity is interpolated between grid points, this approach is very costly in 3D. Smolarkiewicz and Margolin (1997) demonstrate that Adams predictor-corrector type methods similar to the one proposed by Malevsky (1996) provide the same accuracy and are far more efþcient. In this case, the velocity is obtained by linear extrapolation

U (X+ Ù tn+1 ) › 2U (X+ Ù tn )

U (X+ Ù tn 1 )

V (X+ Ù tn+1 ) › 2V (X+ Ù tn )

V (X+ Ù tn 1 )

W (X Ù tn+1 ) › 2W (X Ù tn )

W (X Ù tn 1 )

+

+

(5)

+

followed by one iteration of a midpoint corrector to obtain displacements

α › ∆tSU (X+

∆tSU (X+ Ù tn+1 )Ù tn )

β › ∆tSV (X+

∆tSV (X+ Ù tn+1 )Ù tn )

γ › ∆tW (X+

∆tW (X+ Ù tn+1 )Ù tn )

Semi-Lagrangian advection schemes can be interpreted as Eulerian þnite difference schemes that have been shifted to the upwind grid cell containing the foot of a characteristic curve. Interpolation of a þeld ψ within this cell includes all high order cross terms in the Taylor series expansion of ψ about the departure point. Consider a regular Cartesian grid with uniform spacing ∆x. At the upwind grid cell [xi 1 Ù xi ] containing the departure point, the piecewise cubic polynomial interpolating ψ is p(x) › ψi 1 (1 where ψxx i › ψi+1

x^ ) + ψi x^ + βi 1 ψxx i

2ψi + ψi 1 , x^ › (x

1

xi 1 )Û∆x and

+ βi ψxx i

(6)

246 / S.J. Thomas et al.

βi

1

›

1 x^ (1 6

x^ )(2

x^ )Ù

βi ›

1 x^ (1 6

x^ )(1 + x^ )Ø

Computational costs can be signiþcantly reduced in 2D and 3D by neglecting higher-order terms which do not affect the global truncation error. For example, Ritchie et al. (1995) construct `truncated' Lagrange polynomials by dropping O (∆x 2 ∆y 2 ) terms. A truncated form of the Newton polynomial is described in Malevsky and Thomas (1997). When combined with a predictor-corrector ordinary differential equation (ODE) solver, this resulted in a 50% improvement in efþciency over the original semi-Lagrangian algorithms implemented in MC2. Despite the signiþcant improvement in computational efþciency, semi-Lagrangian advection may not be cost-effective at the mesoscale in the presence of topography, since the time step must respect the advective timescale (Bartello and Thomas, 1996). Indeed, numerical simulations of stationary hydrostatic mountain waves with MC2 indicate that for an advective Courant number C Ü 1 the numerical solutions become severely distorted (Hereil and Laprise, 1996). Further comparisons of the efþciency and accuracy of Eulerian versus semi-Lagrangian schemes are thus warranted at meso and microscale resolutions for different ow regimes. A semi-implicit, semi-Lagrangian time discretization of the compressible governing equations implies that the system of prognostic equations is integrated according to

Dψ + L(ψ(XÙ t)) › R(ψ(XÙ t)) + F(XÙ t) Dt

(7)

In discrete form, Dt ψ + L¼ t › R¼ x + F and þelds ψ appearing in the linear terms L are averaged in time L¼ t using either a centred or off-centred approximation. The effect of time averaging is to reduce the phase speed of the gravity waves and sound waves associated with these terms. A linear stability analysis of the semi-implicit scheme applied to the Euler equations is presented in the appendix of Tanguay et al. (1990). Nonlinear terms are normally evaluated at the trajectory midpoint R(ψ(X0 Ù t)). However, a spatial average R¼ x is applied along trajectories to stabilize the numerical response of the leap-frog scheme to nonlinear forcing (Tanguay et al., 1992). The combination of these space and time averages is sometimes referred to as an Eulerian treatment of mountains. The overall effect is to introduce mild numerical dissipation which damps the spurious resonant response of the semiimplicit semi-Lagrangian scheme to stationary forcing (see Ritchie and Tanguay, 1996). In the original numerical formulation of the MC2 model, pressure gradient terms were split into horizontal (treated implicitly) and vertical (treated explicitly) components. Existing mesoscale models have tended to split and lag the `off-diagonal' pressure gradient terms in time to avoid having to solve highly nonsymmetric systems of equations containing cross-terms. Such simplifying assumptions can lead to signiþcant errors in the presence of steep terrain (see Skamarock et al., 1997;

A New Adiabatic Kernel for the MC2 Model / 247

Ikawa, 1988). Indeed, with the advent of robust iterative Krylov methods for nonsymmetric systems of linear equations this is no longer necessary. Following the original model formulation, thermodynamic þelds are still decomposed into a basic state and perturbation, T › T  + T 0 and q › q + q0 . The basic state is a stationary isothermal atmosphere in hydrostatic equilibrium, ∂q › ∂z

g Ù RT 

q (z) › q0

gz RT 

with T  and q0 constant. In the semi-implicit scheme, linear terms responsible for acoustic and gravity waves are isolated on the left-hand-side of the governing equations (1) as in the system of ODEs given by (7). However, all pressure gradient terms are now treated implicitly. In addition, W › G0 1 w + S(G13 U + G23 V ) is now regarded strictly as a diagnostic relation.

DU ∂q0 + RT  › R U + FU Dt ∂X DV ∂q0 + RT  › R V + FV Dt ∂Y Dw ∂q0 + RT  Dt ∂z

g

T0 › R w + Fw T

(8)

DT 0 RT  Dq0 T  2 + N w › RT + FT Dt cp Dt g    cv Dq0 g ∂U ∂V ∂w +S w › Rq + Fq + + cp Dt c2 ∂X ∂Y ∂z where the Brunt-Vaisalla frequency N and sound speed c are deþned by N2 ›

g2 Ù cp T 

c2 ›

cp RT  Ø cv

The nonlinear right-hand side R terms are given by RU › fV

K

RV ›

fU

Rw ›

RT 0

RT › Rq › 0

∂S ∂X K

∂S ∂Y

RT 0

∂q0 ∂X

RT 0

∂q0 ∂Y

∂q0 ∂z     0 RT ∂w ∂U ∂V + + S cv ∂X ∂Y ∂z

(9)

248 / S.J. Thomas et al. and the remaining forcing F terms are   RT 0 Q Ù FT › 1 + cv T cp

Fq ›

Q Ø cp T

Each equation is discretized according to [ψ + ∆tL]+ › [ψ

∆tL + 2∆tF] + 2∆tR¼ x › Qψ

and the resulting time discretized form of the governing equations is therefore



∂q0 U + ∆tRT ∂X 

+ › QU

+ ∂q0 › QV V + ∆tRT ∂Y   + 0 T0  ∂q › Qw g  w + ∆t RT T ∂z   T 2 + RT  0 0 N w › QT q + ∆t T cp g    +  cv 0 ∂U ∂V + + ∆tD2 (w) › Qq q + ∆tS cp ∂X ∂Y 

where

 D1 ›

∂ ∂z

g cp T 





 Ù

D2 ›

∂ ∂z

g c2

(10)

 Ø

A complete dynamical time step of the adiabatic kernel of MC2 can be summarized as follows. First, components of the right-hand side terms Qψ are interpolated at departure points. Taking the divergence of momentum and then eliminating the remaining variables, an elliptic problem for q0 is obtained. In order to solve the resulting wave equation at time t + ∆t to machine precision rather than to truncation error it is essential to apply elimination to the space discretized versus the time discretized form of the above equations. The discrete form of the wave equation now employed in the model is detailed in Appendix B and is presented in the next section. Having solved for q0 , the remaining þelds are updated via back substitution. A weak Robert time þlter is then applied to damp the computational solution introduced by the leapfrog scheme (Asselin, 1971). 4 Space discretization The governing equations are discretized using centred second order þnite-differences on an Ni Nj Nk Arakawa `C' type grid (Laprise et al., 1997) with uniform spacing

A New Adiabatic Kernel for the MC2 Model / 249 (∆XÙ ∆Y Ù ∆Z). Variable spacing in the vertical direction is made possible by the introduction of the generalized co-ordinate Z . The standard convention in atmospheric modelling is to denote þelds located at grid cell centres by q0i jk . Staggered wind images are denoted Ui 1Û2Ù jÙk , in the X-direction, ViÙ j 1Û2Ùk , in the Y -direction, and wiÙ jÙk 1Û2 , in the Z-direction. The space discretized form of equations (10) is summarized below along with the main þnite difference operators which now explicitly include the Gal-Chen transformation. h i+ 6 Nj › [QU ]i 1 Ù jÙk j › U + ∆tRT  δ~ X q0 1

h   

i 2 Ù jÙk

V + ∆tRT  δ~ Y q0

 w + ∆t RT  δ~ Z q0

T0

g

T0 T

i+

iÙ j 12 Ùk

2

› [QV ]iÙ j

1 Ùk 2

6 Ni i›

+

T 2 RT  N w µZ q 0 + ∆t cp g 

iÙ jÙk

1 2

iÙ jÙk

1 2

+

cv 0 ~ 2 (w) q + ∆tS(δ~ X2 U + δ~ Y 2 V ) + ∆t D cp

+

› [Qw ]iÙ jÙk

1 2

› [QT ]iÙ jÙk

1 2

› [Qq ]iÙ jÙk

6 Ni Ù j › 6 Nj i›

(11)

6 Ni Ù j › 6 Nj i› 6 Ni Ù j › 6 Nj . i›

iÙ jÙk

By combining Qw and QT as follows, [Qw ]i Ù jÙk

1 2

›

h i 1 g Q + Q ∆t w T 1 + ∆t 2 N2 T i Ù j Ùk

T 0 is eliminated from the above equations  ++ ∆tRT  ~ 00 (q )) D1(q w+ D N22 1 1 + ∆t 22N iiÙÙjjÙÙkk

11 22

› [Q [Qww ]iiÙÙjjÙÙkk ›

11 22

1 2

Ù

Ø

Eliminating U, V and w by taking the divergence of (QU Ù QV ) and then combining with D~ 2 (Qw ) and Qq , leads to   1 + ∆t 2 N2  2 [∆tD~ 2 (Qw ) + ∆tS(δ~ X2 QU + δ~ Y 2 QV ) Qq ]i Ù jÙk Ø [Qq ]iÙ jÙk › ∆Z RT  ∆t 2

The discrete form of the elliptic operator for [L q0 ]i jk is thus obtained   +  1 0 2 ~ ~ 2 2 2 ~ › [Qq ]iÙ jÙk Ø q0 ∆Z D2 D1 (q ) + (1 + ∆t N ) S = ∆t 2 c2 i Ù j Ùk

(12)

These equations are only valid strictly away from the boundaries. Finite difference operators are summarized below.

250 / S.J. Thomas et al.

1 δZ Ù δ~ Z › G0

g µZ Ù cp T 

~ 1 › δ~ Z D

~ 2 › δ~ Z D

g µZ Ù c2

δ~ X › δX + G13 µXZ δZ Ù

δ~ X2 › δX

F 1 µX +

1 µZ δZ G1 µX Ù G0

δ~ Y › δY + G23 µYZ δZ Ù

δ~ Y 2 › δY

F 2 µY +

1 µZ δZ G2 µY Ù G0

2 =~ › δ~ X2 δ~ X + δ~ Y 2 δ~ Y

where δX and µX are the standard difference and averaging operators. X

Y

G1 › G0 G13 Ù G2 › G0 G23 Ù F1 › The horizontal derivative operators δ~ X2

1 XYZ

YZ

δX G0 Ù F2 ›

1 XYZ

XZ

δY G0 Ø

G0 G0 and δ~ Y 2 and related discrete divergence

div(V) › S[δ~ X2 U + δ~ Y 2 V ] + δ~ Z w are part of the original model formulation (Laprise et al., 1997). The divergence div(V), metric factors G1 and G2 , along with the F1 and F2 terms are presented in Appendix A. Several alternative implementations of semi-Lagrangian advection are possible on a staggered Arakawa `C' grid. The MC2 model adopts the approach described in the staggered staggered wind wind images imagesare areinterpolated interpolatedtotogrid gridcell cellcentres centrs Robert (1993), where the and displacements a are computed there. A stencil analogous to (6) in 3D is then computed to obtain the upwind components of Qψ . Displacements are interpolated onto the corresponding grid before components of the right-hand side terms Qψ are advected from their upwind positions at time tn 1 to an Eulerian grid point at time tn+1 . Interpolation of the displacements may result in a loss of accuracy. Smolarkiewicz and Margolin (1997) were thus motivated to use a non-staggered Arakawa `A' grid in an Eulerian/semi-Lagrangian formulation of a Lipps-Hemler anelastic model. 5 Boundary conditions Lateral boundaries are now speciþed as open with in ow/out ow determined by the normal components of the velocity (as opposed to solid walls and a freeslip condition). Gradients of the remaining variables are then speciþed to close the problem. Consistent lateral as well as top and bottom boundary conditions require some care when applying the Gal-Chen transformation. To a large extent the staggered Arakawa `C' grid and the semi-implicit scheme determine the precise numerical form of boundary conditions. In the horizontal direction, the normal velocity components U 1 Ù jÙk in the west, UNi 1 Ù jÙk in the east, ViÙ 1 Ùk in the south and 2

2

2

A New Adiabatic Kernel for the MC2 Model / 251 ViÙNj 1 Ùk in the north intersect the boundary. For the discrete system of equations 2 (11), the relations [U ]+1 Ù jÙk › [QU ] 1 Ù jÙk › [Ue ] 1 Ù jÙk Ù

[U ]+N

[V ]+iÙ 1 Ùk › [QV ]iÙ 1 Ùk › [Ve ]iÙ 1 Ùk Ù

[V ]+iÙN

2

2

2

2

2

2

i

1Ù 2 jÙk

j

1 Ùk 2

› [QU ]Ni

› [QV ]iÙNj

1Ù 2 jÙk 1 Ùk 2

› [Ue ]Ni

› [Ve ]iÙNj

1Ù 2 jÙk 1 Ùk 2

along with the implied Neumann boundary conditions below for the semi-implicit wave equation, effectively close the numerical problem. [δ~ X q0 ]+1 Ù jÙk › [δ~ X q0 ]+N 2

i

1Ù 2 jÙk

› [δ~ Y q0 ]+iÙ 1 Ùk › [δ~ Y q0 ]+iÙN 2

j

1 Ùk 2

› 0Ø

Semi-Lagrangian advection implies upwind interpolation across lateral boundaries for in ow regions. Upwind information is therefore required outside of the domain for all variables (including normal wind components), at internal grid points immediately adjacent to the lateral boundaries. Assuming an advective Courant number C Ú 1, values of all variables at times t and t ∆t are speciþed at three additional grid rows surrounding the computational domain. Thus, advection for external upwind points is computed in exactly the same way as for internal upwind points. In Eulerian advection schemes, this is equivalent to specifying a þnite difference at the boundary, which in turn constitutes a numerical approximation of the gradient across the boundary at in ow points. For semi-Lagrangian advection, gradients 6 0Ù [δ~ X q0 ] 1 Ù jÙk › 2

[δ~ X q0 ]Ni

1Ù Ù 2 jk

› 6 0Ù

6 0Ù [δ~ Y q0 ]iÙ 1 Ùk › 2

[δ~ Y q0 ]iÙNj

1Ù 2 k

› 60

at times t and t ∆t must differ from zero at lateral boundaries if mass is to be properly advected into or out of the domain (in contrast with the Neumann condition above). In the vertical direction, at levels 1Û2 and Nk +1Û2, the normal velocity component corresponding to the contravariant component of the vertical motion is set to W › 0. At the model lid, this implies that the covariant component wiÙ jÙNk + 1 itself vanishes 2 but at the surface, [G0 1 w + S(G13 µX U + G 23 µY V )]+iÙ jÙ 1 › 0Ø 2

Finally, at or near the vertical boundaries, some operators must be speciþed in the absence of grid points. In particular, this is the case in the above relation since the horizontal wind components are not deþned at level 1Û2. The most cumbersome part of the new numerical formulation of the MC2 model is the elliptic operator L , which results in a highly nonsymmetric linear system of equations to solve every time step. The GMRES iterative elliptic solver now employed in MC2 is presented in the next section. The þnite difference form of the elliptic operator ~ 2 ), and including boundary conditions is detailed in Appendix B (horizontal part = ~ 1D ~ 2 ). The major differences between the original in Appendix C (vertical part D

252 / S.J. Thomas et al.

MC2 model formulation and the new or modiþed MC2 should now be apparent. Fundamentally, no special treatment is accorded metric terms arising from the terrain-following transformation. Simpliþcation of the above linearization process and numerical stability in steep terrain are achieved at the expense of a more complex elliptic problem to solve every time step. The added complexity is partially compensated for by the fact that the equation relating W and w is now purely diagnostic and no longer averaged along trajectories. 6 Elliptic solvers The original implementation of the MC2 model employed a 3D variant of the Alternating Direction Implicit (ADI) method to solve the elliptic boundary value problem for the log pressure perturbation q0 . The solver was implemented as a Richardson iteration with Peaceman and Rachford (1955) cyclic acceleration parameters and ADI preconditioner. The elliptic operator arising from the new semiimplicit scheme described above contains off-diagonal cross-derivative terms and results in a highly nonsymmetric linear system of equations. In order to solve these equations, a robust iterative Generalised Minimal Residual (GMRES) Krylov type solver has been introduced. The original 3D ADI scheme is still retained, but is now employed as a preconditioner for the GMRES iteration, since the original elliptic operator is still contained within L . Smolarkiewicz and Margolin (1994) analyzed a class of iterative Krylov subspace solvers for the problem L (φ) › R on a domain Ω with non self-adjoint negative deþnite elliptic operator L . The authors derive Krylov solvers from variational principles applied to the pseudo-time integration of a high-order wave equation to steady state. The particular solvers they describe are essentially of the Generalized Conjugate Residual (GCR) type, but also allow for a more robust and exible `rightpreconditioning' strategy (see below for further details). The GCR algorithm is a minimal residual Krylov method which minimizes the L2 (Ω) norm of the residual r › L (φ) R at every iteration for negative deþnite elliptic operators L (Eisenstat et al., 1983; Elman, 1982; Elman, 1984). Saad and Schultz (1986) proposed the closely related GMRES algorithm. GMRES is mathematically equivalent to the GCR algorithm when the symmetric part of the elliptic operator L is negativedeþnite. In this case, the two algorithms produce an identical sequence of iterates. In general, GCR may break down when the underlying elliptic operator L is indeþnite which is a rare case in computational uid dynamics (CFD) applications. To minimize model execution time, a preconditioner must be found which accelerates the solver convergence rate without incurring a large computational overhead. The design of a suitable preconditioner is usually problem dependent. Preconditioners can be based on operator splittings such as the classical iterations (Richardson, Jacobi, Successive over Relaxation (SOR), ADI, etc.) or more sophisticated techniques such as multigrid can be employed. Rather than being viewed as competing solver technology, the excellent convergence rates achieved by multigrid methods can be improved by Krylov accelerators. The use of multigrid or multi-

A New Adiabatic Kernel for the MC2 Model / 253 level type preconditioners in the context of Krylov subspace methods is facilitated by variable preconditioning. Flexible variants of GMRES and GCR allow the preconditioner to vary at each inner iteration, a feature that is quite useful in domain decomposition type algorithms or in any parallel computing implementation. This also permits the introduction of a secondary iterative procedure as a right preconditioner. In standard non- exible techniques, these inner solutions must be exact or highly accurate within each subdomain. For exible Krylov solvers this does not have to be the case. The exible variant of the GMRES algorithm (FGMRES) developed by Saad (1993) is given below. Consider a linear system of the form Ax › b where A is a large sparse nonsymmetric real matrix of size n  m. The GMRES algorithm is a method which minimizes the L2 norm of the residual vector kb Axk2 by projecting the initial residual r0 › b Ax0 onto the Krylov subspace Km › Spanfr0 Ù Ar0 Ù . . . Ù Am 1 r0 g

When `right-preconditioning' is applied to the above linear system, a preconditioned linear system AM 1 (Mx) › b is solved implicitly. FGMRES allows the right preconditioner M to be different at each step j. a Algorithm for Flexible GMRES (FGMRES) 1. Start: Choose x0 and dimension m of the Krylov subspace. Deþne an (m+ 1)  m matrix H¼ m and initialize all its entries hiÙ j to zero. 2. Arnoldi process: (a) Compute r0 › b Ax0 , β › kr0 k2 and v1 › r0 Ûβ. (b) For j › 1Ù . . . Ù m do  Compute zj :› Mj 1 vj  Compute w :› Azj   hiÙ j :› (wÙ vi )  For i › 1Ù . . . Ù j, do w :› w hiÙ j vi  Compute hj+1Ù j › kwk2 and vj+1 › wÛhj+1Ù j . (c) Deþne Zm :› [z1 Ù . . . Ù zm ]. 3. Form the approximate solution: Compute xm › x0 + Zm ym where ym › argminy kβe1 H¼ m yk2 and e1 › [1Ù 0Ù . . . Ù 0]T . 4. Restart: If satisþed stop, else set x0 xm and goto (2).

The Arnoldi loop constructs an orthogonal basis of the preconditioned Krylov subspace Km › Spanfv1 Ù AM1 1 v1 Ù . . . Ù AMm 1 1 vm 1 g

(13)

254 / S.J. Thomas et al.

by a modiþed Gram-Schmidt process, where each new vector to be orthogonalized is generated from all previous basis vectors. When the preconditioner is constant, Mj › M for j › 1Ù . . . Ù m, the method is equivalent to the standard GMRES algorithm, right-preconditioned with M . The approximate solution xm obtained from this modiþed algorithm minimizes the residual norm kb Axm k2 over x0 +SpanfZm g. Simple and effective preconditioners to improve the convergence rate of a Krylov type iteration such as the FGMRES algorithm described above can be based on classical stationary methods such as the point Jacobi or Gauss-Seidel iterations. These methods are derived from a splitting of the coefþcient matrix A › D E F, where D is the diagonal of A, E is the strictly lower triangular part and F is the strictly upper triangular part of the matrix. Unless otherwise stated, the natural ordering of grid points is assumed. The Jacobi iteration is the simplest method and also avoids the possible introduction of the directional bias inherent to Gauss-Seidel Dx k+1 › (E + F)x k + b

(14)

Splitting methods such as this are linear þxed-point iterations of the form x k+1 › M

1

Nx k + M

1

b › (I

M

1

A)x k + M

1

b › xk + M

1

(b

Ax k ) (15)

where A › M N is a matrix splitting. It follows that M › D, N › A D for the Jacobi iteration, whereas M › D E, N › A D in Gauss-Seidel. Thus, (15) can be viewed as a `preconditioned' þxed-point iteration for the system of equations M 1 Ax › M 1 b, where M is the preconditioner. These same preconditioners can equally be applied to accelerate Krylov methods, where the matrix I M 1A 1 appearing in (14) is replaced by a matrix polynomial P(M A) 2 Km where Km is a Krylov subspace. The discretized discretised elliptic operator in a nonhydrostatic pressure solver will be dominated by the vertical terms when the aspect ratio ∆X Û∆Z is large. Therefore, an effective preconditioning strategy is to invert the vertical components of the elliptic operator. Skamarock et al. (1997) applied this strategy in a GCR Krylov type solver for nonhydrostatic models by using a vertical ADI line relaxation preconditioner. The ADI method of Peaceman and Rachford (1955) was originally proposed as a method for solving parabolic partial differential equations such as the heat conduction equation ut › ∆u + r with forcing term r. When this PDE is discretized by the method-of-lines, the resulting system of þrst-order ordinary differential equations with constant coefþcient matrix A is given by u0 › Au + r. In two dimensions, an ADI time integration scheme is based on the splitting A › H + V , where the H and V represent the horizontal and vertical components of the discrete elliptic operator based on centred second-order þnite differences,

 I

∆t H 2

 u

k+1Û2

  ∆t ∆t › I+ V uk + r 2 2

(16)

A New Adiabatic Kernel for the MC2 Model / 255

 I

∆t V 2

 k+1

u

  ∆t ∆t › I+ H uk+1Û2 + rØ 2 2

(17)

An iteration for solving elliptic problems can be derived if the time step ∆tÛ2 is replaced with an acceleration parameter β. By cycling through a sequence of β across the spectrum of A, fast convergence rates may be achieved and the original MC2 solver was based on a stable generalization of this algorithm to three dimensions. To construct a vertical ADI line relaxation preconditioner for the n  n linear system Ax › b, the equivalent of (17) is applied with uk+1Û2 › uk and u0 › 0, βV )x k+1 › (I + βH )x k

(I

βbØ

(18)

The largest possible acceleration parameter β is chosen so that the above integration scheme remains stable. A slightly more implicit line relaxation scheme can be constructed by a generalization of the point Jacobi iteration (14) to a vertical Jacobi line relaxation scheme k + xik Txik+1 › xi+1

1

i › 1Ù . . . Ù n

+ bÙ

(19)

where the index i represents an entire line of grid points. For the Poisson problem ∆u › r on the 2D unit square, the matrix T has the following block structure 2A

11

6 A21 6 6 6 T›6 6 6 4

3

A12 A22

A23

A32

A33 

A34 





 AnÙn

where Ai+1Ùi › Ai

1Ùi ,

Ai

1Ùi

1

An 1Ùn Ann

7 7 7 7 7 7 7 5

and Aii › Ti ,

2 4 6 1 6 6 6 Ti › 6 6 6 4

1 4 1

3

1 4 

1  

 

1

7 7 7 7 7Ø 7 7 15

4

Generalization of the line Jacobi scheme to the 3D elliptic operator described in Appendix B is straightforward. The vertical ADI scheme (18) splits and weights (with β) the diagonal terms of the discrete operator, whereas the line Jacobi scheme

256 / S.J. Thomas et al.

inverts the diagonal and vertical off-diagonal terms of the operator. We have implemented both vertical ADI and line Jacobi relaxation preconditioners for FGMRES in MC2 and found that the line Jacobi scheme can be more efþcient, resulting in lower wall-clock execution times (see also Thomas et al., 1997). For isotropic grids, the original 3D ADI solver implemented in MC2 is now employed as a preconditioner since the original elliptic operator is contained within the more implicit problem. The solution of tridiagonal linear systems of equations implies global data dependencies, thus a parallel data transposition strategy has been adopted in a distributed-memory implementation of the elliptic solver. A more detailed description of the transpose and the parallel performance of the 3D ADI preconditioner will be presented elsewhere. The computational efþciency of the model using 1D vertical Jacobi and 3D ADI line preconditioners is presented in the next section. 7 Numerical experiments a Mountain Waves in Steep Topography To demonstrate that the new more implicit formulation of the MC2 model is stable and accurate in steep topography, we þrst simulate a 2D hydrostatic nonlinear mountain wave problem similar to the one displayed in Fig. 1 of Skamarock et al. (1997) (see also Pinty et al., 1995). The nonhydrostatic fully compressible model used in this simulation was discretized using a forward-in-time or two-time-level semi-implicit scheme with 2nd order Crowley advection, whereas the MC2 model is based on a centred-in-time or three-time-level leapfrog time discretization scheme. Topography is a bell-shaped mountain,

h(X) ›

h0 Ø 1 + (X Ûa)2

The hydrostatic mountain wave is displayed in Fig. 1, just before wave breaking occurs at the normalized time t › tU0 Ûa › 8. The mountain has a half-width of a › 50 km and a height of h0 › 1 km. In ow velocity is constant at U0 › 20 m s 1 with a constant atmospheric stability of N › 0.02 s 1 . The computational domain is a 101  45 grid, with ∆X › 10 km, ∆Z › 420 m. The time step is ∆t › 50 sec resulting in an advective Courant number of C › 0Ø1. The ow is quasihydrostatic with inverse Froude number Fr 1 › Nh0 ÛU0 › 1Ø00. The steepness of the topography is characterized by ∆G › h0 Û2a › 0Ø01. Skamarock et al. (1997) note that 1D vertical line preconditioners are most effective for the anisotropic grids employed in the hydrostatic mountain wave problem, however, the number of GMRES iterations scales roughly with the time step size, implying a doubling of the number of iterations required in the leapfrog scheme as opposed to the forward-intime scheme. Indeed, our simulation results for this problem would tend to conþrm this observation (see convergence rate in Fig. 2). The solver convergence criteria is currently based on the relative residual norm ε Ú krk k2 Ûkr0 k2 . For the hydrostatic problem above, the solution does not change below ε › 10 6 . Perhaps the best measure of solver convergence is the rms divergence or an estimate thereof, since

A New Adiabatic Kernel for the MC2 Model / 257

Fig. 1 Hydrostatic mountain wave simulation at t › 8. Potential temperature Θ contoured in grey, contour interval of 10.0 K. Vertical velocity contoured with an interval of 0.1 ms 1 . Topography is not displayed.

it is an indication of whether or not the numerical form of the Gauss divergence theorem has been satisþed (see Appendix A). In the above simulation it was found that the rms divergence does not decrease any further beyond ε › 10 5 . To examine the effect of resolution on the preconditioner, a second simulation was run in the nonhydrostatic regime at a higher horizontal resolution of ∆X › 1 km. The mountain half-width is also reduced by a factor of 10 to a › 5 km. The Courant number is maintained at C › 0Ø1, with time step now ∆t › 5 sec. A plot of the nonhydrostatic wave is given in Fig. 3 just prior to breaking. Twice the number of GMRES iterations are now required to reduce the relative residual with the vertical line-Jacobi preconditioner as the grid becomes more isotropic (see

258 / S.J. Thomas et al.

Fig. 2 GMRES(10) convergence rate for hydrostatic mountain wave. Relative residual versus iterations. Vertical Jacobi line relaxation preconditioner (one iteration).

Fig. 4). Nevertheless, the solver is able to handle the large metric terms GIJ resulting from the steeper terrain slope ∆G › h0 Û2a › 0Ø2 and the model remains stable throughout the entire integration. b Parallel Computational Efþciency For a distributed-memory message-passing model of parallel computation, the Ni  Nj  Nk computational grid is partitioned across a PX  PY logical processor mesh. A domain decomposition in the horizontal direction is employed due to the strong vertical coupling in physical parametrization packages and since the number of grid points in the vertical direction is typically one order of magnitude less than in the horizontal. Each processor therefore contains Ni ÛPX  Nj ÛPY  Nk points, resulting in a near optimal surface to volume grid point ratio for semi-Lagrangian advection and application of the elliptic operator in the GMRES solver. For both algorithms the communication overhead associated with boundary data exchanges between subdomains is minimal when compared with computations. The 1D vertical ADI and Jacobi line relaxation preconditioners are also well-suited to a horizontal decomposition, since the only global data dependency is in the vertical

A New Adiabatic Kernel for the MC2 Model / 259

Fig. 3

Nonhydrostatic mountain wave simulation at t › 30. Potential temperature Θ contoured in grey, contour interval of 10.0 K. Vertical velocity contoured with an interval of 0.1 ms 1 . Topography is not displayed.

direction within tridiagonal solvers. However, the 3D ADI preconditioner requires global data in each of the three coordinate directions in order to solve tridiagonal linear systems of equations during each ADI sweep. Thus, the right-hand side b and solution x k must be re-mapped to perform line relaxations in each coordinate direction in turn. Such a re-mapping takes the form of a data transposition algorithm requiring all-to-all communication of O (N 3 ) grid points. Vertical sweeps in the Z direction are performed using the domain decomposition described above. Each processor contains Ni  Nj ÛPY  Nk ÛPX grid points for sweeps in the X direction and Ni ÛPY  Nj  Nk ÛPX points in the Y direction. ADI sweeps progress

260 / S.J. Thomas et al.

Fig. 4 GMRES(10) convergence rate for nonhydrostatic mountain wave. Relative residual versus iterations. Vertical Jacobi line relaxation preconditioner (one iteration).

from left to right and then right to left as indicated below with arrows representing communication steps,

Nj Ni   Nk PX PY

()

Ni 

Nj Nk  PY PX

()

Ni Nk  Nj  Ø PY PX

To compare the computational efþciency of the model using the parallel 1D Jacobi and 3D ADI line relaxation schemes, a quasi-hydrostatic test case was run on up to 16 processors of of an a SGI/Cray SGI/CrayOrigin Origin 2000 2000 computer computer using the MPI message-passing library. The purpose of our test was to determine if the communication overhead associated with the data transposition strategy would adversely affect the computational efþciency as the number of processors is increased. A 120  120  35 grid at 2.5 km horizontal resolution with model lid set at 23 km was employed in a mesoscale forecast over the British Columbia lower mainland. The 30-hour forecast using the MC2 model was run with version 3.5 of the RPN physics package including radiation and stratiform condensation parametrizations. The integration consisted of 1800 time steps of length ∆t › 60 sec using both 1D and 3D preconditioners and the results are summarized in Table 1. Despite the fact that the

A New Adiabatic Kernel for the MC2 Model / 261 TABLE 1. MC2 execution time (hours:minutes) on Cray/SGI Origin 2000 PX  PY

11

22 24 34 44

1D Jacobi 3D ADI

29:09 32:51

7:18 8:13

3:34 3:38

2:37 2:42

2:00 2:19

3D ADI preconditioner results in a much faster convergence rate for the GMRES solver, the overall model execution times are very close. Since the grid aspect ratio ∆XÛ∆Z for this problem is O (10), the 1D line Jacobi scheme is still competitive. However, in both cases the wall-clock execution times decrease linearly and the data transposition overhead appears not to adversely affect performance up to 16 processors. c Validation of One-Way Nesting with Open Boundaries One þnal numerical experiment was designed in order to validate that boundary conditions (open lateral, closed top and bottom) are consistent with the discretized governing equations and that the one-way nesting strategy functions correctly. In particular, it should be possible to solve the implicit wave equation to machine precision if the boundary conditions for the elliptic problem are handled properly and if the discrete form of the Gauss divergence theorem is respected (implying that there are no spurious sources or sinks of mass). The approach is simple and straightforward; the model is run on a large outer computational domain and then run once again using one-way nesting on a smaller interior domain at exactly the same resolution. To make the problem more difþcult (and interesting), the interior domain is speciþed in such a way that lateral boundaries pass over steep topography. Gravity wave absorbing devices near lateral boundaries are not applied. Despite the fact that in real applications the model would normally be run at higher resolution in a cascade, this test is useful for demonstrating that the boundary conditions are correct as is the discrete form of the semi-implicit wave equation. Moreover, the

ow simulation produced by a limited-area model on the interior domain should not degrade when the same resolution is employed (in fact the same grid) and without recourse to any time or space interpolation at the boundaries (see for comparison Warner et al., 1997). Interior domain grid points correspond to those of the larger exterior domain and the wind components on the interior boundary are obtained at each time step from their corresponding computed values at the same grid points on the larger domain. It should then be possible to exactly reproduce the solution computed on the exterior domain within the interior domain. The chosen test problem begins with a hydrostatic hemispheric scale control run at 250-km horizontal resolution over an outer domain containing 30  30  5 grid points with the model lid set at 14 km, implying ∆XÛ∆Z › 90. The computational domain is displayed in Fig. 5 with topography. The integration is for 24 hours with ∆t › 3600 sec or one

262 / S.J. Thomas et al.

Fig. 5

Validation of one-way nesting with open boundaries. Exterior problem domain for hydrostatic simulation. Topography contoured every 200 metres.

hour time steps. A 23  23  5 grid is used as the interior grid with the boundary passing over Greenland. In the control run the solver convergence criterion is set to ε › 10 12 and these results are viewed as being exact for the interior nested run. RMS divergence is plotted in Fig. 6 versus the solver stopping tolerance ε. GMRES(10) iterations with line Jacobi preconditioning are plotted in Fig. 7 versus ε. The log pressure perturbation q0 converges to machine precision when ε is between 10 6 and 10 8 . 8 Conclusions In this paper we have described a new semi-implicit formulation of the MC2 model. All horizontal pressure gradient terms are now treated implicitly and the resulting elliptic problem is therefore highly nonsymmetric. Thus, we chose to employ a

A New Adiabatic Kernel for the MC2 Model / 263

Fig. 6 Validation of one-way nesting with open boundaries. RMS divergence s convergence parameter ε.

1

at 24 hours versus

robust GMRES Krylov iterative solver. In contrast with the original model implementation described in Laprise et al. (1997), we implement open in ow/out ow boundary conditions instead of free-slip or solid wall conditions. In fact, the model can now be run without a Davies type gravity wave absorber and yet still reproduce

ow simulations to machine precision on an interior nested domain at equivalent resolution. Several problems and inconsistencies with lateral as well as top and bottom boundary conditions have now been addressed. In particular, it was previously impossible to impose the correct bottom boundary condition on W since the relation W › G0 1 w + S(G13 U + G23 V ) relating the contravariant and covariant vertical velocities was treated as a prognostic equation and consequently time-averaged along trajectories. This equation is now regarded strictly as a diagnostic relation and immediately leads to the correct boundary condition. With open boundaries, the model must respect the numerical form of the Gauss divergence theorem over the computational domain. The solver convergence criterion must therefore be directly linked to the divergence of the ow þeld. Problems with sources and/or sinks of mass in the original model may now be attributed to incorrect speciþcation of boundary conditions in the solver (i.e., Neumann conditions are required), an inadequate solver convergence criterion and closed boundaries.

264 / S.J. Thomas et al.

Fig. 7 Validation of one-way nesting with open boundaries. GMRES(10) iterations per time step (line Jacobi preconditioner) at 24 hours versus convergence parameter ε.

The primary objective in the construction of a preconditioner for the GMRES iterative solver is to minimize the model execution time. A preconditioner should be inexpensive to apply but at the same time accelerate the solver convergence rate. However, an increased convergence rate and computational efþciency are in direct con ict since the preconditioning matrix M should closely approximate the original coefþcient matrix A. In the new formulation of the MC2 model, relatively simple line relaxation preconditioners have been implemented. Thus, the convergence rate of the solver may not yet be optimal. Both 1D vertical Jacobi and 3D ADI line relaxation preconditioners prove to be scalable on the distributed-shared memory SGI/Cray Origin 2000 supercomputer, thus implying that both anisotropic hydrostatic and isotropic nonhydrostatic ow regimes can be simulated efþciently on current generation parallel computer architectures. Acknowledgements The authors are most grateful to the anonymous referees for their detailed comments and reviews. In particular, we would like to thank Piotr Smolarkiewicz at NCAR for his interest and suggestions which have greatly improved the original manuscript. An A-grid version of the line Jacobi scheme was þrst implemented in the EULAG

A New Adiabatic Kernel for the MC2 Model / 265 model (Smolarkiewicz and Margolin, 1997) during a visit to NCAR/MMM. We would also like to thank Stephane Belair and Peter Bartello for an internal review of our work.

Appendix A: Discretized divergence and horizontal operators The following formulation of the discretized divergence, written in terms of the contravariant wind components, respects the Gauss divergence theorem in discrete form.  i  o 1 n h  YZ  XZ XY Ø div(V) › S G U + G V + G W δ δ δ 0 0 0 X Y Z XYZ G0

In particular, we are guaranteed that Nk X k›1

  XY G0 W δZ

k

 1 2

  XY ∆Z › G0 W

 Nk

1 2

G0

XY



W

1 2

› 0Ø

G0 is assumed to be deþned at the corners of a grid cell with div(V) at the centre and normal wind components at the centre of each face. Introducing w through the relation: h Zi X Y XY G0 W › w + S G1 U¼ X + G2 V¼ Y Ù G1 › G0 G13 Ù G2 › G0 G23 and rearranging we obtain the discrete divergence in terms of its covariant components: div(V) › S[δ~ X2 U + δ~ Y 2 V ] + δ~ Z w having deþned the discrete horizontal derivative operators # " 1 F1 › δ~ X2 U › δX F1 µX + XYZ µZ δZ G1 µX U Ù G0 # " ~δY 2 V › δY F2 µY + 1 µZ δZ G2 µY V Ù F2 › XYZ G0

1 G0

XYZ

1 G0

XYZ

  YZ δX G0   XZ Ø δY G0

It is convenient (see Appendix B.) to separate horizontal and vertical components of the metric factors F1 , F2 , G1 , G2 . Given the following horizontal coefþcients:   X Y 1 1 ∂z Ù g 13 › g0 › δX hÙ g 23 › δY hÙ g¼ 13 › g 13 Ù g¼ 23 › g 23 g0 g0 ∂ζ and having formally deþned         ∂z ζ H ∂z ∂ζ › › H ∂h ∂ζ ∂h

266 / S.J. Thomas et al. we instead write g¼ 13 Ù H

F1 › and obtain

F2 ›

g¼ 23 Ù H



G1 › g0 g¼

13

∂z ∂h



 Ù

G2 › g0 g¼

23

∂z ∂h



    13 ~δX2 › δX + g¼ µX + ∂Z µZ δZ ∂z g¼ 13 µX H ∂ζ ∂h     g¼ 23 µY ∂Z ∂z g¼ 23 µY Ø + µ Z δZ δ~ Y 2 › δY + H ∂ζ ∂h

Similarly, we may rewrite the þrst pair of discrete horizontal operators:     ∂z ~δX › δX + G13 µXZ δZ › δX + g 13 ∂Z µXZ δZ ∂ζ ∂h     ∂z ~δY › δY + G23 µYZ δZ › δY + g 23 ∂Z µYZ δZ ∂ζ ∂h where G0 › g 0



 ∂ζ Ù ∂Z



G13 › g 13

∂Z ∂ζ



∂z ∂h



 Ù

G23 › g 23

∂Z ∂ζ



∂z ∂h



Ø

Appendix B: Discretized horizontal elliptic operator The discrete form of the horizontal operator 2 =~ › δ~ X2 δ~ X + δ~ Y 2 δ~ Y

is expanded, using relations from Appendix A for the two pairs of horizontal difference operators. After grouping terms      Q2 ∂Z ∂z 2 0 2 0 1 2 3 ~ (Q + Q ) + µ Z δZ = q ›= q +Q + H ∂ζ ∂h with 0 2 0 = q › (δX δX + δY δY )q     g¼ 13 g¼ 23 Q 1 › δX + µX Q 13 + δY + µY Q 23 H H

Q 2 › (g¼ 13 µX δX + g¼ 23 µY δY )q0 Q 3 › (g¼ 13 µX Q 13 + g¼ 23 µY Q 23 )

A New Adiabatic Kernel for the MC2 Model / 267

and Q 13 › [G13 µXZ δZ q0 ]Ù

Q 23 › [G23 µYZ δZ q0 ]Ø

In the last term, it will also be convenient to deþne   ∂z 4 (Q 2 + Q 3 )Ø Q › ∂h Lateral boundary conditions for the elliptic operator are fully taken care of by the following conditions. [δ~ X q0 ] 1 jk › 0

(west)Ù

[δ~ Y q0 ]i 1 k › 0

(south)Ù

2

2

[δ~ X q0 ]Ni

1 2

[δ~ Y q0 ]iNj

jk

›0

(east)Ù

›0

(north)Ø

1k 2

These are equivalent to setting: (δX q0 ) 1 jk › [Q 13 ] 1 jk › 0Ù

(δX q0 )Ni

1 2

(δY q0 )i 1 k › [Q 23 ]i 1 k › 0Ù

(δY q0 )iNj

1k 2

2

2

2

2

jk

› [Q 13 ]Ni › [Q 23 ]iNj

1 2

jk

1k 2

› 0Ù › 0Ø

Near the top and bottom, speciþcally for levels 1 and Nk , all terms involving vertical derivatives (Q 13 and Q 23 and the last term) require special treatment. In the expressions for Q 13 and Q 23 the operators (δZ q0 ) are undetermined for levels 1Û2 and Nk+ 1 . However, the hydrostatic approximation is employed for the corresponding 2 explicit terms (to the right-hand sides of the momentum equations), these operators are absent here and we are left with:

1 13 [G µXZ δZ q0 ]i 2 1 › [G13 µXZ δZ q0 ]i 2

Qi13 1 j1 › 2

Qi13 1 jN 2

k

1 j3 2 2

1 2

Ù

jNk

Qi23j 1 2

Ù

11 2

Qi23j

1 23 [G µYZ δZ q0 ]i j 1 3 Ù 2 2 2 1 23 [G µYZ δZ q0 ]i j 1N › 2 2 k

›

1N 2 k

1 2

Ø

For the last term, straight extrapolation is performed to obtain Qi4j 1 › Qi4j1 at the surface and at the top Q 4 vanishes since   ∂z › 0Ø ∂ h Nk + 1 2

Therefore [µZ δZ Q 4 ]i j1 › and

1 [δZ Q 4 ]i j 3 2 2

2

268 / S.J. Thomas et al. [µZ δZ Q 4 ]i jNk ›

1 [µZ Q 4 ]i jNk ∆Z

1 2

Ø

Discretised vertical elliptic operator Appendix C: Discretized Away from vertical boundaries, expanding    g g ~ ~ ~ ~ D 1 D 2 › δZ µ Z δZ µZ cp T  c2 results in   ~ 1 D~ 2 q0 › Ci1jk q0i jk+1 + Ci4jk q0i jk ∆Z 2 D i jk

1

(Ci2jk + Ci3jk )q0i jk

where the coefþcients C 1 , C 2 , C 3 and C 4 are easily obtained. To implement the bottom boundary condition 

 wi+j 1 2

Sg0 µZ

›

∂z ∂h



 (g¼ µX U + g¼ µY V ) 13

23

i j 12

wind components must þrst be estimated at level 1Û2. Straight extrapolation of the right-hand side from level 1 to 1Û2 is applied, as was the case in Appendix B when computing Q 4 , which leads to     ∂z + 13 23 Ø (g¼ µX U + g¼ µY V ) wi j 1 › Sg0 2 ∂h i j1  For time t + ∆t, (U Ù V ) are implicitly given in terms of q0 and QU Ù QV , hence     ∂z  + 4 + 13 23 Ø (g¼ µX QU + g¼ µY QV ) wi j 1 ∆tRT [Sg0 Q ]i j1 › Sg0 2 ∂h i j1 Comparing this equation with the general equation +  ∆tRT  ~ 0 › [Qw ]iÙ j Ù 1 (q ) D w+ 1 2 1 + ∆t2 N2 1 i Ù jÙ 2

gives us the operator (dropping superscripts) ~ 1 q0 ] 1 › [D ij 2

and the right-hand side term [Qw ]i j 1 2

›

(1 + N2 ∆t 2 )[Sg0 Q 4 ]i j1

    ∂z 13 23 Ø (g¼ µX QU + g¼ µY QV ) Sg0 ∂h i j1

A New Adiabatic Kernel for the MC2 Model / 269 Hence, for k › 1, ~ 2 q0 ]i j1 › Ci1j1 q0i j2 ~ 1D ∆Z [D 2

Ci2j1

q0i j1 + (1 + N2 ∆t 2 )



1 ∆Z g + G0 2 c2



 Sg0 Q

Ø

4 i j1

~ 1) The top boundary condition wi jNk + 1 › 0 amounts to setting (D i jNk + 1 › 0, hence 2

2

~ 2 q0 ]i jN › Ci4jN q0i jN ~ 1D ∆Z 2 [D k k k

1

Ci3jNk q0i jNk

References 1971. Frequency þlter for time integrations. Mon. Weather Rev. 100: 487{490. BARTELLO, P. and S.J. THOMAS. 1996. The costeffectiveness of semi-Lagrangian advection. Mon. Weather Rev. 124: 2883{2897. ASSELIN, R.A.

BENOIT, R.; M. DESGAGNE, P. PELLERIN, S. PELLERIN, Y. CHARTIER and S. DESJARDINS. 1997. The Canadian MC2: A semi-Lagrangian, semi-implicit wide-band atmospheric model suited for þnescale process studies and simulation. Mon. Weather Rev. 125: 2382{2415. CLARK, T.L. 1977. A small-scale dynamic model using a terrain-following coordinate transformation. J. Comp. Phys. 24: 186{214. CULLEN, M.J.P. 1990. A test of a semi-implicit integration technique for a fully compressible nonhydrostatic model. Q. J. R. Meteorol. Soc. 116: 1253{1258. DAVIES, H.C. 1976. A lateral boundary formulation for multi-level prediction models. Q. J. R. Meteorol. Soc. 102: 405{418. DURRAN, D.R. and J.B. KLEMP. 1983. A compressible model for the simulation of moist mountain waves. Mon. Weather Rev. 111: 2341{2361. EISENSTAT, S.C.; H.C. ELMAN and M.H. SCHULTZ. 1983. Variational iterative methods for nonsymmetric systems of linear equations. SIAM J. Numer. Anal. 2: 345{357. ELMAN, H.C. 1982. Iterative Methods for Large, Sparse, Nonsymmetric Systems of Linear Equations. Ph.D Thesis, Dept. of Computer Science, Research Report 229, Yale University, 180 pp. |||. 1984. Iterative methods for non selfadjoint elliptic problems. In: Elliptic Problem Solvers II, G. Birkhoff and A. Schoenstadt (Eds.), Academic-Press, New York, pp. 271{ 283.

and R.C. SOMMERVILLE. 1975. On the use of a coordinate transformation for the solution of the Navier-Stokes equations. J. Comp. Phys. 17: 209{228. GOLDING, B.W. 1992. An efþcient nonhydrostatic forecast model. Meteorol. Atmos. Phys. 50: 89{ 103.  HEREIL, P. and R. LAPRISE. 1996. Sensitivity of internal gravity waves to the time step of a semiimplicit semi-Lagrangian nonhydrostatic model. Mon. Weather Rev. 124: 972{999. IKAWA, M. 1988. Comparison of some schemes for nonhydrostatic models with orography. J. Meteorol. Soc. Jpn. 66: 753{776. KLEMP, J.B. and R. WILHELMSON. 1978. The simulation of three-dimensional convective storm dynamics. J. Atmos. Sci. 35: 1070{1096.  LAPRISE, R.; D. CAYA, G. BERGERON and M. GIGUERE. 1997. The formulation of the Andre Robert MC2 (Mesoscale Compressible Community) model. In: Numerical Methods in Atmospheric and Oceanic Modelling ATMOSPHERE-OCEAN Special Vol., pp. 195{220. MALEVSKY, A.V. 1996. Spline-characteristic method for simulation of convective turbulence. J. Comp. Phys. 123: 466{475. ||| and S.J. THOMAS. 1997. Parallel algorithms for semi-Lagrangian advection. Int. J. Numer. Methods Fluids, 24: 1{19.  OLIGER, J. and A. SUNDSTROM. 1978. Theoretical and practical aspects of some initial boundary value problems in uid dynamics. SIAM J. Appl. Math. 35: 419{446. PEACEMAN, D. and H. RACHFORD. 1955. The numerical solution of elliptic and parabolic equations. J. SIAM, 3: 28{41. PINTY, J.P.; R. BENOIT, E. RICHARD and R. LAPRISE. GAL-CHEN, T.

270 / S.J. Thomas et al. 1995. Simple tests of a semi-implicit semiLagrangian model on 2D mountain wave problems. Mon. Weather Rev. 123: 3042{3058. RITCHIE, H. and M. TANGUAY. 1996. A comparison of spatially averaged Eulerian and semiLagrangian treatments of mountains. Mon. Weather Rev. 124: 167{181. RITCHIE, TEMPERTON, SIMMONS, HORTAL, C. C. TEMPERTON, A. A. SIMMONS, M. M. HORTAL, T. |||; H., M.HAMRUD. HAMRUD. 1995. ImpleT. DAVIES, DENTand andM. DAVIES, D.D. DENT mentation of the semi-Lagrangian method in a high-resolution version of the ECMWF forecast model. Mon. Weather Rev. 123: 489{514. ROBERT, A. 1981. A stable numerical integration scheme for the primitive meteorological equations. ATMOSPHERE-OCEAN ,19: 19:35{46. 35{46. |||. 1993. Bubble convection experiments with a semi-implicit formulation of the Euler equations. J. Atmos. Sci. 50: 1865{1873. SAAD, Y. 1993. A exible inner-outer preconditioned GMRES algorithm. SIAM J. Sci. Stat. Comput. 14: 461{469. ||| and M. SCHULTZ. 1986. GMRES: A generalized minimal residual algorithm for solving nonsymmetric linear systems. SIAM J. Sci. Stat. Comput. 7: 856{869. SKAMAROCK, W.C.; P.K. SMOLARKIEWICZ and J.B. KLEMP. 1997. Preconditioned conjugate-residual solvers for Helmholtz equations in nonhydrostatic models. Mon. Weather Rev. 125: 587{ 599. SMOLARKIEWICZ, P.K. and L.G. MARGOLIN. 1994. Variational solver for elliptic problems in atmospheric ows. Appl. Math. Comp. Sci. 4: 527{ 551.

||| and |||. 1997 On forward-in-time differencing for uids: An Eulerian/semiLagrangian nonhydrostatic model for stratiþed

ows. In: Numerical Methods in Atmospheric and Oceanic Modelling, ATMOSPHERE-OCEAN Special Vol., pp. 127{152.  and L.G. V. GRUBISI C and L.G. MARGOLIN. |||; 1997MAROn SMOLARKIEWICZ, P.K., V. GRUBISI C forward-in-time for uids: StopGOLIN. 1997 Ondifferencing forward-in-time differencing ping criteriaStopping for iterative solutions of anelastic for uids: criteria for iterative sopressure Weather Rev. Mon. 125: lutions ofequations. anelastic Mon. pressure equations. 647{654. Weather Rev. 125: 647{654. TANGUAY, M.; A. ROBERT and R. LAPRISE. 1990 A semi-implicit semi-Lagrangian fully compressible regional forecast model. Mon. Weather Rev. 118: 1970{1980. |||; E. YAKIMIW, H. RITCHIE and A. ROBERT. 1992. Advantages of spatial averaging in semiimplicit semi-Lagrangian schemes. Mon. Weather Rev. 120: 113{123. TAPP, M.C. and P.W. WHITE. 1976 A nonhydrostatic mesoscale model. Q. J. R. Meteorol. Soc. 102: 277{296.  R. BENOIT, THOMAS, S.J.; A.V. MALEVSKY, M. DESGAGNE,

and M. VALIN. 1997. Massively parallel implementation of the mesoscale compressible community model. Parallel Comput. 23: 2143{2160. WARNER, T.T.; R.A. PETERSON and R.E. TREADON. 1997. A tutorial on lateral boundary conditions as a basic and potentially serious limitation to regional numerical weather prediction Bull. Am. Meteorol. Soc. 78: 2599{2616. P. PELLERIN