A probabilistic analysis of wind gusts using extreme value statistics

Open Access Article. Meteorologische Zeitschrift, Vol. 18, No. 6, 615-629 (December 2009). © by Gebrüder Borntraeger 2009

Petra Friederichs1∗, Martin Göber2, Sabrina Bentzien1, Anne Lenz1 and Rebekka Krampitz1

1 Meteorological Institute, University of Bonn, Germany
2 Deutscher Wetterdienst, Offenbach, Germany

(Manuscript submitted November 14, 2008; in revised form August 12, 2009; accepted August 27, 2009)

Abstract The spatial variability of wind gusts is probably as large as that of precipitation, but the observational weather station network is much less dense. The lack of an area-wide observational analysis hampers the forecast verification of wind gust warnings. This article develops and compares several approaches to derive a probabilistic analysis of wind gusts for Germany. Such an analysis provides a probability that a wind gust exceeds a certain warning level. To that end we have 5 years of observations of hourly wind maxima at about 140 weather stations of the German weather service at our disposal. The approaches are based on linear statistical modeling using generalized linear models, extreme value theory and quantile regression. Warning level exceedance probabilities are estimated in response to predictor variables such as the observed mean wind or the operational analysis of the wind velocity at a height of 10 m above ground provided by the European Centre for Medium Range Weather Forecasts (ECMWF). The study shows that approaches that apply to the differences between the recorded wind gust and the mean wind perform better in terms of the Brier skill score (which measures the quality of a probability forecast) than those using the gust factor or the wind gusts only. The study points to the benefit from using extreme value theory as the most appropriate and theoretically consistent statistical model. The most informative predictors are the observed mean wind, but also the observed gust velocities recorded at the neighboring stations. Out of the predictors used from the ECMWF analysis, the wind velocity at 10 m above ground is the most informative predictor, whereas the wind shear and the vertical velocity provide no additional skill. For illustration the results for January 2007 and during the winter storm Kyrill are shown. 
Zusammenfassung The spatial variability of wind gusts is presumably about as large as that of precipitation, but the corresponding observational network for wind gusts is considerably sparser. The lack of area-wide observations hampers the verification of gust warnings. This article therefore develops and compares methods for producing probabilistic analyses of wind gusts in Germany. Such an analysis determines the probabilities that gusts above a warning level occur. For this purpose, 5 years of hourly observations of the mean wind and of the wind maxima at about 140 weather stations of the Deutscher Wetterdienst are available. Methodologically, the approaches are based on statistical modeling using generalized linear models, extreme value statistics and quantile regression. Probabilities of exceeding gust warning levels are estimated from various predictors. These include the observed mean wind and the operational analyses of the wind velocity at 10 m above ground from the European Centre for Medium Range Weather Forecasts (ECMWF). The study shows that approaches based on the differences between mean wind and wind maximum as the model variable yield better probability estimates than approaches that act only on the wind maxima or the gust factor. The advantage of extreme value statistics as the best adapted and theoretically consistent method of statistical modeling is clearly demonstrated. Besides the wind velocity at 10 m height from the ECMWF analyses, the gusts observed at the surrounding stations are also informative predictors, whereas the wind shear and the vertical velocity yield no additional improvement. For illustration, the results for January 2007 and during the winter storm Kyrill are shown.

∗ Corresponding author: Petra Friederichs, Meteorological Institute, University of Bonn, Auf dem Hügel 20, 53121 Bonn, Germany, e-mail: [email protected]

DOI 10.1127/0941-2948/2009/0413

1 Introduction

Gust warnings and thunderstorm warnings including gusts are by far the most frequent type of weather warnings issued by the German weather service Deutscher Wetterdienst (DWD). They are released one order of magnitude more often than all other kinds of warnings, e.g. intense or prolonged rain, snow, fog etc. According to Munich Re Group (2005), storms were the most frequent, deadliest and most costly natural disasters in Germany. On the other hand, gusts are one of the most poorly observed atmospheric variables, given their small spatial and temporal scales, which are similar to those of precipitation. In Germany, hardly one gust measurement is taken every 1000 km². However, there is about one


station per 100 km² measuring daily precipitation, one per 300 km² for hourly accumulations, and complete rain radar coverage. Furthermore, no remote or in-situ observing system is currently envisioned that could provide sufficiently dense coverage of gust observations in the foreseeable future.

Reliable forecasts of wind gusts offer the potential to mitigate the destruction and human losses caused by gusts and to better plan for the period of disruption and the subsequent clean-up operations. Key users of gust warnings are emergency managers, air and rail traffic, energy companies and the general public. Thus it is of great importance to improve the quality of gust warnings. An integral part of the improvement process is the verification of the warnings, which have been issued on a county scale since 2003. Traditionally, such categorical forecasts have been verified "deterministically", i.e. a strict threshold is applied to the observations and warnings. For instance, if a warning of gusts above 24 m/s has been issued and the one or at most two stations in the county observed only 23 m/s, then the event is classified as a false alarm, although the probability is large that a wind gust above 24 m/s has occurred somewhere in the county. It is the aim of this study to estimate this probability for each desired point given all available information, i.e. to derive a probabilistic analysis of wind gusts over Germany. This would enable a probabilistic verification of wind gust warnings that does justice to the stochastic nature of wind gusts (as one example of extremes in general).

Gray (2003) derives a probabilistic forecast algorithm for convective gusts. He statistically models wind gusts using a Gaussian distribution and relates mean and variance to mean wind speed and estimates of wind gust maxima. The estimates of the wind gust maxima are derived by the algorithm of Nakamura et al. (1996), where the gust maximum is the sum of the advection of horizontal momentum and the convection of potential vorticity by the downdraft.

As wind gusts are measured as the maximum wind speed observed over a fixed period, their distribution follows, at least in an asymptotic limit, the generalized extreme value distribution (Coles, 2001). Furthermore, wind gust warnings are issued for different classes of wind gust speed, i.e. for the exceedance of a certain threshold. Such exceedances can be described statistically, at least in an asymptotic limit, using a generalized Pareto distribution. For risk analysis, such as performed by Heneka et al. (2006) or Klawa and Ulbrich (2003), it is desired to estimate probabilities of events that have never or only very seldom occurred in the historical record. Here, extreme value theory provides theoretically consistent probability distributions that aim at modeling the behavior in the extremes. For example, Perrin et al. (2006) show that the assumption that wind speeds are approximately Weibull distributed leads to incorrect estimates in the tails of the distributions. It is thus obvious that a wind


gust analysis should consider extreme value statistics.

The approaches used in this study are based on linear statistical modeling using extreme value theory (EVT; Gumbel (1958)), but also using generalized linear models (GLMs; McCullagh and Nelder (1999)) and quantile regression (Koenker and Bassett, 1978). Logistic regression, which is a GLM for binomial response variables, is widely used in model output statistics (MOS) and performs best among the methods investigated in Applequist et al. (2002) in the context of quantitative precipitation forecasting. The approaches presented here are also applicable within a probabilistic MOS. Likewise, empirical downscaling methods that use very similar approaches are applied in a climatological context (Pryor et al., 2005). Furthermore, a probabilistic analysis as presented here offers the opportunity of an automatic control of the data quality by rating the likelihood of the observations given the estimated probabilities (Mathes et al., 2008). Those probabilities are estimated from the large-scale atmospheric flow and the observed values at the neighboring stations using the GLMs.

Different kinds of parameters are thought to determine the probability of a wind gust at a certain station. Firstly, meteorological parameters of the current weather situation determine the probability of a wind gust at an individual time. Such parameters might be mean wind, pressure, pressure tendency and curvature, or vertical stability. They are either available area-wide (from an analysis) or only at the station location. The interplay of those parameters mimics the physical processes thought to be responsible for the generation of gusts (Brasseur, 2001). Secondly, stationary parameters of the geographical environment, such as surface roughness or topography, have to be considered (Verkaik, 2000). They determine the climatology of wind at a particular location (Gerth and Christoffer, 1994).

Here, we focus on the weather-dependent gust probabilities. The data are introduced in section 2. Seven approaches that are used to estimate the probability that a wind gust exceeds a warning level are presented in section 3, which also discusses the estimation and verification procedures (section 3.3). The results are discussed in section 4, where first the different approaches are compared and the performance of various predictors is assessed (section 4.2). Section 4.3 discusses the character of wind gusts, and an illustration of the probabilistic analysis is shown in section 4.4. Some guidance towards an area-wide analysis is offered in section 4.5. Conclusions and an outlook are given in section 5. Finally, some background in EVT is presented in an appendix.

Table 1: Wind gust warnings of the German Weather Service (DWD). Wind gusts at 10 m height over flat, open ground.

type of gust        threshold in m/s
near gale           > 14
gale                18–24
storm               25–28
violent storm       29–32
hurricane force     > 33

Table 2: Response variables and methods applied to estimate the probability P(fx > wl) that wind gusts exceed a warning level. glmLR denotes the logistic regression, glmG a GLM with gamma distributed errors, glmLN a log-normal GLM.

Response variable:  fx           fx-ff          gf
Method:             (1) glmLR    (2) glmG       (3) glmLN
                    (4) GEVfx    (5) GEVfx-ff   (6) GEVgf
                    (7) POTfx

2 Data

The wind gust analysis is based on a data set from the observing network of the DWD, where we use the data from 139 stations measuring hourly wind maxima (fx) and the mean wind (ff) over the 10 minutes preceding the observation time. The resolution of fx and ff is 1 m/s. In the linear model context, such an artificial discretization leads to discrete estimates. Better results are obtained when the artificial discretization is removed by adding random noise to the measurements. Note that this randomization is only applied to ff. The observations cover the period from April 1, 2003 to December 31, 2007. The DWD performs a basic quality control of the incoming data. As additional and area-wide information about the atmospheric circulation we use the 10 m wind, the wind velocity at 925 hPa, the relative vorticity and vertical velocity at 850 hPa, the convectively available potential energy (CAPE) and the available potential energy of a downdraft (DCAPE) from the ECMWF medium range forecast system (European Centre for Medium Range Weather Forecasts, 2007) over Germany. The ECMWF forecast system provides an analysis on a 1° × 1° grid. In order to account for the seasonal cycle, we separated the data set into a cold (winter) season, November to March (NDJFM), and a warm (summer) season, May to September (MJJAS). The DWD issues wind gust warnings for 5 warning levels (wl), as displayed in Tab. 1.
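The randomization of the discretized mean wind and the split into seasons can be sketched as follows. This is only an illustration under assumptions that the text does not state: the noise is taken to be uniform and one resolution step (1 m/s) wide, and the function names `dequantize` and `season` are hypothetical.

```python
import random

def dequantize(ff_values, resolution=1.0, seed=0):
    """Add uniform noise of one resolution step to discretized mean winds.

    Assumption: the noise is uniform within half a step on either side;
    the text only says that random noise is added to ff.
    """
    rng = random.Random(seed)
    half = resolution / 2.0
    return [ff + rng.uniform(-half, half) for ff in ff_values]

def season(month):
    """Map a calendar month to the two seasons used in the text."""
    if month in (11, 12, 1, 2, 3):
        return "NDJFM"   # cold (winter) season
    if month in (5, 6, 7, 8, 9):
        return "MJJAS"   # warm (summer) season
    return None          # April and October belong to neither season

ff_obs = [3.0, 7.0, 7.0, 12.0]   # mean wind reported in whole m/s
ff_cont = dequantize(ff_obs)     # tied values (7.0, 7.0) become distinct
```

Note that a reported value of 0 m/s could become slightly negative under this scheme; how such cases were treated is not described in the text.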

3 Methods

Here, we explore seven approaches to estimate the probability that a wind gust exceeds a warning level (wl) given the information at hand. Five methods are applied to three response variables, which are the wind gusts (fx), the differences between the observed wind gusts and the mean wind (fx-ff), and the gust factor (gf) as used for instance by Weggel (1999) and Jungo et al. (2002). The gust factor is defined as the normalized maximum wind gust speed,

$$ \mathrm{gf} = \frac{\mathrm{fx}}{\mathrm{ff}} - 1. \tag{3.1} $$
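As a small illustration of the three response variables, the sketch below computes fx, fx-ff and gf for one observation and expresses the event fx > wl on the scale of each response. The function names are hypothetical, the numbers are made up, and ff > 0 is assumed so that the gust factor is defined.

```python
def responses(fx, ff):
    """Return the three response variables for one observation (ff > 0)."""
    return {"fx": fx, "fx-ff": fx - ff, "gf": fx / ff - 1.0}

def threshold_on_response_scale(wl, ff):
    """Express the event fx > wl on the scale of each response variable."""
    return {"fx": wl, "fx-ff": wl - ff, "gf": wl / ff - 1.0}

obs = responses(fx=21.0, ff=12.0)                      # illustrative values
thr = threshold_on_response_scale(wl=18.0, ff=12.0)

# The same event is detected whichever response variable is used.
exceeded = {name: obs[name] > thr[name] for name in obs}
```

The three comparisons are equivalent because subtracting ff, or dividing by a positive ff and subtracting 1, preserves the ordering of fx and wl.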

The seven approaches are as follows (Tab. 2).

(1) A semi-parametric approach uses logistic regression to model the conditional probability of the binary event that a wl is exceeded (glmLR).
(2) A generalized linear model estimates the conditional gamma distribution of the differences fx-ff, which should always be positive (glmG).
(3) Another positive response variable is gf. Weggel (1999) assumes that gf follows a log-normal distribution, so a GLM with log-normal error terms is applied to model gf (glmLN).

These three approaches employ GLMs to derive the probability of exceeding a warning level. The next four approaches use extreme value theory to derive exceedance probabilities.

(4) As the wind gusts are block (1 h) maxima, they are assumed to follow a non-stationary generalized extreme value distribution (GEVfx).
(5) The same assumption is applied to the differences fx-ff (GEVfx-ff).
(6) It is likewise applied to the gust factor gf (GEVgf).
(7) A last approach models threshold excesses of fx using extreme value theory in the sense of a non-stationary peak-over-threshold approach (POTfx).

All approaches are used to derive the conditional exceedance probability P(fx > wl) given a predictor variable. The performance of the seven approaches is compared using the Brier skill score (Brier, 1950; Jolliffe and Stephenson, 2003). The seven approaches are now described in more detail.

3.1 Generalized linear models

GLMs are generalized regression models. The generalization with respect to standard linear regression models has two aspects. The first is a distributional assumption: the response variable $Y_i$ is assumed to belong to the exponential family (Fahrmeir and Tutz, 2001) with natural parameter $\theta_i$ and constant scale parameter $\Phi$, where $i$ is a time or spatial index. The second is a structural assumption: the conditional expectation $\mu_i = E[Y_i \mid X_i]$ is modeled through a function $\mu_i = h(\eta_i)$ of the linear predictor $\eta_i = Z(X_i)^T \beta$. Here $X_i$ is the predictor variable and $Z(X_i)$ is a design vector, which is a function of the (multivariate) predictor variable $X$, e.g. $Z(X_i) = (1, X_{1i}, \ldots, X_{mi})^T$, where $m$ is the dimension of $X$. The inverse function $h^{-1}(\mu_i) = \eta_i$ is called the link function. If $h^{-1}$ is the so-called canonical link function, then the linear predictor is the natural parameter of the exponential family distribution and $\theta_i = h^{-1}(\mu_i)$. A GLM is then formulated as

$$ E[Y_i \mid X_i] = h(\eta_i) = h(Z(X_i)^T \beta). \tag{3.2} $$


Together with the distributional assumption, the probabilistic behavior of $Y_i$ is fully determined. For further details the reader is referred to Fahrmeir and Tutz (2001). This study employs three GLMs: the logistic regression (glmLR), a GLM assuming a gamma distribution (glmG), and a GLM assuming a log-normal distribution for the error terms (glmLN).

The glmLR models the conditional probability $\pi_i = \mathrm{Prob}(\mathrm{fx}_i > \mathrm{wl})$ of a wind gust exceeding a warning level. Here $\mu_i = \pi_i$ is the expectation value of a Bernoulli distributed random variable (i.e. the binary response variable: warning level exceeded). The canonical link function of the Bernoulli distribution, which is used as link function in this context, is the logit function, so that

$$ \pi_i = h(\eta_i) = \frac{\exp \eta_i}{1 + \exp \eta_i}. \tag{3.3} $$

In this respect, the glmLR estimates $1 - F(\mathrm{wl})$ at an a priori defined wl, where $F(\mathrm{wl})$ is the conditional distribution function of fx at wl. As the model is parametric, but no parametric distribution is assumed for the wind gust process, the approach is denoted as a semi-parametric approach to derive the probability

$$ P(\mathrm{fx} > \mathrm{wl} \mid X_i) = \pi_i. \tag{3.4} $$
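Equations (3.3) and (3.4) can be sketched in a few lines. The coefficient values below are purely illustrative and not fitted to any data; the function names are hypothetical, and the estimation of the coefficients is not shown.

```python
import math

def logistic(eta):
    """Eq. (3.3): map the linear predictor eta to a probability.

    The two branches avoid overflow in exp() for large |eta|.
    """
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    e = math.exp(eta)
    return e / (1.0 + e)

def exceedance_probability(x, beta):
    """Eq. (3.4): P(fx > wl | X) for the design vector Z(X) = (1, x_1, ..., x_m)."""
    z = [1.0] + list(x)
    eta = sum(b * zj for b, zj in zip(beta, z))
    return logistic(eta)

beta_hypothetical = [-6.0, 0.5]   # illustrative values, not fitted coefficients
p = exceedance_probability([12.0], beta_hypothetical)   # eta = 0, so p = 0.5
```

A separate coefficient vector would be needed for each warning level, because the binary response changes with wl.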

Note that the glmLR has to be estimated for each warning level wl separately.

The gamma GLM (glmG) provides a parametric model for the differences fx-ff. Those differences are assumed to follow a gamma distribution with the conditional expectation value

$$ \mu_i = h(\eta_i) = \frac{1}{\eta_i}. \tag{3.5} $$

The canonical link function of the gamma distribution is the inverse function. The exceedance probabilities are estimated from the gamma distribution function $F_\Gamma$ evaluated at $\mathrm{wl} - \mathrm{ff}_i$, with expectation value $\mu_i$ and scale parameter $\Phi$, as

$$ P(\mathrm{fx} > \mathrm{wl} \mid X_i) = 1 - F_\Gamma(\mathrm{wl} - \mathrm{ff}_i \mid \mu_i; \Phi). \tag{3.6} $$

In cases where ff is larger than wl, the exceedance probability is set to one, since the wind gust then exceeds the warning level with certainty.

The glmLN approach assumes that the response variable gf follows a conditional log-normal distribution. It can easily be implemented as a GLM by modeling log(gf) using the Gaussian GLM, whose canonical link function is the identity. Since by (3.1) the event fx > wl is equivalent to $\mathrm{gf} > \mathrm{wl}/\mathrm{ff}_i - 1$, the exceedance probabilities follow from the normal distribution function $F_N$ evaluated at $\log(\mathrm{wl}/\mathrm{ff}_i - 1)$, with expectation value $\mu'_i = Z(X_i)^T \beta$ and standard deviation $\sigma = \sqrt{\Phi}$, as

$$ P(\mathrm{fx} > \mathrm{wl} \mid X_i) = 1 - F_N(\log(\mathrm{wl}/\mathrm{ff}_i - 1) \mid \mu'_i; \sigma). \tag{3.7} $$
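The log-normal exceedance probability can be sketched with the Python standard library. With the gust factor defined as in (3.1), the event fx > wl corresponds to gf > wl/ff − 1, and this is the threshold used below. The parameter values are illustrative only, the function name is hypothetical, and setting the probability to one when ff is not below wl is an assumption carried over from the gamma case.

```python
import math
from statistics import NormalDist

def exceedance_lognormal_gf(wl, ff, mu_log, sigma):
    """P(fx > wl) when log(gf) ~ N(mu_log, sigma^2) and gf = fx/ff - 1."""
    if ff >= wl:
        # Assumption: as for glmG, the warning level counts as exceeded
        # once the mean wind itself reaches it.
        return 1.0
    gf_threshold = wl / ff - 1.0      # event fx > wl on the gf scale
    return 1.0 - NormalDist(mu_log, sigma).cdf(math.log(gf_threshold))

# Illustrative values: the threshold gf = 0.5 equals the median of gf,
# so the exceedance probability is one half.
p = exceedance_lognormal_gf(wl=18.0, ff=12.0, mu_log=math.log(0.5), sigma=0.4)
```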

Figure 1: BSS of probability estimates for each station during the winter. The stations are ordered such that the BSS increases with station number. The training was performed for 12UTC, the estimation and verification for 11-13UTC using cross-validation. The 95 % uncertainty interval of the BSS for GEVfx-ff as estimated by the bootstrap method is indicated by the shaded area. The warning levels are a) 14 m/s, b) 18 m/s, and c) 25 m/s. For the abbreviations of the methods see Tab. 2.

Figure 2: Same as Fig. 1 but during the summer. The warning levels are a) 14 m/s and b) 18 m/s.

3.2 Extreme value theory

As the gusts fx are defined as maxima over a fixed time period (block), here of 1 hour, it is assumed that EVT provides an appropriate distribution for wind gusts. EVT proves under very general conditions that maxima of large samples are asymptotically distributed following the generalized extreme value distribution (GEV). Likewise, excesses over large thresholds asymptotically follow a generalized Pareto distribution (GPD). For a very short introduction to EVT see the appendix. For further insight the reader is referred to Coles (2001), who gives an introduction to the subject. Palutikof et al. (1999) discuss methods to calculate extreme wind speeds.

We explore two ways of statistically modeling wind gusts using extreme value theory. The first (GEVfx) assumes fx to be in the asymptotic limit of the GEV. GEVfx-ff and GEVgf apply to fx-ff and gf, respectively. The GEV has three parameters, denoted as location parameter $\alpha$, scale parameter $\sigma$ and shape parameter $\xi$, where $\xi$ determines the character of the extremes. These parameters are non-stationary, as the distributions of the wind gusts fx, and likewise of fx-ff and gf, vary strongly over time. In order to account for this non-stationarity, the GEV parameters are assumed to depend on a predictor variable $X$. This dependency is modeled by a linear regression such that

$$ \alpha_i = Z(X_i)^T \gamma, \qquad \sigma_i = Z(X_i)^T \rho, \qquad \xi_i = \xi, \tag{3.8} $$

where $Z$ is again the design vector function. The shape parameter is assumed to be stationary. This assumption is suggested by Walshaw (1994) and also seems reasonable in our case, as discussed later. The hyperparameter vectors $\gamma$ and $\rho$ are estimated using the maximum likelihood method, and the exceedance probability is estimated from

$$ P(\mathrm{fx} > \mathrm{wl} \mid X_i) = 1 - F_{GEV}(\mathrm{wl} \mid \alpha_i, \sigma_i; \xi). \tag{3.9} $$

Analogously, the exceedance probabilities using fx-ff or gf are

$$ P(\mathrm{fx} > \mathrm{wl} \mid X_i) = 1 - F_{GEV}(\mathrm{wl} - \mathrm{ff}_i \mid \alpha'_i, \sigma'_i; \xi') \tag{3.10} $$

or

$$ P(\mathrm{fx} > \mathrm{wl} \mid X_i) = 1 - F_{GEV}(\mathrm{wl}/\mathrm{ff}_i - 1 \mid \alpha''_i, \sigma''_i; \xi''). \tag{3.11} $$

Note that the model parameter estimates are different for the three response variables.

The other EVT approach is a non-stationary peak-over-threshold approach (POTfx). The POTfx models excesses over a threshold $u$ using the GPD, which has two parameters, a scale parameter $\sigma_u$ and the shape parameter $\xi$. Again $\xi$ is held constant and $\sigma_u$ is non-stationary with

$$ \sigma_{u_i} = Z(X_i)^T \rho_u. \tag{3.12} $$

The threshold is not stationary either, as the extremal limit for wind gusts also varies strongly in time. Following Friederichs (submitted), the threshold is defined as the quantile function at an a priori defined probability $\tau$. The $\tau$-quantile now depends linearly on the predictor variable with

$$ u_i = Q_{\mathrm{fx}}(\tau) = Z(X_i)^T \beta_\tau \tag{3.13} $$

and $P(\mathrm{fx} \le u_i \mid X_i) = \tau$. The coefficients $\beta_\tau$ of the conditional quantile model are estimated using quantile regression (Koenker, 2005; Friederichs and Hense, 2007, 2008). So the POTfx approach models the distribution of fx at $\mathrm{wl} \ge u_i$ as

$$ \begin{aligned} P(\mathrm{fx} \le \mathrm{wl} \mid X_i) &= P(\mathrm{fx} \le \mathrm{wl} \mid \mathrm{fx} > u_i, X_i)\, P(\mathrm{fx} > u_i \mid X_i) + P(\mathrm{fx} \le u_i \mid X_i) \\ &= F_{GPD}(\mathrm{wl} - u_i \mid \sigma_{u_i}; \xi)\,(1 - \tau) + \tau. \end{aligned} \tag{3.14} $$

In the case of $\mathrm{wl} < u_i$, (3.14) is not defined and the value of glmLR is used instead. For more details on the POTfx approach the reader is referred to Friederichs (submitted).

3.3 Estimation and verification

In order to derive the 'best' wind gust analysis, we have to identify the 'best' statistical model. Here, the quality of a model is assessed by the Brier skill score (BSS).

Figure 3: Same as Fig. 1, but now the conditional probabilities are estimated and verified for 1–23 UTC. The warning levels are a) 14 m/s, b) 18 m/s, and c) 25 m/s.

Figure 4: Same as Fig. 3 but during the summer. The warning levels are a) 14 m/s and b) 18 m/s.

The Brier score was introduced by Brier (1950) and is a quadratic error measure defined as $E[(\hat{p} - Y)^2]$, where $\hat{p}$ is the forecast probability and $Y \in \{0, 1\}$ is a binary response variable. The Brier score is a proper scoring rule (Gneiting and Raftery, 2007), i.e. it is minimized for a perfect forecast where $\hat{p} = 1$ if $Y = 1$ and $\hat{p} = 0$ if $Y = 0$. It is estimated by a summation over the verification sample with

$$ BS = \sum_i \left[ I(\mathrm{fx}_i > \mathrm{wl}) - P(\mathrm{fx} > \mathrm{wl}) \right]^2, \tag{3.15} $$

where the indicator function $I(A)$ is 1 if the argument $A$ is true, and zero if $A$ is false. Sampling uncertainty and confidence intervals of the Brier score are derived in Bradley et al. (2008). In order to account for serial dependencies, we use the bootstrap method to estimate sampling uncertainty (see below). The Brier score $BS_{pre}$ uses the statistical model prediction $P(\mathrm{fx} > \mathrm{wl} \mid X_i)$ and $BS_{ref}$ uses a reference model. The reference here is chosen as the marginal distribution of the wind gusts at the respective station, or in other words the station climatology. A skill score is then defined as

$$ BSS = 1 - \frac{BS_{pre}}{BS_{ref}}. \tag{3.16} $$
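Equations (3.15) and (3.16) translate directly into code. The sketch below uses made-up numbers and hypothetical function names; as a simplifying assumption, the climatological reference probability is taken as the event frequency of the verification sample itself, which may differ from how the station climatology was obtained in the study.

```python
def brier_score(probabilities, fx_obs, wl):
    """Eq. (3.15): summed squared difference between event and probability."""
    return sum((float(fx > wl) - p) ** 2
               for p, fx in zip(probabilities, fx_obs))

def brier_skill_score(p_model, fx_obs, wl):
    """Eq. (3.16) with the sample climatology as reference forecast.

    Undefined (division by zero) if the event never or always occurs.
    """
    events = [float(fx > wl) for fx in fx_obs]
    p_clim = sum(events) / len(events)          # assumed reference forecast
    bs_pre = brier_score(p_model, fx_obs, wl)
    bs_ref = brier_score([p_clim] * len(fx_obs), fx_obs, wl)
    return 1.0 - bs_pre / bs_ref

fx_obs = [10.0, 16.0, 12.0, 20.0]     # illustrative gust observations in m/s
p_model = [0.1, 0.8, 0.2, 0.9]        # illustrative predicted probabilities
bss = brier_skill_score(p_model, fx_obs, wl=14.0)
```

Because both scores are sums over the same sample, the missing normalization by the sample size cancels in the ratio.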

The value of the BSS ranges between −∞ and 1, where 1 corresponds to a perfect forecast. It is given in % and represents the relative gain of a probability estimate over a reference forecast. A useful decomposition into reliability, resolution and uncertainty of the observations is given in Murphy (1973).

The modeling approaches separate into the training of the statistical model, where the model parameters are estimated, and the prediction, where the model is applied to the predictor variables and a probability P(fx > wl|Xi) is derived. The training and prediction should be made on independent data. We thus use cross-validation


where we take out one year of data during the training of the statistical model and then derive the predicted probabilities for this withheld year. By successively taking out each year of the period 2003-2007 we derive P(fx > wl|Xi) for the complete period. These predicted probabilities are then used for verification by means of the BSS.

Furthermore, the training is performed only on the data at one hour of the day, namely 0 h, 6 h, 12 h, or 18 h, whereas the predictions are derived for each full hour of the day. The reason for this procedure is primarily to reduce the time dependencies in the training data. Brabson and Palutikof (2000) demonstrate that correlated extreme events distort the shape parameter estimate, leading to anomalously high gust speed estimates. Although there exists a temporal dependency from one day to another, it is assumed here that this dependency is contained in the predictor variable and that the residuals are independent; in other words, that the hourly gusts are conditionally independent. Another reason is that we want to perform the training of the statistical models on a significantly smaller data set than the verification, in order to emphasize the differences between the methods. It turned out that training at 12 h and 18 h provides the most skilful models also for the other observation times (not shown), and that training at 0 h yields the lowest BSS. Variations in the diurnal cycle are ignored here.
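The leave-one-year-out procedure with training restricted to a single hour of the day can be sketched as follows. The record layout (dictionaries with the keys `year` and `hour`) and the function name are hypothetical; model fitting and prediction are left out.

```python
def leave_one_year_out(records, training_hour=12):
    """Yield (withheld year, training set, prediction set) for each year.

    records: iterable of dicts with at least the keys "year" and "hour".
    Training uses only the chosen hour of the remaining years;
    prediction uses every hour of the withheld year.
    """
    records = list(records)
    for year in sorted({r["year"] for r in records}):
        train = [r for r in records
                 if r["year"] != year and r["hour"] == training_hour]
        predict = [r for r in records if r["year"] == year]
        yield year, train, predict

# Tiny synthetic data set: two observation hours per year, 2003-2007.
records = [{"year": y, "hour": h} for y in range(2003, 2008) for h in (0, 12)]
folds = list(leave_one_year_out(records))
```

Collecting the predictions of all folds gives one out-of-sample probability per observation, which is what the verification then uses.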

4 The probabilistic analysis of wind gusts

4.1 The predictive skill – models

We start with an investigation of the seven approaches presented in section 3. The different statistical models are applied to each station separately, so we obtain a BSS for each of the 139 stations. The stations are ordered such that the BSS increases, in order to make the graphs visually separable. The BSS is thus not compared for each station separately. Rather, Figs. 1 and 2 represent estimates of the distribution function of the BSS over the stations for different methods (see Table 2) and for the warning levels 14 m/s, 18 m/s and 25 m/s. However, if the differences are significant then this is the case for a large majority of the stations. The training is performed for the 12 h observation time, and the prediction and verification are based on the 11 h–13 h observation times. As there are only few wind gusts above 25 m/s during summer, which hampers the verification for most of the stations, the summer plot for the BSS of the 25 m/s warning level is omitted. Wind gusts above even higher warning levels are too rare to be meaningfully verified.

The sampling uncertainty of the BSS is estimated using the bootstrap method (Efron and Tibshirani, 1993), where 1000 bootstrap samples are drawn from the predicted probabilities and observed fx. In order to account for the serial dependencies, we resampled while keeping data of two


successive days together. The 95 % sampling interval of the BSS is indicated by gray shading and is shown only for the predicted probabilities using GEVfx-ff and ff as predictor.

The differences between the methods are remarkable. For wl = 14 m/s, the methods that apply to the differences fx-ff perform better in terms of the BSS than those using the gust factor gf. The least skill is obtained with the approaches that apply directly to fx. The superiority of GEVfx-ff becomes evident at more extreme levels, whereas the glmG strongly degrades. The gamma distribution assumes an exponential decay of the probabilities for higher levels, comparable to a GEV with a zero shape parameter. However, the shape parameter estimates of the GEV are negative for most stations, which implies that the wind gusts are indeed bounded. The shape parameter is discussed in more detail below. The performance of the semi-parametric glmLR is very limited and strongly degrades for higher warning levels. This indicates the advantage of an appropriate parametric model, where the model training relies on the complete range of the values. In contrast, for glmLR the uncertainty in the estimation of the model parameters increases considerably the higher the warning level and the fewer events (fxi > wl) occur.

In order to enlarge the verification data set, Figs. 3 and 4 show the BSS for the probabilities estimated for 1–23 UTC with the statistical model trained at 12 UTC. The BSS of GEVfx-ff decreases only slightly, indicating that the model estimated from the 12 UTC observations is representative for the other times of the day. Even less sensitive are the approaches using the gust factor. Here, the BSS remains almost constant or even increases for the higher levels due to the larger sample. The methods using fx as response variable are less transferable to other times of the day. Thus, GEVfx-ff is the most appropriate statistical model to estimate the exceedance probabilities at large values. Results based on the ignorance score (Roulston and Smith, 2002) confirm this superiority (not shown), which is very robust for all warning levels and during winter and summer.

In order to assess the goodness-of-fit of the GEVfx-ff, we investigated the residual quantile plots (Coles, 2001) for each station separately (not shown). Although the largest few quantiles are underestimated by the GEVfx-ff model for many stations, the goodness-of-fit is reasonably good. As mentioned before, the shape parameter is not allowed to depend on the covariate. Friederichs (submitted) shows that the estimation of a variable shape parameter introduces large uncertainties into the model parameter estimates. And indeed, a variable shape parameter did not increase prediction skill; on the contrary, the uncertainty of the parameter estimates degrades the Brier score for all stations (not shown).
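The contrast drawn in this section between an exponential-type tail (zero shape parameter) and a bounded tail (negative shape parameter) can be illustrated with the GEV exceedance probability of Eq. (3.9). The sketch uses the standard GEV distribution function in the parametrization of Coles (2001); the parameter values are illustrative and not estimates from the study, and the names `loc`, `scale` and `shape` stand for α, σ and ξ.

```python
import math

def gev_cdf(x, loc, scale, shape):
    """GEV distribution function with location, scale (> 0) and shape xi."""
    z = (x - loc) / scale
    if abs(shape) < 1e-12:
        return math.exp(-math.exp(-z))        # Gumbel limit, xi -> 0
    t = 1.0 + shape * z
    if t <= 0.0:
        # Outside the support: below the lower bound (xi > 0)
        # or above the upper bound (xi < 0).
        return 0.0 if shape > 0 else 1.0
    return math.exp(-t ** (-1.0 / shape))

def gev_exceedance(wl, loc, scale, shape):
    """Eq. (3.9): probability that the response exceeds wl."""
    return 1.0 - gev_cdf(wl, loc, scale, shape)

def upper_endpoint(loc, scale, shape):
    """Finite upper bound loc - scale/shape for shape < 0, else infinity."""
    return loc - scale / shape if shape < 0 else math.inf

loc, scale = 5.0, 2.0                                  # illustrative values only
bound = upper_endpoint(loc, scale, -0.2)               # 5 + 2/0.2 = 15
p_bounded = gev_exceedance(16.0, loc, scale, -0.2)     # level beyond the bound
p_gumbel = gev_exceedance(16.0, loc, scale, 0.0)       # exponential-type tail
```

With a negative shape parameter the exceedance probability is exactly zero beyond the upper endpoint, whereas the zero-shape case still assigns a small positive probability to the same level.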

Table 3: Predictor variables and abbreviations.

ff         observed mean wind at a station
V10        wind velocity at 10 m above ground
V925       wind velocity at 925 hPa
Vmean      vertical mean wind ((V10 + V925)/2)
Vshear     wind shear (V10 − V925)
W850       vertical wind velocity at 850 hPa
VO850      relative vorticity at 850 hPa
CAPE       convectively available potential energy
DCAPE      available potential energy of a downdraft
Prox ff    mean wind observations at stations located within a radius of 100 km
Prox fx    wind gust observations at stations located within a radius of 100 km

Table 4: Correlation between the BSS of the probability estimates at the station using V10 as predictor and its seasonal mean wind, latitude and altitude, and the correlation with the GEVfx-ff hyperparameter estimates of γ0, γ1, ρ0, ρ1 and ξ (see Eq. 3.8).

Columns: Winter, Summer. Rows: BSS (14 m/s), BSS (18 m/s), BSS (25 m/s) for each of ff mean, Latitude and Altitude, followed by γ0, γ1, ρ0, ρ1, ξ.

ff mean              0.72  0.82  0.78  0.78  0.34  0.53
Latitude             0.08  0.27  0.20  0.17  0.08  -0.09
Altitude             0.23  0.17  0.25  0.18  0.20  0.14
γ0, γ1, ρ0, ρ1, ξ    0.35  0.51  0.75  0.41  0.14
                     0.22  0.08  0.61  0.20  0.05

4.2 The predictive skill – predictors

Due to the superiority of GEVfx-ff, the different predictor variables are investigated only with GEVfx-ff. Figs. 5 and 6 display the BSS for different predictor variables, which are listed in Table 3. The ECMWF operational analyses of V10, V925, VO850 and W850 are taken at the grid point nearest to the station. CAPE and DCAPE are calculated from the temperature and relative humidity profile of the nearest grid point column. CAPE is the maximum buoyancy that a parcel experiences if it is lifted from 1000 hPa with constant equivalent potential temperature; likewise, DCAPE is

Meteorol. Z., 18, 2009

the maximum acceleration that a parcel experiences if it is moved downward from 500 hPa with constant pseudo-equivalent potential temperature. With ff as the only predictor, the BSS for wl = 14 m/s ranges between 20 % and 80 % (Figs. 5 and 6). That V10 performs better than ff, at least for the lower warning levels, might be due to the quality of the ff observations: first, they are raw observations and not a smoothed analysis, and second, the original resolution of ff is 1 m/s, whereas V10 is quasi-continuous. No additional skill is obtained when V925, Vmean or Vshear are added as predictors (not shown); rather, the model becomes more complex and skill is slightly reduced for high warning levels. CAPE and DCAPE, VO850 and W850 fail to provide an informative predictor. Nevertheless, they might provide some information, and for winter, CAPE and DCAPE are more skilful than W850 or VO850. However, the skill enhancement from including CAPE and DCAPE as predictors in addition to ff + Prox fx is not significant (not shown). The best results in terms of the BSS are obtained with a combined predictor containing ff at the respective station and fx at the stations located within a radius of 100 km (Prox fx). The BSS (Figs. 1 to 6) shows large differences across the stations, which largely exceed the sampling uncertainty. One reason for those large differences might be the data quality. Another reason is local influences that weaken the relationship between the large-scale circulation and gusts; e.g., surface roughness might increase turbulence and hence reduce predictability. Interactions with the orography might induce dependencies on wind direction over complex terrain that are ignored in the statistical model presented here. The correlations between the BSS and the altitude are indeed different from zero and vary between 0.14 and 0.25 (Tab. 4), which is marginally significant. The dependency of the BSS on the latitude is even less significant.
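To illustrate what "marginally significant" can mean for correlations of this size, the sketch below computes the standard t-statistic of a Pearson correlation coefficient. It assumes n = 139 independent stations (the count quoted in the conclusions) and a two-sided 5 % test with a critical value of roughly 1.98. The study does not state which test was applied, so this is only an illustration under these assumptions.

```python
import math


def correlation_t_statistic(r, n):
    """t-statistic for H0: zero correlation, given a sample correlation r from n pairs."""
    if n <= 2:
        raise ValueError("need more than two pairs")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * math.sqrt((n - 2) / (1.0 - r * r))


N_STATIONS = 139       # assumption: station count mentioned in the conclusions
T_CRITICAL = 1.98      # approximate two-sided 5 % value for about 137 degrees of freedom

for r in (0.14, 0.25):  # range of BSS-altitude correlations quoted in the text
    t = correlation_t_statistic(r, N_STATIONS)
    print(f"r = {r:.2f}: t = {t:.2f}, exceeds critical value: {abs(t) > T_CRITICAL}")
```

Under these assumptions the lower end of the quoted range stays below the critical value while the upper end exceeds it, which is in line with the cautious wording above.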
The BSS is also related to the climatology of the mean wind. Tab. 4 displays the correlations between the seasonal mean wind (i.e. the mean over all 10 min mean values ff for winter and summer, separately) and the BSS, which amount to 0.72 (wl = 14 m/s, winter) and 0.82 (wl = 14 m/s, summer). The correlations are smaller for wl = 25 m/s with 0.34 (winter) and 0.53 (summer). If the mean wind is larger, then the probability of observing a wind gust above a certain warning level increases. Thus, the less rare an event, the better the estimate of an exceedance probability. Unlike the BSS, which depends only weakly on the altitude of the station, the parameter estimates of the non-stationary GEV model show large dependencies. The location and scale parameters show significant correlations with the altitude of the station, which are stronger during winter than during summer. Particularly the correlations to γ1 and ρ1 are consistent with the correlations between BSS and altitude.¹ γ1 (ρ1) is the hyperparameter responsible for the slope of the location (scale) parameter of the GEV with respect to the predictor ff. The investigation of the local variations of the GEV parameters is an important aspect but beyond the scope of this study. However, it becomes indispensable when the station data are pooled together in order to obtain more stable estimates, particularly for the higher warning levels. Then those local dependencies have to be modeled explicitly, which is briefly discussed in the conclusions (Section 5).

¹ Note that the predictor variable (here ff) is normalized.

Figure 5: BSS of probability estimates for each station using GEVfx-ff with different predictors during winter. The training was performed for 12 UTC, the estimation and verification for 1–23 UTC using cross-validation. The 95 % uncertainty interval of the BSS for GEVfx-ff using ff as predictor as estimated by the bootstrap method is indicated by the shaded area. The warning levels are a) 14 m/s, b) 18 m/s, and c) 25 m/s. For the abbreviations of the predictors see Tab. 3. (Axes: station against BSS. Legend: ff, V10, W850, VO850, CAPE + DCAPE, Prox fx, ff + Prox fx.)

Figure 6: Same as Fig. 5 but during summer. The warning levels are a) 14 m/s and b) 18 m/s.

4.3 The character of wind gusts

The shape parameter shows no significant correlation with the altitude. However, as it is the essential parameter for the character of the extremes, it is displayed in Fig. 7. The shape parameter estimates as derived with fx-ff (black dots) as well as with fx (gray dots) are shown. With only a few exceptions, the shape parameter estimates are negative and vary between zero and −0.3. For fx the shape parameter estimates are even more negative than for fx-ff. If the shape parameter of the GEV is zero, then the support of the GEV is (−∞, ∞). However, for ξ ≠ 0 the GEV is defined for y satisfying 1 + ξ(y − µ)/σ > 0 (compare Appendix, Eq. 6.3, where the location parameter µ is written α), where y is the response variable fx-ff. Thus, if ξ is negative, then y has an upper end point w = µ + σ/|ξ|. In Fig. 8 the upper bound w is displayed against the predictor ff. The shaded areas indicate the upper bound estimates for 95 % of the stations for winter and summer, respectively. Additional lines give the estimates of the outermost stations. The mean wind ff reaches higher values during winter, with about 39 m/s, than during summer, with highest values of about 30 m/s. For most stations the upper bound has no practical consequence, as values above 100 m/s are very unlikely to ever occur. However, for the stations with the lowest upper bound estimates it is definitely relevant and ranges between 10 m/s for ff = 0 and 45 m/s for ff = 15 m/s. Note, however, that the upper bound estimates themselves are uncertain due to the uncertainty of the parameter estimates, which increases strongly with ff.

Figure 7: GEV shape parameter estimated using GEVfx-ff (black dots) and GEVfx (gray dots) and 1σ confidence interval for a) winter and b) summer. Here, GEVfx-ff is applied to fx-ff and GEVfx to fx with ff as predictor. (Axes: station, ordered by ff, against shape parameter.)

Physically, such negative shape parameter estimates suggest that wind gusts are bounded by an upper limit. This contrasts with physical considerations, which would assume wind gusts to display a heavy-tailed distribution, as they are consequences of large-scale turbulence and dissipation. Why the data exhibit bounded distributions remains an open question. It might be due to the observing process. Non-negligible serial dependencies might also be responsible for the negative shape parameter estimates for wind gusts, which are indeed observed in several studies, e.g. for hourly maxima at High Bradfield, UK (Fawcett and Walshaw, 2006b), daily maximum wind speed at the German gauging station Würzburg (Payer and Küchenhoff, 2004) or hourly gusts from Shetland and Scotland (Brabson and Palutikof, 2000).²

² There are two common conventions in extreme value theory for the representation of the shape parameter; Brabson and Palutikof (2000) use the κ parameter with κ = −ξ.

The most probable reason for the negative shape parameter, however, is that hourly wind maxima are not within the ultimate asymptotic limit of the block maxima distribution. Furrer and Katz (2008) describe the concept of penultimate approximations: although the process limit is the Gumbel distribution, the more accurate approximation for the respective block maxima distribution has a non-zero shape parameter. They provide an estimate of the shape parameter of the penultimate approximation for a Weibull distributed random variable with shape parameter k as

$$ \xi_n = \frac{1-k}{k \ln n}, \tag{4.1} $$

where n is the block size (or effective degrees of freedom). Van den Brink and Können (2008) estimated the shape parameter k of the Weibull distribution for ERA40 reanalysis 10 m wind velocity and obtained values between k = 1.4 and k = 2.4 (Van den Brink and Können, 2008, auxiliary material). When assuming an effective number of 10 (5) degrees of freedom within one month and k = 1.7, the penultimate GEV shape parameter amounts to ξn = −0.14 (−0.21). These values correspond well to those obtained for the hourly wind gusts. Furthermore, the shape parameter estimate becomes less negative with increasing block size.
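The dependence of the penultimate shape parameter on k and n can be explored with a few lines of Python. The sketch below simply evaluates Eq. (4.1); the function name and the grid of k and n values are assumptions made for illustration and are not fitted values from the study.

```python
import math


def penultimate_shape(k, n):
    """Evaluate Eq. (4.1): xi_n = (1 - k) / (k * ln n)."""
    if k <= 0:
        raise ValueError("Weibull shape k must be positive")
    if n <= 1:
        raise ValueError("block size n must exceed 1 so that ln n > 0")
    return (1.0 - k) / (k * math.log(n))


# Illustrative grid of Weibull shapes and block sizes (assumed values).
for k in (1.4, 1.5, 1.7, 2.4):
    row = ", ".join(f"n={n}: {penultimate_shape(k, n):+.3f}" for n in (5, 10, 100))
    print(f"k={k}: {row}")
```

For instance, k = 1.5 gives roughly −0.14 for n = 10 and −0.21 for n = 5. Any k > 1 yields a negative shape parameter whose magnitude shrinks as n grows, which is the behavior described in the text.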

Figure 8: Upper bound w against predictor ff for winter and summer, estimated with the GEV applied to fx-ff using ff as predictor. The shaded area contains the upper end point estimates for 95 % of the stations (not cross-validated) for winter and summer, respectively. The additional solid (dashed) lines give the estimates of the outermost stations during winter (summer), i.e. the stations with the lowest or highest upper end points. (Axes: ff in m/s against upper end point in m/s.)

4.4 Example

We continue with an example for the GEVfx-ff approach using ff and Prox fx as predictors. Fig. 9 shows the linearly interpolated probability estimates (gray shading) for 17 h³ on the day of the winter storm Kyrill. Linear interpolation is the easiest way to derive an area-wide analysis of wind gusts. A discussion on how to derive more appropriate probability estimates is provided in the conclusions (Section 5). Fig. 9 also indicates by bullets whether the respective warning level was exceeded by the observations.

³ After 17 h, some stations reported missing values.

Formed over Newfoundland on January 15, 2007, Kyrill evolved into the most severe winter storm to hit Europe since Lothar in 1999. It produced its largest damages in Germany during the late afternoon. Wind gusts as high as 40 m/s (144 km/h) in Düsseldorf and 54 m/s (194 km/h) on the Brocken in the Harz mountains were recorded by the DWD. At 17 h, all stations registered wind gusts above 14 m/s, and except for 13 stations even wind gusts above 18 m/s were recorded. Likewise, the estimates of the exceedance probabilities for the 14 m/s warning level are almost everywhere above 0.95 or even 1, which is the case when the mean wind already amounts to 14 m/s. The probabilities for the 18 m/s warning level are smaller but still large over wide areas. An exceedance of the 25 m/s warning level is still recorded at more than half of the stations. Regions with high probabilities and observed exceedances largely agree, reflecting the goodness of the probability estimates.

Fig. 10 shows the estimated probabilities at wl = 14 m/s (gray bars) and the observed wind gusts (dots) for the Frankfurt/Main station for each hour during January 2007. If, for example, an event is assumed to occur at a probability P(fx > 14 m/s) > 0.1, the hit rate is 0.95 together with a false alarm rate of about 0.05. The BSS amounts to 0.53 (not cross-validated); the Brier score amounts to 0.046 and splits into an uncertainty component of about 0.0264, a resolution component of about 0.0163 and a reliability component of 0.0045. The Frankfurt/Main station has 7 neighboring stations within a radius of 100 km, denoted by a superscript, and together with the predictor ff the statistical model reads

$$ P(\mathrm{fx} > wl \mid X_i) = 1 - F_{\mathrm{GEV}}(wl - \mathrm{ff}_i \mid \alpha_i, \sigma_i; \xi) \tag{4.2} $$

and

$$
\begin{aligned}
\alpha_i ={}& 2.67(\pm 0.05) + 0.09(\pm 0.12)\,\mathrm{ff}_i - 0.02(\pm 0.09)\,\mathrm{fx}^1_i - 0.05(\pm 0.08)\,\mathrm{fx}^2_i + 0.41(\pm 0.10)\,\mathrm{fx}^3_i \\
& + 0.04(\pm 0.11)\,\mathrm{fx}^4_i + 0.16(\pm 0.10)\,\mathrm{fx}^5_i + 0.16(\pm 0.09)\,\mathrm{fx}^6_i + 0.27(\pm 0.03)\,\mathrm{fx}^7_i \\
\sigma_i ={}& 1.10(\pm 0.03) - 0.10(\pm 0.7)\,\mathrm{ff}_i + 0.10(\pm 0.6)\,\mathrm{fx}^1_i - 0.06(\pm 0.7)\,\mathrm{fx}^2_i + 0.02(\pm 0.5)\,\mathrm{fx}^3_i \\
& - 0.01(\pm 0.7)\,\mathrm{fx}^4_i + 0.04(\pm 0.8)\,\mathrm{fx}^5_i + 0.10(\pm 0.8)\,\mathrm{fx}^6_i + 0.04(\pm 0.6)\,\mathrm{fx}^7_i \\
\xi ={}& -0.05(\pm 0.02),
\end{aligned}
\tag{4.3}
$$

with fx at stations 1 Bendorf, 2 Bad Marienberg, 3 Giessen, 4 Hahn, 5 Michelstadt, 6 Weinbiet, and 7 Mannheim. The uncertainty of the parameter estimates is indicated by ± the standard error. Bold coefficients are significantly different from zero at the 95 % significance level. The dependence on ff is not significant, either for the location parameter α or for the scale parameter σ of the non-stationary GEV. The largest contribution to the non-stationarity results from the wind gusts at the stations Giessen and Mannheim. However, they only affect the location parameter, whereas the contribution to the non-stationary scale parameter is weak. In order to obtain a more stable model, a regression screening could be applied to filter out the less informative neighboring stations.
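The way Eqs. (4.2) and (4.3) combine into a single exceedance probability can be sketched in Python as follows. The sketch takes the printed point estimates at face value and treats both location and scale as linear in the covariates. It further assumes that the covariates entering Eq. (4.3) are normalized while the ff subtracted from wl is in m/s. The normalization itself, the function names and the example inputs are hypothetical.

```python
import math

# Point estimates as printed in Eq. (4.3): intercept, ff, fx at stations 1..7.
ALPHA = (2.67, 0.09, -0.02, -0.05, 0.41, 0.04, 0.16, 0.16, 0.27)
SIGMA = (1.10, -0.10, 0.10, -0.06, 0.02, -0.01, 0.04, 0.10, 0.04)
XI = -0.05


def gev_cdf(y, loc, scale, shape):
    """GEV distribution function G(y) as in Eq. (6.3)."""
    z = (y - loc) / scale
    if abs(shape) < 1e-12:
        return math.exp(-math.exp(-z))
    t = 1.0 + shape * z
    if t <= 0.0:
        # Outside the support: above the upper end point (shape < 0)
        # or below the lower end point (shape > 0).
        return 1.0 if shape < 0 else 0.0
    return math.exp(-t ** (-1.0 / shape))


def exceedance_probability(wl, ff, predictors):
    """P(fx > wl | X) = 1 - F_GEV(wl - ff | alpha, sigma; xi), Eq. (4.2).

    ff is the mean wind in m/s; predictors holds the eight covariates
    (ff, fx^1, ..., fx^7), assumed here to be normalized.
    """
    if len(predictors) != 8:
        raise ValueError("expected eight covariates: ff and fx at seven stations")
    loc = ALPHA[0] + sum(a * x for a, x in zip(ALPHA[1:], predictors))
    scale = SIGMA[0] + sum(s * x for s, x in zip(SIGMA[1:], predictors))
    if scale <= 0.0:
        raise ValueError("non-positive scale; covariates outside the usable range")
    return 1.0 - gev_cdf(wl - ff, loc, scale, XI)


# Hypothetical situation: all normalized covariates at zero, mean wind of 8 m/s.
p = exceedance_probability(14.0, 8.0, [0.0] * 8)
print(f"P(fx > 14 m/s) = {p:.3f}")
```

Raising ff toward the warning level drives the probability toward one, in line with the remark above about stations where the mean wind already amounts to 14 m/s.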

Figure 9: Estimated wind gust probabilities (gray shading) for a) 14 m/s, b) 18 m/s and c) 25 m/s warning level and observed gusts (dots) during the Kyrill storm on January 18th, 2007 at 17 UTC. The probabilities are estimated using GEVfx-ff and ff and Prox fx as predictors. (Map axes: lon, lat.)

Figure 10: Estimated wind gust probabilities (bars) for 14 m/s warning level and observed gusts (dots) for the Frankfurt/Main station during January 2007. The probabilities are estimated using GEVfx-ff and ff and Prox fx as predictors. (Axes: time against probability of fx > 14 m/s and wind gusts in m/s.)

4.5 Towards an area-wide analysis

The results displayed in Figs. 5 and 6 give an upper bound for the predictability of the wind gusts given the predictor variables ff and Prox fx. However, if an area-wide analysis is desired, then the observed mean wind ff is missing at most locations. As the probability estimates are much better when the response is fx-ff rather than fx, ff has to be replaced by some area-wide predictor, such as the ECMWF analysis of the 10 m wind (V10). In addition to fx-V10 we also tested fx-Prox ff, where Prox ff is the mean of the mean wind ff observed at the stations within a radius of 100 km. The results for different predictor variables are displayed in Fig. 11 for the 14 m/s warning level. Clearly less skill is obtained when fx-V10 instead of fx-ff is used as response variable. If only V10 is used as predictor variable, the BSS is significantly lower than for the approach from Figs. 5 and 6 with the predictor ff. Including Prox fx as predictor increases the skill of the probability estimates, which during winter almost reaches the skill of ff with fx-ff. Further improvement is obtained by including the mean ff observed at the stations within a radius of 100 km (Prox ff). No skill is lost if Prox ff is used instead of V10; on the contrary, it seems to improve the estimates during summer. Thus, even if ff is unknown, fx-V10 and fx-Prox ff give skilful probability estimates, at least for the winter gusts. During summer it seems that the knowledge of ff is more important. All estimates that are derived using the differences fx-ff, fx-V10 or fx-Prox ff provide significantly more skill than estimates derived from the wind gusts fx only (not shown).
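A proximity predictor such as Prox ff can be sketched as the average over all stations within 100 km of a target location. The Python example below uses the haversine great-circle distance on a spherical Earth. The distance measure, the exclusion of the target location itself, the function names and the station data are assumptions of this sketch, since the text only specifies the 100 km radius.

```python
import math

EARTH_RADIUS_KM = 6371.0


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2.0 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def proximity_mean(lat, lon, neighbors, radius_km=100.0):
    """Mean of the values observed at stations within radius_km of (lat, lon).

    neighbors is an iterable of (lat, lon, value) tuples; the target location
    itself is not part of it. Returns None if no station lies inside the radius.
    """
    inside = [value for (nlat, nlon, value) in neighbors
              if great_circle_km(lat, lon, nlat, nlon) <= radius_km]
    return sum(inside) / len(inside) if inside else None


# Hypothetical stations: rough coordinates and made-up mean winds in m/s.
stations = [
    (50.60, 8.65, 6.0),    # roughly 60 km north of the target
    (49.50, 8.55, 8.0),    # roughly 60 km south of the target
    (53.60, 10.00, 12.0),  # several hundred km away, outside the radius
]
prox_ff = proximity_mean(50.05, 8.60, stations)
print(prox_ff)
```

The same function applied to gust observations instead of mean winds would give a simple scalar variant of Prox fx, although the example in Eq. (4.3) uses the neighboring gusts as separate covariates rather than as an average.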

Figure 11: BSS of probability estimates for each station using the non-stationary GEV with different predictors. In contrast to Figs. 5 and 6, the GEV is applied to fx-V10 or fx-Prox ff, respectively. The training was performed for 12 UTC, the estimation and verification for 1–23 UTC using cross-validation. The 95 % uncertainty interval of the BSS as estimated by the bootstrap method is indicated by the shaded area. The warning levels are 14 m/s for a) winter and b) summer. (Axes: station against BSS. Legend: ff; V10 (fx−V10); Prox fx (fx−V10); V10 + Prox fx (fx−V10); Prox ff + V10 + Prox fx (fx−V10); Prox ff + V10 + Prox fx (fx−Prox ff).)

5 Conclusions

Several approaches based on linear statistical modeling using extreme value theory, generalized linear models and quantile regression are applied in order to derive conditional probabilities of the occurrence of wind gusts above a certain warning level. The study shows that the approaches that use the differences fx-ff perform best in terms of the Brier skill score. Those statistical approaches that model fx directly show a stronger diurnal cycle, with the consequence that the performance degrades strongly when the statistical model is trained for one time of the day and applied to another. Here, using fx-ff and particularly gf results in statistical models that are much less dependent on the training hour. The non-stationary GEV model for fx-ff performs much better than the other approaches, particularly for higher warning levels. This study thus clearly shows the benefit of using extreme value theory as the most appropriate and theoretically consistent statistical model. It indicates that logistic regression, which is generally used in MOS, is less appropriate for high warning levels, i.e. in the tails of the distributions. The results encourage the use of extreme value distributions also within a probabilistic MOS.

Among the ECMWF predictors, the most informative is the 10 m wind velocity, whereas the vertical wind at 850 hPa and the wind shear are not informative. The best results in terms of the BSS are obtained with a combined predictor containing ff at the respective station and fx at the stations located within a radius of 100 km (Prox fx). In cases where an area-wide analysis is desired, the observed mean wind ff is missing at most locations. V10 from the ECMWF operational analysis or the mean over the mean wind recorded at the stations within a radius of 100 km are skilful replacements for ff. Even better results might be obtained using predictors from high-resolution model analyses, provided that the training data set is long enough.

The analysis of wind gusts presented here constitutes a first step towards an area-wide analysis of wind gust probabilities. Further aspects have to be included in the statistical model approach. First, local characteristics such as altitude, surface roughness or topography need to be included, in the sense that the statistical model captures not only the non-stationary response to a predictor variable but also the locally varying but stationary influence of surface parameters, or even interactions between local and time-varying conditions. Second, a better representation of the seasonal and diurnal cycle is desired. This study largely ignores the diurnal cycle, and the seasonal cycle is accounted for only by separating the data set into a warm and a cold season. A better representation has the potential to increase the skill of the statistical model. We also circumvent the problem of temporal dependence by using only one time of the day for the training of the statistical model. A powerful tool to model serial dependence is a first-order Markov model (Fawcett and Walshaw, 2006b). Third, a further step is needed for the verification of warnings that are given for a certain area. In order to estimate the probability of observing a wind gust at least once somewhere in a county or region, an estimate of the covariance structure of wind gust observations is needed. Whether a data basis consisting of 139 stations over an area as large as Germany suffices for this purpose is an open question. Here we see the necessity to explore multivariate distributions such as copulas (Nelsen, 2006; Schoelzel and Friederichs, 2008) and geostatistical methods in order to model the space-time behavior of wind gusts. The more complex the statistical model, the less reliable is the maximum likelihood procedure for estimating the model parameters. The estimation procedure might employ Bayesian hierarchical modeling instead, as it allows the model complexity to be increased successively (Fawcett and Walshaw, 2006a).

6 Appendix: Extreme value theory

Extreme value theory (EVT) is based on a limit law that was first described by Fisher and Tippett in 1928. Let us define the maximum M_n of a finite sequence of length n of independent and identically distributed (i.i.d.) random variables Z_i as

$$ M_n = \max(Z_1, \ldots, Z_n). \tag{6.1} $$

The Fisher-Tippett theorem says that if there exist sequences of constants a_n > 0 and b_n such that the distribution function of (M_n − b_n)/a_n converges to a non-degenerate limit, then this limit is a generalized extreme value (GEV) distribution:

$$ \mathrm{Prob}\left(\frac{M_n - b_n}{a_n} \le y\right) \to G(y) \quad \text{as } n \to \infty. \tag{6.2} $$

This theorem leads to the limiting distribution for sample maxima. The GEV distribution reads

$$
G(y) =
\begin{cases}
\exp\left(-\left(1 + \xi\,\dfrac{y-\alpha}{\sigma}\right)^{-1/\xi}\right), & \xi \neq 0 \\[2ex]
\exp\left(-\exp\left(-\dfrac{y-\alpha}{\sigma}\right)\right), & \xi = 0
\end{cases}
\tag{6.3}
$$

with 1 + ξ(y − α)/σ > 0 for ξ ≠ 0. It is represented by three parameters: the location α, the scale σ and the shape ξ. If the maxima of a process Z follow a GEV, then the process Z is said to be in the domain of attraction of an extreme value distribution, Z ∈ D(G_ξ). This is the general condition under which EVT applies.

The shape parameter ξ characterizes the behavior of the extremes. If ξ = 0, the process Z is in the domain of attraction of a Gumbel-type GEV (type I) with a distributional tail that has an exponential decay. For ξ > 0, Z belongs to a Fréchet-type GEV (type II). Its tail decays following a power law and hence much more slowly than exponentially as z → ∞. This implies that very large extremes retain a non-negligible probability. In contrast, a Weibull-type GEV (type III) with ξ < 0 is bounded by an upper end point, and the probability of an extreme occurring beyond this end point is zero.

Another way of assessing an extremal process is the peak-over-threshold approach. It only differs in the representation, not in the underlying extreme value theory. EVT proves that excesses Y_i over a large threshold u, with Y_i = Z_i − u and Z_i > u, asymptotically follow a generalized Pareto distribution (GPD)

$$
H(y) =
\begin{cases}
1 - \left(1 + \xi\,\dfrac{y}{\sigma_u}\right)^{-1/\xi}, & \xi \neq 0 \\[2ex]
1 - \exp\left(-\dfrac{y}{\sigma_u}\right), & \xi = 0
\end{cases}
\tag{6.4}
$$

defined on y > 0 and 1 + ξ y/σ_u > 0. The parameters of the GPD are the threshold u, which has to be determined a priori, a scale parameter σ_u = σ + ξ(u − α) depending on u, and the shape parameter ξ, which only depends on the domain of attraction D(G_ξ), not on the representation of the extremal process (i.e., GEV and GPD have an identical ξ). For further insights, the reader is referred to the textbooks on extreme value theory (e.g. Coles, 2001; Beirlant et al., 2004).
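The relation between the two representations can be checked numerically. The following Python sketch implements the GPD of Eq. (6.4), the threshold-dependent scale σ_u = σ + ξ(u − α), and the upper end point implied by a negative shape parameter. It then verifies that the end point derived from the GPD (threshold plus largest possible excess) coincides with the one derived from the GEV. The parameter values and function names are assumptions chosen for illustration.

```python
import math


def gev_upper_endpoint(loc, scale, shape):
    """Upper end point of the GEV; finite only for a negative shape parameter."""
    return loc - scale / shape if shape < 0 else math.inf


def gpd_scale(loc, scale, shape, u):
    """Threshold-dependent GPD scale: sigma_u = sigma + xi * (u - alpha)."""
    return scale + shape * (u - loc)


def gpd_cdf(y, scale_u, shape):
    """GPD distribution function H(y) of Eq. (6.4) for an excess y over u."""
    if scale_u <= 0.0:
        raise ValueError("scale_u must be positive")
    if y <= 0.0:
        return 0.0
    if abs(shape) < 1e-12:
        return 1.0 - math.exp(-y / scale_u)
    t = 1.0 + shape * y / scale_u
    if t <= 0.0:
        # Only possible for shape < 0: the excess lies beyond the end point.
        return 1.0
    return 1.0 - t ** (-1.0 / shape)


# Illustrative parameters (assumed for this sketch).
alpha, sigma, xi, u = 2.67, 1.10, -0.05, 10.0
sigma_u = gpd_scale(alpha, sigma, xi, u)
w_gev = gev_upper_endpoint(alpha, sigma, xi)
w_gpd = u - sigma_u / xi  # threshold plus the largest possible excess
print(w_gev, w_gpd)
```

Both end points agree because u − σ_u/ξ = u − σ/ξ − (u − α) = α − σ/ξ, which is independent of the chosen threshold.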

Acknowledgments

The authors are grateful to Armin Mathes, Andreas Hense and Christian Schölzel for helpful discussions.


References

Applequist, S., G. Gahrs, R. Pfeffer, X. Niu, 2002: Comparison of methodologies for probabilistic quantitative precipitation forecasting. – Wea. Forecast. 17, 783–799.
Beirlant, J., Y. Goegebeur, J. Segers, J. Teugels, 2004: Statistics of Extremes. – Wiley, Chichester, 490 pp.
Brabson, B., J. Palutikof, 2000: Tests of the generalized Pareto distribution for predicting extreme wind speeds. – J. Appl. Meteor. 39, 1627–1640.
Bradley, A., S. Schwartz, T. Hashino, 2008: Sampling uncertainty and confidence intervals for the Brier score and Brier skill score. – Wea. Forecast. 23, 992–1006.
Brasseur, O., 2001: Development and application of a physical approach to estimating wind gusts. – Mon. Wea. Rev. 129, 5–25.
Brier, G.W., 1950: Verification of forecasts expressed in terms of probability. – Mon. Wea. Rev. 78, 1–3.
Coles, S., 2001: An Introduction to Statistical Modeling of Extreme Values. – Springer Series in Statistics. Springer-Verlag, London, 208 pp.
Efron, B., R.J. Tibshirani, 1993: An Introduction to the Bootstrap. – Chapman & Hall, 436 pp.
European Centre for Medium-Range Weather Forecasts, 2006-2007: ECMWF operational analysis data, [Internet]. – British Atmospheric Data Centre, available at http://badc.nerc.ac.uk/data/ecmwf-op/.
Fahrmeir, L., G. Tutz, 2001: Multivariate Statistical Modelling Based on Generalized Linear Models. – Springer Series in Statistics. Springer, New York, 548 pp.
Fawcett, L., D. Walshaw, 2006a: A hierarchical model for extreme wind speeds. – J. Roy. Statist. Soc. Ser. C 55, 631–646.
Fawcett, L., D. Walshaw, 2006b: Markov chain models for extreme wind speeds. – Environmetrics 17, 795–809.
Friederichs, P., submitted: Statistical downscaling of extreme precipitation using extreme values theory. – Extremes.
Friederichs, P., A. Hense, 2007: Statistical downscaling of extreme precipitation events using censored quantile regression. – Mon. Wea. Rev. 135, 2365–2378.
Friederichs, P., A. Hense, 2008: A probabilistic forecast approach for daily precipitation totals. – Wea. Forecast. 23, 659–673.
Furrer, E., R.W. Katz, 2008: Improving the simulation of extreme precipitation events by stochastic weather generators. – Water Resour. Res. 44, W12439.
Gerth, W., J. Christoffer, 1994: Windkarten von Deutschland. – Meteorol. Z. 3, 67–77.
Gneiting, T., A.E. Raftery, 2007: Strictly proper scoring rules, prediction, and estimation. – J. Amer. Statist. Assoc. 102, 359–378.
Gray, M.E.B., 2003: The use of a cloud resolving model in the development and evaluation of a probabilistic forecasting algorithm for convective gusts. – Meteorological Applications 10, 239–252.
Gumbel, E.J., 1958: Statistics of extremes. – Columbia University Press, New York, 375 pp.
Heneka, P., T. Hofherr, B. Ruck, C. Kottmeier, 2006: Winter storm risk of residential structures – model development and application to the German state of Baden-Württemberg. – Nat. Hazards Earth Syst. Sci. 6, 721–733.
Jolliffe, I.T., D.B. Stephenson, 2003: Forecast Verification: A Practitioner's Guide in Atmospheric Science. – John Wiley and Sons, Chichester, 240 pp.
Jungo, P., S. Goyette, M. Beniston, 2002: Daily wind gust speed probabilities over Switzerland according to three types of synoptic circulation. – Int. J. Climatol. 22, 485–499.
Klawa, M., U. Ulbrich, 2003: A model for the estimation of storm losses and the identification of severe winter storms in Germany. – Nat. Hazards Earth Syst. Sci. 3, 725–732.
Koenker, R., 2005: Quantile regression, volume 38 of Econometric Society Monographs. – Cambridge University Press, 349 pp.
Koenker, R., B. Bassett, 1978: Regression quantiles. – Econometrica 46, 33–49.
Mathes, A., P. Friederichs, A. Hense, 2008: Towards a quality control of precipitation data. – Meteorol. Z. 17, 733–749.
McCullagh, P., J. Nelder, 1999: Generalized Linear Models, volume 37 of Monographs on Statistics and Applied Probability. – Chapman & Hall/CRC, 511 pp.
Munich Re Group, 2005: Topics Geo – Annual review: Natural catastrophes 2004. – available at www.munichre.com, 56 pp.
Murphy, A.H., 1973: Hedging and skill scores for probability forecasts. – J. Appl. Meteor. 12, 215–223.
Nakamura, K., R. Kershaw, N. Gait, 1996: Prediction of near-surface gusts generated by deep convection. – Meteor. Appl. 3, 157–167.
Nelsen, R.B., 2006: An Introduction to Copulas. – Springer Verlag, New York, 269 pp.
Palutikof, J., B. Brabson, D. Lister, S. Adcock, 1999: A review of methods to calculate extreme wind speeds. – Meteor. Appl. 6, 119–132.
Payer, T., H. Küchenhoff, 2004: Modelling extreme wind speeds at a German weather station as basic input for a subsequent risk analysis for high speed trains. – J. Wind Engineer. Indust. Aerodynam. 92, 241–261.
Perrin, O., H. Rootzen, R. Taessler, 2006: A discussion of statistical methods for estimation of extreme wind speeds. – Theor. Appl. Climatol. 85, 203–215.
Pryor, S.C., J.T. Schoof, R.J. Barthelmie, 2005: Empirical downscaling of wind speed probability distributions. – J. Geophys. Res. 110, D19109.
Roulston, M., L. Smith, 2002: Evaluating probabilistic forecasts using information theory. – Mon. Wea. Rev. 130, 1653–1660.
Schoelzel, C., P. Friederichs, 2008: Multivariate non-normally distributed random variables in climate research – introduction to the copula approach. – Nonlin. Processes Geophys. 15, 761–772.
Van den Brink, H.W., G.P. Können, 2008: The statistical distribution of meteorological outliers. – Geophys. Res. Lett. 35, L23702.
Verkaik, J.W., 2000: Evaluation of two gustiness models for exposure correction calculations. – J. Appl. Meteor. 39, 1613–1626.
Walshaw, D., 1994: Getting the most from your extreme wind data: A step by step guide. – J. Res. Natl. Stand. Technol. 99, 399–411.
Weggel, J.R., 1999: Maximum daily wind gusts related to mean daily wind speed. – J. Struct. Eng. – ASCE 125, 465–468.
