Caribbean Storm Surge Return Periods:
Final Report

Organization of American States
General Secretariat
Unit for Sustainable Development and Environment
USAID-OAS Caribbean Disaster Mitigation Project

December 1997


This report was prepared by
Mark E. Johnson
Institute of Statistics, University of Central Florida

 

1. Introduction

Storm surge return periods are useful to land use planners and emergency managers for assessing the likelihood of extreme water depths associated with tropical cyclones. At a given location, it is desirable to determine sound statistical estimates of return periods (typically for 10-, 25- and 50-year epochs) together with corresponding assessments of uncertainty. The goal is to produce statistical estimates that constitute the best scientific prediction of the return-period surge, with a standard error that reflects both the limitations of the historical record and the effects of parameter estimation. In this report, a methodology is given for forecasting maximum storm surges using the TAOS model (Watson, 1995). The general approach is illustrated with Doctor’s Point in Jamaica, and guidelines are given for applying it to other Caribbean locations. An Appendix discusses some data issues relevant to the computer production runs.

2. Classical Approach via Storm Selection

The classical approach taken to estimate the 25-year storm surge has been to identify a candidate extreme storm likely to appear once in 25 years and to compute the storm surge values associated with it. With a 25-year storm as the baseline, increasing the severity of this storm leads to a less likely but possible 50-year storm; similarly, reducing the severity of the 25-year baseline storm offers a candidate representative of a 10-year storm. Since changes in a particular storm’s characteristics (track, intensity, forward speed and so forth) can have considerable impact on storm surge, this storm-selection approach lends itself to considerable debate.

A storm that struck Montego Bay in 1912 provides a good example of the traditional approach to storm return time estimation. The 1912 storm had an "unusual" north-heading track. It was the sixth and final storm of the season, making landfall on Jamaica on November 18, 1912. This storm, a category 4 on the Saffir-Simpson scale, is the strongest to strike Jamaica in the 1871–1997 recorded history. A more typical track for the 1912 storm would have had a more westerly heading, and a storm on such a track would have generated a larger storm surge than that produced from the historical track (based on estimates from the TAOS model, with current coastal conditions). Adjustment of the storm track to create a more typical storm, however, further undermines the already tenuous probabilistic basis for selecting this storm for return time analysis.

Intuitively reasonable arguments may support the selection of a specific storm as the "25-year" storm, but it is unclear how to assess the uncertainty in the storm surge associated with that storm. A further difficulty emerges if the 25-year time frame is replaced by a 50-year or 100-year time frame: how must the extreme 25-year storm be intensified to correspond to the longer time frame? Despite these difficulties, a storm-specific approach is a reasonable start at identifying the issues and forcing the development of explicit, formal definitions of the storm surge distribution. Such definitions are absolutely critical to make full and intelligent use of the historical storm set in conjunction with a model that can estimate storm surges for a given location.

3. Return Period — Definitions and Issues

We start with the notion of a "25-year return period," recognizing that the descriptions and definitions should be adaptable to other time horizons. Imagine that at a given site, 2.54 meters is assumed to be the 25-year return period storm surge height. The following represent potential interpretations of 2.54 m and hence, suggest potential definitions.

  1. The probability is 0.04 (i.e., 1/25) that the maximum storm surge height next year will exceed 2.54 m.
  2. In the long run, we expect the 25-year maximum storm surge average to converge to 2.54 m.
  3. We expect to see in the next 25 years at least one storm that produces at least a 2.54-meter storm surge.
  4. The most likely maximum storm surge height in the next 25 years is 2.54 meters.
  5. The interval of time between successive observances of 2.54 m or higher is 25 years on average.
  6. It is extremely unlikely that any storm surge height will exceed 2.54 meters in the next 25 years.

The six descriptions given above constitute plausible "layman" definitions of a 25-year storm surge return period. However, many of them are fairly loose verbal descriptions and, moreover, are not necessarily compatible or consistent with one another. To make progress in estimating storm surge return periods, it is essential to identify a single, tight definition that captures the underlying notion (the critical criterion) and precludes the confusion that a vaguer definition would invite (a useful side benefit). Definition 1 will be used from here on as the basic definition of the 25-year return period for storm surge. Besides sharing the spirit of the other definitions, definition 1 has great practical utility, as will be seen in the formal statistical estimation procedure developed below.

To set the stage for the proposed approach, suppose that, looking at one location prospectively, we would eventually (that is, after 25 years) obtain realizations of the following random variables:

Ni   number of tropical cyclones in year i, where i = 1, 2, ..., 25

Xij   storm surge height for storm j within year i, where j = 1, 2, ..., Ni

Yi   maximum storm surge height in year i

Y   maximum storm surge height over the next twenty-five years.

Based on these definitions, we have

Yi = maximum(Xi1, Xi2, ..., XiNi)

Y = maximum(Y1, Y2, ..., Y25)

Initially, it might appear that the random variable Y is of sole interest for the 25-year return period. Knowledge of Y, such as its future realization (not available, of course) or its distribution (estimated from past information), is precisely what would be required for planning purposes, assuming a twenty-five year time horizon. Conceptually, this framework directs our efforts toward estimating the probability distribution of Y. Natural estimates could be envisaged that look at 25-year moving windows through the historical data set, capturing maxima for statistical analysis. However, the short record length (111 years) and the dependence among overlapping 25-year windows preclude this approach. Fortunately, under some mild assumptions the distribution of the annual maximum Yi can serve as the key distribution, since it determines the distribution of Y through a simple functional relationship. Notice that the choice of focusing on the distribution of Yi versus Y is driven by the definition of return period. It is useful to re-visit the earlier six definitions/interpretations and couch them in the above probabilistic framework.

Defn. 1. The probability is 0.04 (i.e., 1/25) that the maximum storm surge height next year will exceed 2.54 m.

Implies: The 96th percentile of the distribution of Yi is 2.54 m.

Defn. 2. In the long run, we expect the 25-year maximum storm surge average to converge to 2.54 m.

Implies: The expected value or mean of Y, denoted E(Y), is 2.54 m.

Defn. 3. We expect to see in the next 25 years at least one storm that produces at least a 2.54-meter storm surge.

Implies: The probability that Y exceeds 2.54 m (the right-hand tail of its distribution) is a number close to 1, but how close is not apparent from the definition.

Defn. 4. The most likely maximum storm surge height in the next 25 years is 2.54 meters.

Implies: The mode of the distribution of Y is at 2.54 m.

Defn. 5. The interval of time between successive observances of 2.54 m or higher is 25 years on average.

Implies: The mean of an "inter-arrival time" distribution is 25 years, where this distribution is determined from a discrete-time stochastic process whose outcomes depend on the value 2.54 m. This definition is awkward at best to understand, let alone apply.

Defn. 6. It is extremely unlikely that any storm surge height will exceed 2.54 meters in the next 25 years.

Implies: The upper tail part of the distribution of Y (that is, P[Y > 2.54] ) is a number close to zero, but how close is not specified. This definition corresponds to the complement of definition 3.

In each of the above six cases, the basic mathematical formulation yields an explicit and precise statistical definition whenever the verbal interpretation of the 25-year return period of storm surge is itself sufficiently precise. Virtually any interpretation should be amenable to this framework, a feature that is critical for resolving misunderstandings. The framework (at the expense of belaboring the point for now) also suggests other possibilities for 25-year return period interpretations while highlighting other issues.

A natural alternative to the mean or expected value of Y is the median of Y. Suppose we knew the median of Y — would a land use planner base decisions on this characteristic of the distribution of maximum storm surge height? From a gambling perspective, the interpretation of the median value is that roughly half the time the storm surge will exceed the median and half the time the observed storm surge will come up short of this figure. If one were to pick a single value, then the median might be a candidate. However, a "conservative" planner might prefer a larger percentile from the distribution of Y (such as the upper 90th percentile). Many possibilities could be considered with the statistical calculations driven by the distribution of annual maxima (Yi) or the distribution of the maximum over an extended number of years (Y).

A critical assumption to address is the plausibility that Y1, Y2, ..., Y25 can be considered independent and identically distributed (i.i.d.). Year-to-year stochastic independence of the maxima seems more plausible than within-year storm-to-storm independence, which in turn is more plausible than independence of day-to-day storm surges. This issue arises naturally here since, if the Yi values are independent and identically distributed, then the distribution function of Yi, say Fi, is related to the distribution function of Y, say F, via:

F(y) = [Fi(y)]^25

In other words, we can infer the distribution of Y from the distribution of Yi and vice versa. This correspondence is useful to know if the return period definition is driven by Y rather than Yi. It turns out that Yi having distribution function Fi is the ideal construct for handling the estimation of return period storm surges. The details will emerge in the description of the proposed approach given next.
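To make the correspondence concrete, the following is a minimal sketch (in Python, which is not the software used in the report) of converting an annual exceedance probability into an exceedance probability over a 25-year horizon under the i.i.d. assumption; the function name is hypothetical.

# Sketch: under the i.i.d. assumption, F(y) = [Fi(y)]^25, so an annual
# exceedance probability p implies a 25-year exceedance probability 1 - (1 - p)^25.
def prob_exceed_over_horizon(annual_exceed_prob, years=25):
    """P(Y > y) over the horizon, given the annual exceedance probability P(Yi > y)."""
    return 1.0 - (1.0 - annual_exceed_prob) ** years

print(round(prob_exceed_over_horizon(0.04), 3))   # about 0.64 for the 25-year surge of definition 1

This also illustrates that definitions 1 and 3 are not interchangeable: under definition 1, the chance of seeing at least one exceedance of the 25-year surge within 25 years is roughly 0.64, not a number especially close to 1.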

4. Proposed Approach

The goal of this study is to use the historical storm set data (HURDAT) in conjunction with the TAOS model to predict the storm surge height distribution for given time frames. HURDAT is a database of historical tropical storms in the Atlantic Ocean developed by the US National Hurricane Center. TAOS is a computer-based numerical model that produces estimates of maximum sustained winds at the surface, still water surge height and wave height at the coastline for any coastal area in the Caribbean basin. We start with a single site (Doctor’s Point in Montego Bay) and a 25-year time period, with the understanding that multiple sites on a grid will eventually be considered. Other return periods can be handled by a simple adjustment of the methodology below.

We first address the problem of estimating the storm surge height for which the probability of exceedance in the next year is 0.04 (from definition 1, this is equivalent to estimating the 96th percentile of the Yi distribution). We proceed as follows:

  1. Extract relevant historical storm information from HURDAT covering the period 1886-1996 (a total of 951 storms over 111 years).
  2. Run the TAOS model for each storm in the full data set and record the observed storm surge at the selected site, say xij, where the subscript i refers to the year and the subscript j refers to the number of the storm within year i.
  3. For the set of storms within each year, compute the observed maximum, say yi. (There is no season in the historical record with no tropical cyclones, but if one were to occur, the maximum tidal value could be employed. A storm surge value of zero could also be used, but this would place positive probability mass on a single point, which causes complications.)
  4. The set of values {y1, y2, ..., yn} offers realizations from the Yi distribution. Use these values to fit an appropriate distribution. Candidate distributions (continuous with positive support) that can be readily fit using the package ExpertFit (Law and Vincent, 1997) include: exponential, extreme value, gamma, lognormal, inverse Gaussian, and Weibull. This package produces maximum likelihood estimates (MLEs) and has the advantage of providing numerous auxiliary diagnostic calculations.
  5. Find a value, say s, from the fitted distribution function F satisfying F(s) = 0.96. This is the estimated storm surge for the 25-year epoch.
  6. Characterize the uncertainty associated with the value s. In particular, compare the fitted distribution to the empirical distribution function. Secondly, determine the impact of parameter estimation relative to uncertainty in storm surge estimates.

The details associated with the above scheme are illustrated in the example using Doctor’s Point in Montego Bay. For the 10- and 50-year return periods, the only adjustment is in step 5, where the two equations are F(s) = 0.90 and F(s) = 0.98, respectively.
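A minimal sketch of steps 2 through 5 follows, written in Python with scipy (tools not used in the report, where ExpertFit performed the fitting); the function return_period_surge and the layout of storm_surges as (year, surge) pairs are illustrative assumptions, and a 2-parameter Weibull is used for concreteness.

from collections import defaultdict
from scipy.stats import weibull_min

def return_period_surge(storm_surges, p_exceed=0.04):
    """Steps 3-5: form annual maxima from per-storm surge values, fit a
    2-parameter Weibull by maximum likelihood, and return the surge whose
    annual exceedance probability is p_exceed (0.04 for the 25-year value)."""
    by_year = defaultdict(list)
    for year, surge in storm_surges:          # storm_surges: (year, surge in m) per TAOS run
        by_year[year].append(surge)
    annual_max = [max(values) for values in by_year.values()]
    shape, loc, scale = weibull_min.fit(annual_max, floc=0.0)   # threshold fixed at zero
    return weibull_min.ppf(1.0 - p_exceed, shape, loc=loc, scale=scale)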

5. Assumptions Associated with the Proposed Approach

For completeness, it is appropriate to state explicitly the assumptions underlying computations producing the 25-year storm surge height s. (These same assumptions apply for other time frames.) In particular:

  1. We are assuming that future storms resemble past storms. Since there are many of them (951), we are not dealing with a "small" data set. On the other hand, we are not generating storm surges with the TAOS model for a hypothetical category 5 storm making a direct hit on Montego Bay. Indirectly, we obtain such storm surges via the upper tail of the distribution of Yi.
  2. Since the analysis weighs the various years’ data equally (the same weight is given to the contributing values from pre-1900 as is given to recent storms), it is natural to ask about the impact of the increasing quality of meteorological data over time. A sensitivity analysis is performed to gauge this effect (Section 6.4).
  3. We are effectively assuming that the set of values {y1, y2, ..., yn} constitutes a random sample from the storm surge distribution Yi. Thus, we are assuming that the year-to-year maxima are independent and identically distributed. The distribution of annual maxima is constant, but, of course, the realizations from this distribution may vary from year to year. Equivalently, we are assuming that there are no meteorological trends in storm generation. It is reassuring to note that low order autocorrelations were computed for year-to-year maxima in this study and found to be insignificant.
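As a minimal sketch of the autocorrelation check mentioned in item 3, the following Python fragment computes low-order sample autocorrelations of the annual maxima; only the first eleven values from Table 1 are shown, and the full 111-value series would be used in practice.

import numpy as np

def lag_autocorr(series, lag):
    """Sample autocorrelation of a series at the given lag."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    return float(np.dot(x[:-lag], x[lag:]) / np.dot(x, x))

# Annual maxima (m) for 1886-1896 from Table 1; extend to all 111 years in practice.
annual_max = [0.953, 0.299, 0.045, 0.435, 0.020, 0.195, 0.053, 0.369, 0.195, 0.424, 1.314]
for lag in (1, 2, 3):
    # under independence, values should lie roughly within +/- 2/sqrt(n) of zero
    print(lag, round(lag_autocorr(annual_max, lag), 3))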

6. Case Study — Doctor’s Point, Jamaica

An extensive analysis of the 10-, 25- and 50-year return periods for Doctor’s Point in the Montego Bay area of Jamaica has been conducted. Experience with this prototype site suggests that the approach should be adaptable to other sites and other time frames. Of course, consideration of additional sites would provide further confidence in this assertion while revealing unanticipated difficulties and streamlining the calculations.

There are many components that make up storm surge, including pressure setup, wind setup, wave setup, wave runup and astronomical tides. For the Doctor’s Point location used to illustrate the statistical methodology, the storm surge height consists of the wave, wind and pressure setups and the astronomical tidal component. Since the location used by the model to measure surge at Doctor’s Point is offshore, the storm surge values do not include a wave runup component.

6.1. Data set description.

The relevant historical storm set consists of 951 North Atlantic tropical cyclones for the period 1886-1996. C. Watson (personal communication, 1997) prepared a production run of the TAOS model that generated the maximum storm surge at Doctor’s Point for each of these storms, simulated on the current coastal topography and bathymetry of Montego Bay. The statistical analysis needs the maximum storm surge generated in each year by the season’s storms. Table 1 summarizes the results of the TAOS model calculations. The 111 annual maximum storm surges are the essential data elements for fitting the model, as described in the next section. (Note: for many of the storms, the local storm surge is attributable entirely to wave set-up: typically storms that remained in the Atlantic, often at hurricane strength, with Jamaica shielded from them by other land masses. Forty-nine of the 111 storms responsible for annual maxima contribute wave setup alone; the other sixty-two include additional storm surge components, such as pressure and wind setups.)

6.2. Fit to the annual maxima.

The data set of 111 maxima was subjected to fits by a number of different probability distributions, using the package ExpertFit (Law and Vincent, 1997). Using the method of maximum likelihood, estimates of the parameters of a variety of distributions were obtained (those listed in Section 4). Among the distributions considered, ExpertFit suggested that the Weibull distribution provided the most reasonable representation of the data. The Weibull density function is given by:

f(y) = (a/ß) [(y - x)/ß]^(a-1) exp{ -[(y - x)/ß]^a },   y > x,

where x is the location (threshold) parameter, ß is the scale parameter, and a is the shape parameter. If x is set to zero (i.e., it is not estimated), we have the two-parameter Weibull. If estimated, ExpertFit uses the minimum data value in the sample as its estimate. Although this threshold parameter could also be estimated by maximum likelihood, there are complex consequences for automatic estimation and for the resulting asymptotic variances. Since our interest is in the right-hand tail of the distribution (i.e., extreme high storm surges), it seems prudent for now to proceed with this simpler scheme. We proceed with the 2- and 3-parameter Weibull distributions as the primary models for Doctor’s Point. The maximum likelihood estimates of the shape and scale parameters for the two Weibull distributions are:

                               3-parameter    2-parameter
shape parameter estimate (a)      0.60034        0.69699
scale parameter estimate (ß)      0.24624        0.29176

A graphical view of the quality of the fit for the 3-parameter Weibull is given in Figure 1 — the 2-parameter Weibull fit is similar in appearance. It appears that the Weibull distribution represents a plausible "smoothed" representation of the raw data, as depicted by the histogram. Chi-square and other goodness-of-fit tests also support this contention. All the statistical results described here are produced readily by the ExpertFit package. To write computer code to produce the equivalent material would be an onerous task.
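For readers without access to ExpertFit, the following sketch fits the same two Weibull variants by maximum likelihood using scipy (an assumption of this sketch, not the package used in the report); the tiny offset below the sample minimum is a practical device to keep the shifted data strictly positive, and the resulting estimates should be close to, but not necessarily identical with, those reported above.

import numpy as np
from scipy.stats import weibull_min

def fit_weibulls(annual_max):
    """Fit the 2-parameter (threshold = 0) and 3-parameter (threshold near the
    sample minimum, mirroring ExpertFit's convention) Weibull models by MLE."""
    y = np.asarray(annual_max, dtype=float)
    a2, loc2, b2 = weibull_min.fit(y, floc=0.0)
    a3, loc3, b3 = weibull_min.fit(y, floc=y.min() - 1e-6)   # small offset avoids a zero shifted value
    return {"2-parameter": {"shape": a2, "scale": b2, "location": loc2},
            "3-parameter": {"shape": a3, "scale": b3, "location": loc3}}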

Given the fitted distributions, estimates of maximum surge for selected return periods are particularly easy to extract for the Weibull distribution, since its distribution function has a simple closed-form inverse. (Less convenient distributions can be handled with standard numerical schemes, although the calculations can be tedious.) In particular, the surge with annual exceedance probability p (that is, the 100(1 - p)th percentile of the annual-maximum distribution) is obtained from the expression:

Xp = x + ß[-ln p]^(1/a)
From this representation, we have the following estimates of storm surges:

 

Return Period     MLE fit, 2-parameter Weibull     MLE fit, 3-parameter Weibull     Empirical distribution function
10 year                    0.965 m                          0.999 m                          1.203 m
25 year                    1.561 m                          1.737 m                          1.576 m
50 year                    2.065 m                          2.400 m                          2.215 m
100 year                   2.610 m                          3.146 m                          2.840 m

The figures provided are given with excessive precision to facilitate arithmetic verification. The uncertainty associated with these estimates is addressed in the next section. The values for the Weibull distributions are obtained from the expression given for Xp and the parameter estimates, where p denotes the annual exceedance probability: the 10-year period corresponds to p = 0.10, the 25-year period to p = 0.04, the 50-year period to p = 0.02, and the 100-year period to p = 0.01.
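As a check on the arithmetic, the closed-form percentile expression can be evaluated directly; the short Python sketch below (the function name is illustrative) reproduces the 3-parameter column from the maximum likelihood estimates reported above.

import math

def weibull_surge(p_exceed, shape, scale, loc=0.0):
    """Surge with annual exceedance probability p_exceed under a fitted Weibull:
    Xp = loc + scale * (-ln p)^(1/shape)."""
    return loc + scale * (-math.log(p_exceed)) ** (1.0 / shape)

for years, p in [(10, 0.10), (25, 0.04), (50, 0.02), (100, 0.01)]:
    surge = weibull_surge(p, shape=0.60034, scale=0.24624, loc=0.0114)   # 3-parameter estimates
    print(years, round(surge, 3))   # approximately 0.999, 1.737, 2.400, 3.146 m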

For comparison, the sample percentiles of the raw 111 data values are also provided above. The raw values are not smoothed, as is done in fitting the Weibull or other distributions. A pure percentile approach provides estimates of return periods (by inverting the sample distribution function) but has limitations in handling uncertainty. Difficulties arise in consideration of the right hand tail of the empirical distribution function, which is primarily driven by the overall maximum of 2.92 m, corresponding to the devastating 1912 storm.

6.3. Uncertainty in Estimates.

The point estimate of the 96th percentile of the annual maximum storm surge distribution represents a best estimate of the maximum 25-year storm surge outcome according to definition 1. The actual outcome is unlikely to be precisely this value — it could be somewhat higher or lower. Hence, it is prudent to base decisions not on the best estimate alone but on a conservative interval that is likely to contain the future value with high probability. This interval would naturally contain values somewhat larger than the best estimated value. This section outlines a simple, simulation-based approach for determining an appropriate interval. This approach is reminiscent of a Monte Carlo testing procedure (Jöckel, 1986), although here we focus on estimation instead of hypothesis testing. Alternative methods for estimating uncertainty will be described in section 8, but further research is needed to determine if these are superior to the approach given here (those listed are certainly much more complicated and computer intensive). We illustrate the proposed approach by continuing the Doctor’s Point example.

In the course of determining maximum likelihood estimates, ExpertFit computes second derivatives of the log-likelihood function for the Weibull distribution (as well as comparable quantities for the other distributions). At convergence of the estimation procedure, these values are the basis of an asymptotic covariance matrix of the parameter estimates (see, for example, Cox and Hinkley, 1974). For sample sizes on the order of 111, it is reasonable to assume further that asymptotic normality of these estimates holds approximately. This suggests that, since the parameter estimates incorporate the uncertainty, we take advantage of the situation by generating realizations from suitably chosen normal variates to serve as the Weibull parameters and then computing the corresponding 96th percentile of the Weibull. The approach just sketched is a tremendous shortcut over generating a full sample from the fitted Weibull distribution, estimating its parameters via maximum likelihood and computing the associated percentile. We can readily generate a thousand realizations, sort them and obtain percentiles of the simulated percentiles, say the 90th, 95th and 99th, reflecting increasing caution in predicting extreme storm surges above the best estimate. Again, we illustrate the situation with the Doctor’s Point location. The asymptotic covariance matrix of the scale and shape estimates produced by ExpertFit (3-parameter fit) is

    0.00168038     0.000570181
    0.000570181    0.00197392

The non-zero off-diagonal value (i.e., the covariance 0.000570181) implies some dependence between the components. The corresponding correlation (which is easier to interpret) is

0.000570181 / (0.040992 x 0.044429), or approximately 0.313.

Although this (asymptotic) correlation is small, it is not negligible, and it is easy to incorporate this information into the uncertainty assessment. For later reference, the asymptotic standard deviations are given by

(0.00168038)^1/2 = 0.040992 (scale) and (0.00197392)^1/2 = 0.044429 (shape) for the 3-parameter Weibull;
(0.00175010)^1/2 = 0.041834 (scale) and (0.00266060)^1/2 = 0.051581 (shape) for the 2-parameter Weibull.
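The standard deviations and the correlation quoted above follow directly from the covariance entries; a small arithmetic sketch (Python, used here only as a calculator):

import math

# Asymptotic variances (scale, shape) and covariance for the 3-parameter fit.
var_scale, var_shape, cov = 0.00168038, 0.00197392, 0.000570181

sd_scale = math.sqrt(var_scale)        # 0.040992
sd_shape = math.sqrt(var_shape)        # 0.044429
corr = cov / (sd_scale * sd_shape)     # about 0.313, the r used in the algorithm below
print(round(sd_scale, 6), round(sd_shape, 6), round(corr, 3))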

To simulate parameter estimates, it suffices to generate independent normal variates and transform them to have the appropriate means, variances and dependence structure. The basic algorithm is as follows:

  1. Generate two independent standard normal variates (mean zero, variance one), say Z1 and Z2.
  2. Introduce dependence (where the correlation is denoted r), as follows:
     X1 = Z1
     X2 = r Z1 + (1 - r^2)^1/2 Z2.
     (For Doctor’s Point, r = 0.313.)
  3. Apply the linear transformation
     ß = s1 X1 + µ1 (0.040992 X1 + 0.24624 for the 3-parameter)
     a = s2 X2 + µ2 (0.044429 X2 + 0.60034 for the 3-parameter)
  4. Compute the 96th percentile using
     X0.96 = x + ß[-ln(0.04)]^(1/a),
     where ß and a are from step 3 and x is either 0.0114 (3-parameter) or 0.0 (2-parameter). (For Doctor’s Point and a 96th percentile, -ln(0.04) is approximately 3.2189, so X0.96 = x + ß(3.2189)^(1/a).)

The above algorithm can be repeated a large number of times (1000 is adequate) to produce multiple simulated percentiles from the storm surge distribution. These simulated values are then sorted and the desired sample percentile is extracted to provide an upper prediction limit on the 25-year storm surge. The 900th ordered value among the 1000 generated provides an upper 90% limit, the 950th provides an upper 95% limit, and so forth. Following this approach, we obtain the results given below (including 10- and 50-year return periods):
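The report's simulation was carried out in Microsoft Excel; the following Python/numpy sketch of the same algorithm (steps 1 through 4 above, with the 3-parameter Doctor’s Point values and an arbitrary random seed) should reproduce the tabulated limits approximately, not exactly, because of simulation noise.

import numpy as np

def simulate_surge_limits(p_exceed, n_sims=1000, seed=1):
    """Monte Carlo prediction limits for the surge with annual exceedance
    probability p_exceed, using simulated Weibull parameter estimates."""
    rng = np.random.default_rng(seed)
    mu_scale, mu_shape = 0.24624, 0.60034      # MLEs (ß, a), 3-parameter fit
    sd_scale, sd_shape = 0.040992, 0.044429    # asymptotic standard deviations
    r, loc = 0.313, 0.0114                     # correlation and fixed threshold

    z1 = rng.standard_normal(n_sims)
    z2 = rng.standard_normal(n_sims)
    scale = mu_scale + sd_scale * z1
    shape = mu_shape + sd_shape * (r * z1 + np.sqrt(1.0 - r**2) * z2)   # induce correlation r
    surges = loc + scale * (-np.log(p_exceed)) ** (1.0 / shape)
    return {q: round(float(np.percentile(surges, q)), 3) for q in (90, 95, 99)}

print(simulate_surge_limits(0.04))   # 25-year limits; compare with the table below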

 

             3-parameter Weibull distribution              2-parameter Weibull distribution
             10-yr     25-yr     50-yr     100-yr          10-yr     25-yr     50-yr     100-yr
MLE          0.999 m   1.737 m   2.400 m   3.146 m         0.965 m   1.561 m   2.065 m   2.610 m
90% limit    1.217 m   2.179 m   3.078 m   4.097 m         1.149 m   1.901 m   2.564 m   3.282 m
95% limit    1.286 m   2.367 m   3.383 m   4.591 m         1.202 m   2.042 m   2.794 m   3.622 m
99% limit    1.459 m   2.678 m   3.838 m   5.224 m         1.346 m   2.278 m   3.112 m   4.066 m

Obviously, the greater the confidence level sought, the larger the limit relative to the "best guess" given by the maximum likelihood estimate (MLE). These calculations have obvious value from an insurance industry perspective, as they capture the risks associated with catastrophic events. As another application, structural engineers could use these limits to determine safety factors in the design of structures to withstand static and dynamic water forces. Finally, land use planners could use this information to promote coastal land use that is consistent with site-specific hazard risk. All simulation calculations were performed using Microsoft Excel and are straightforward to produce. Most of the distributions available in ExpertFit would be amenable to this approach.

Some comments are warranted with respect to the choice among the 90%, 95% and 99% confidence levels. In general, the higher the confidence level chosen for planning purposes, the more risk averse the planner. Obviously, 99% confidence gives larger storm surge values than 95% confidence, which in turn exceeds those corresponding to 90%. Confidence levels are conceptually challenging to explain, as they involve the long-term performance of a statistical procedure imagined to be repeated on comparable hypothetical data sets. Using the 90% figure, we expect that about 90% of the time, limits derived as in this analysis would include the true value of the storm surge. The 90% figure attempts to capture the uncertainty in predictions due to estimating the Weibull parameters from the 111 years of data. A less precise (and hence, strictly speaking, statistically not-quite-correct) interpretation is that we are 90% confident that the realized storm surge will not exceed the 90% value reported in the table. Of course, the realization of the storm surge will either exceed or not exceed this value. In this study, the 99% limit for the 100-year storm surge turned out to be 4.066 meters (2-parameter Weibull). Such a value is physically plausible given a direct hit by a category 4 storm. Since one such storm struck this area in the past 100 years (albeit on a somewhat unexpected track), the upper 99% confidence limit on the statistically predicted value is at least physically within reason. Validating the return time estimates produced by this analysis against the historical record of storm surge levels (obtained from tidal gauges or indirect means) is complicated by the changing nature of the coastal topography and bathymetry.

There are some obvious (and potentially large) differences between the 3-parameter and 2-parameter Weibull fits. It may seem surprising that estimation of a threshold parameter can have such an impact. For shape parameters less than 1, the Weibull density function tends to infinity at its lower limit, which disrupts estimation (recall that ExpertFit sets the threshold to the minimum value in the sample, presumably to avoid this difficulty). The 2-parameter Weibull matches the empirical percentiles better than the 3-parameter and is more amenable to interpretation in its left tail (near zero, corresponding to years with few or distant storms). Pending the consideration of additional sites, the 2-parameter Weibull is recommended for the time being.

6.4. Sensitivity Analyses

A variety of additional analyses were conducted to explore the robustness of the methodology proposed here for estimation of storm surge return period. Additionally, further work was conducted to assess data quality issues.

Data subsets. Perhaps the most obvious question is: what is the impact of the most extreme event in the data set, the 1912 storm? In particular, how sensitive are the distributional fits and the corresponding estimated maximum storm surges for various return periods to the result from the 1912 storm? To answer this query, we recompute the estimated return periods with the 1912 storm omitted. In the same spirit of deleting storms, it is also reasonable to consider only the more recent storms, since the associated meteorological data are less subject to debate. The results (using the three-parameter Weibull) are summarized below; the results are similar using the two-parameter Weibull.

Case:                       Full data    All but 1912    1901-1996    1950-1996
Estimated x (location)      0.0114       0.0114          0.0114       0.0142
Estimated ß (scale)         0.2462       0.2364          0.2384       0.1338
Estimated a (shape)         0.6003       0.6104          0.5822       0.5400
est. 10-year storm surge    1.00 m       0.94 m          1.01 m       0.64 m
est. 25-year storm surge    1.74 m       1.62 m          1.79 m       1.18 m
est. 50-year storm surge    2.40 m       2.22 m          2.49 m       1.69 m

The removal of the 1912 storm has a minimal effect on the full data set results. On the other hand, consideration of merely the past 47 years’ worth of data (1950-1996) reveals a relative lull in storm surge extremes.

Alternative distributions. The analysis so far has a definite Weibull slant (the best fits in all cases so far have identified the Weibull distribution as the "winner"). This should not be too surprising: if the full data set favors the Weibull distribution, we might expect this predilection to continue with various reduced subsets. Nevertheless, it is of interest to examine, at least briefly, the differences possible with the second-best fitting distribution. In the case of the full data set, the runner-up was the log-normal distribution. Comparison of parameters between the Weibull and other distributions is not meaningful, but the estimated surges for various return periods can be compared:

Estimated storm surge maximum

                  Weib. I         Weib. II        Log-Norm. I     Log-Norm. II    Empirical
                  (3-parameter)   (2-parameter)   (3-parameter)   (2-parameter)   cdf
10-year storm     1.00 m          0.97 m          1.36 m          0.98 m          1.20 m
25-year storm     1.74 m          1.56 m          3.57 m          2.00 m          1.58 m
50-year storm     2.40 m          2.06 m          6.68 m          3.19 m          2.22 m
100-year storm    3.15 m          2.61 m          11.74 m         4.84 m          2.84 m

The results deduced from ExpertFit for the 3-parameter log-normal are suspicious. The 2-parameter results have been verified independently (after log-transforming the data, the maximum likelihood estimates are simply the mean and variance of the resulting normal sample). Even the values for the 2-parameter log-normal place an abundance of probability mass in the right-hand tail, beyond that suggested by the percentile approach (i.e., the "Empirical cdf" column). Arguably, its 100-year storm surge value is excessive (well above the maximum in the data set). Pending the results from other sites, the 2-parameter Weibull is recommended.
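The independent verification mentioned above can be sketched as follows in Python/scipy (not the tools used in the report); the function name is illustrative, and the resulting percentiles should approximate the 2-parameter log-normal column above.

import numpy as np
from scipy.stats import norm

def lognormal_surge(annual_max, p_exceed):
    """2-parameter log-normal check: fit a normal to the logged annual maxima
    (the MLEs are the sample mean and standard deviation of log y) and
    exponentiate the corresponding normal percentile."""
    logs = np.log(np.asarray(annual_max, dtype=float))
    mu, sigma = logs.mean(), logs.std()              # MLEs under the log-normal model
    return float(np.exp(norm.ppf(1.0 - p_exceed, loc=mu, scale=sigma)))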

7. General Considerations for Other Sites

The methodology described in this report should be applicable to any Caribbean site. The surge model (TAOS) must be run for the set of 951 storms. Extracting the annual maximum storm surge from the production runs is a simple task. Prior to fitting distributions, it is useful to perform quality control checks on the data to be analyzed:

  1. Are there any unusual features in the results? Before perfecting the production runs, we discovered occasional repeats of storm surge values for adjacent storms. This minor problem was then corrected prior to performing the distribution fits.
  2. Are the storms that produce the maxima the direct hits or bypassing storms? Are very low storm surge values associated with mid-Atlantic tropical storms?

Once we are confident that the data to be fit is acceptable, the distribution fits can proceed. In a sense we were fortunate that the Weibull distribution was recommended by ExpertFit and confirmed from the graphical representation. Percentile and variate generation for the Weibull distribution is simple, because closed-form expressions exist. Less friendly distributions require more numerical work, but references are available to assist in the task (Johnson, 1987; Law and Kelton, 1991; Devroye, 1986). Note that an expression that produces percentiles can also be used to generate random variates by supplying a uniform 0-1 variate for p.
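For instance, a minimal sketch of this inverse-transform idea for the Weibull (Python; the function name is illustrative):

import numpy as np

def weibull_variates(n, shape, scale, loc=0.0, seed=None):
    """Generate Weibull variates by feeding uniform(0,1) values into the
    closed-form percentile expression x = loc + scale * (-ln(1 - u))^(1/shape)."""
    rng = np.random.default_rng(seed)
    u = rng.uniform(size=n)
    return loc + scale * (-np.log(1.0 - u)) ** (1.0 / shape)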

Statistical notes: ExpertFit provides asymptotic variances and covariances for each fitted distribution, so the general approach of obtaining prediction intervals from simulated parameters should carry over. The rate of convergence to asymptotic normality for moderate sample sizes needs some attention in conducting the analysis, and caution is recommended in dealing with non-zero threshold parameters.

8. Future Efforts

Consideration of other sites throughout the Caribbean would be desirable to provide further confidence in the methodology. Although we do not anticipate fundamental problems, the added experience could help to refine the analyses.

Aside from widely dispersed sites, it would also be desirable to consider a grid of sites in a smaller locale. Potential benefits of neighboring sites should be explored. At the very least, the consideration of neighboring sites provides further opportunities for sanity checks. For example, we would expect that annual maximum surges are likely to arise from the same extreme storms.

The results presented in this report are fundamentally conditional on the correctness of the TAOS model. Since there has been considerable work by its developer in assessing the match between modeled output and real data (from tidal gauges, for example), we are confident that the numbers subjected to statistical analysis are a reasonable reflection of what should be expected from the historical storm set and the current Montego Bay area bathymetry and topography. No anomalies in TAOS were uncovered in producing the results of this report.

The approach for prediction intervals for 10-, 25- and 50-year return periods is simulation-based, relying upon the approximate normality of the parameter estimates, which are then used to generate percentiles of the annual-maximum storm surge distribution. As noted earlier, this scheme represents a tremendous saving in computing. Alternatively, one could generate full samples of size 111 from the fitted distributions, fit each generated sample by maximum likelihood and then summarize the results. It would be interesting from a statistical perspective to compare these approaches. The advantage of the approach advocated here is that peculiar samples leading to difficulties in maximum likelihood estimation are avoided: maximum likelihood can occasionally struggle to converge, which would complicate the analysis precisely for generated samples likely to produce very extreme surges. It was for this pragmatic reason that the shortcut approach was followed here.
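For comparison, a sketch of the full-sample alternative described above (Python/scipy, which are assumptions of this sketch; the 2-parameter Weibull is used, and any non-converging refits would need explicit handling):

import numpy as np
from scipy.stats import weibull_min

def full_refit_limits(shape, scale, n_years=111, n_sims=200, p_exceed=0.04, seed=1):
    """Simulate full 111-year samples from the fitted 2-parameter Weibull, refit
    each by maximum likelihood, and summarize the spread of the resulting
    return-period surge estimates."""
    rng = np.random.default_rng(seed)
    estimates = []
    for _ in range(n_sims):
        sample = weibull_min.rvs(shape, scale=scale, size=n_years, random_state=rng)
        a, loc, b = weibull_min.fit(sample, floc=0.0)
        estimates.append(float(weibull_min.ppf(1.0 - p_exceed, a, loc=loc, scale=b)))
    return np.percentile(estimates, [90, 95, 99])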

Another approach to obtaining prediction intervals involves the potential use of bootstrap techniques (see, for example, Efron and Tibshirani, 1993). The bootstrap approach applied here would involve sampling with replacement from the 111 storm surge values, and then fitting by MLE the parameters, much as was suggested in the previous paragraph. Although peculiar samples are not so likely here, there is considerable computation involved. Much of the bootstrap research has focused upon estimation of means, not extreme percentiles. An exception is a paper published recently by Hall and Weissman (1997), who consider estimation of extreme tail probabilities in the bootstrap context. Their approach may be worth considering here, although their "pull back" approach (considering a subset of the data) may strain the 111 sample size available.
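A corresponding bootstrap sketch (again Python/scipy as an assumption, resampling the 111 annual maxima with replacement and refitting the 2-parameter Weibull) might look as follows:

import numpy as np
from scipy.stats import weibull_min

def bootstrap_limits(annual_max, n_boot=1000, p_exceed=0.04, seed=1):
    """Nonparametric bootstrap: resample the annual maxima with replacement,
    refit the 2-parameter Weibull by MLE, and summarize the spread of the
    resulting return-period surge estimates."""
    rng = np.random.default_rng(seed)
    y = np.asarray(annual_max, dtype=float)
    estimates = []
    for _ in range(n_boot):
        resample = rng.choice(y, size=y.size, replace=True)
        a, loc, b = weibull_min.fit(resample, floc=0.0)
        estimates.append(float(weibull_min.ppf(1.0 - p_exceed, a, loc=loc, scale=b)))
    return np.percentile(estimates, [90, 95, 99])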

References

Cox, D.R. and Hinkley, D.V. (1974). Theoretical Statistics, London: Chapman and Hall.

Devroye, L. (1986). Non-uniform Random Variate Generation, New York: Springer-Verlag.

Hall, P. and Weissman, I. (1997). "On the Estimation of Extreme Tail Probabilities," The Annals of Statistics, 25, 1311-1326.

Jöckel, K.-H. (1986). "Finite Sample Properties and Asymptotic Efficiency of Monte Carlo Tests," Annals of Statistics, 14, 336-347.

Johnson, M.E. (1987). Multivariate Statistical Simulation, New York: John Wiley & Sons.

Law, A.M. and Kelton, W. D. (1991). Simulation Modeling and Analysis, New York: McGraw-Hill, Inc.

"Montego Bay 100-year Hurricane Coastal Flooding Hazard Assessment," The Caribbean Disaster Mitigation Project, OAS Report, August 1994.

"Storm Hazard Assessment for Montego Bay, Jamaica," The Caribbean Disaster Mitigation Project, OAS Report, September 1995.

Watson, C. C. "Introduction to the TAOS Storm Hazard Modeling System," Workshop on Using the TAOS Storm Hazard Model, May 8-9, 1997.

Watson, C. C. "The Arbiter of Storms: A High Resolution, GIS Based System for Integrated Storm Hazard Modeling," National Weather Digest, Volume 20, Number 2, pp. 2-9, December 1995.

Watson, C. C. Output from TAOS Model for Doctor’s Point, Jamaica, personal communication.

Appendix. Notes on Running the TAOS Model for the Full Data Set.

The emphasis in this report has been to outline the statistical methodology to produce forecasts of maximum storm surges for various epochs with corresponding uncertainty assessments. In effect, this methodology is predicated upon having available the annual maximum (over all storms in a given year) storm surge for each year of the 111-year time frame 1886–1996. This task is rather more onerous than might be suggested by the lack of emphasis in this report on development of the data necessary for the statistical analysis. The purpose of this Appendix is to record the steps and difficulties in obtaining the necessary data. Actually, the term "data" is used rather loosely, as each year’s value represents the maximum of potentially several computed values from TAOS runs (C. Watson, personal communication, 1997).

For a given site, such as Doctor’s Point and for each storm, we need: latitude, longitude, central pressure (CP), peak wind speed in the storm (Vmax), radius of maximum winds (Rmax) plus neighboring topography and bathymetry. For older storms, central pressure and radius of maximum winds are not included in the HURDAT data set. Hence, they were estimated with CP inferred from Vmax and then Rmax in turn inferred from CP, Vmax and storm track information.

There were two types of computer runs made. The wave propagation component of TAOS is relatively simple and quick to run, so it was applied to all 951 storms to produce a significant wave height. The maximum significant wave height over the storm history, multiplied by 0.02, yields a wave set-up value. Since many storms do not directly strike Jamaica, in these cases wave set-up is the only needed calculation. However, for storms that bypass or hit directly, the full capability of the TAOS model is needed to calculate the composite storm surge (not just the wave set-up). There were no years in which a "distant" storm produced a maximum storm surge value (due exclusively to wave set-up) that exceeded the storm surges due to nearby storms.

Acknowledgements.

Funding for this project was made possible through a contract from the Organization of American States (OAS). Support and professional assistance are gratefully acknowledged from Jan Vermeiren and Steven Stichter (both with OAS). We are grateful to Dr. Steve Lyons for the streamlined wave set-up model. This project was truly launched on May 7, 1997 in Rincon, Georgia, where Steven Stichter, Charles Watson (Watson Technical Consulting), and Mark Johnson (University of Central Florida) met to hammer out definitions of storm surge return period and brainstorm potential approaches. This report is an outgrowth of the initial formative meeting. The three of us intend to submit a suitably edited version of the report to a refereed journal.

Table 1. Annual Maximum Storm Surge Values (m) from Doctor's Point

year TPC# Max Set-up year TPC# Max Set-up year TPC# Max Set-up
1996 951 0.248 0.043 1956 558 0.032 0.032 1916 218 1.216 0.168
1995 936 0.037 0.037 1955 551 0.495 0.117 1915 211 1.571 0.160
1994 919 0.349 0.022 1954 541 0.054 0.054 1914 209 0.011 0.011
1993 909 0.018 0.018 1953 526 0.290 0.076 1913 205 0.037 0.037
1992 899 0.028 0.028 1952 518 0.059 0.059 1912 204 2.920 0.169
1991 891 0.021 0.021 1951 504 1.580 0.164 1911 198 0.303 0.067
1990 879 0.022 0.022 1950 499 2.082 0.162 1910 192 1.117 0.148
1989 872 0.022 0.022 1949 486 0.184 0.024 1909 185 0.702 0.155
1988 860 1.151 0.177 1948 473 0.189 0.136 1908 176 0.041 0.041
1987 852 0.027 0.027 1947 463 0.201 0.078 1907 169 0.025 0.025
1986 843 0.017 0.017 1946 456 0.046 0.046 1906 168 0.401 0.039
1985 839 0.050 0.050 1945 451 0.593 0.074 1905 156 0.303 0.061
1984 820 0.020 0.020 1944 433 2.257 0.157 1904 150 0.614 0.033
1983 813 0.014 0.014 1943 429 0.023 0.023 1903 140 1.410 0.176
1982 806 0.032 0.032 1942 413 0.518 0.104 1902 137 0.018 0.018
1981 794 0.309 0.035 1941 407 0.063 0.063 1901 129 0.274 0.043
1980 783 1.422 0.158 1940 400 0.027 0.027 1900 117 0.034 0.034
1979 776 0.234 0.012 1939 395 0.877 0.107 1899 116 0.053 0.053
1978 769 0.059 0.059 1938 385 1.321 0.136 1898 110 0.952 0.118
1977 761 0.022 0.022 1937 376 0.015 0.015 1897 98 0.032 0.032
1976 748 0.026 0.026 1936 372 0.023 0.023 1896 94 1.314 0.179
1975 741 0.211 0.027 1935 356 0.862 0.100 1895 86 0.424 0.162
1974 734 0.597 0.065 1934 350 0.595 0.070 1894 81 0.195 0.081
1973 725 0.149 0.044 1933 334 1.428 0.163 1893 78 0.369 0.068
1972 712 0.026 0.026 1932 318 0.625 0.091 1892 64 0.053 0.053
1971 703 0.064 0.064 1931 305 0.877 0.142 1891 53 0.195 0.038
1970 688 0.193 0.041 1930 299 0.030 0.030 1890 46 0.020 0.020
1969 675 0.064 0.064 1929 296 0.038 0.038 1889 42 0.435 0.177
1968 669 0.035 0.035 1928 291 0.418 0.081 1888 31 0.045 0.045
1967 655 0.326 0.107 1927 287 0.355 0.013 1887 13 0.299 0.035
1966 651 0.074 0.060 1926 280 0.048 0.048 1886 7 0.953 0.137
1965 639 0.035 0.035 1925 270 0.024 0.024
1964 629 0.364 0.122 1924 268 0.410 0.071
1963 622 0.275 0.161 1923 257 0.140 0.026
1962 615 0.023 0.023 1922 253 0.034 0.034
1961 607 0.329 0.007 1921 249 0.045 0.045
1960 597 0.055 0.055 1920 241 0.018 0.018
1959 589 0.032 0.032 1919 238 0.039 0.039
1958 578 0.318 0.019 1918 232 0.246 0.115
1957 565 0.016 0.016 1917 231 1.129 0.165