CHARACTERIZING UNCERTAINTY IN DIGITAL ELEVATION MODELS
Ashton Shortridge
Department of Geography, Michigan State University, East Lansing, MI, 93117, USA

Reference: Shortridge, A. M., 2001. Characterizing uncertainty in digital elevation models. Chapter 11 in Spatial uncertainty in ecology: implications for remote sensing and GIS applications. Hunsaker, C.T., Goodchild, M. F., Friedl, M. A., Case, T. J. (Editors). Springer: New York, 238-257.

11.1. Introduction

Many ecological modeling applications rely upon characterizations of the earth’s surface. In some instances, the actual elevations are important (see for example Nisbet and Botkin 1993, de Swart et al. 1994, Liebhold et al. 1994). In others, secondary terrain attributes such as slope and aspect are also critical for the application (see for example Huber and Casler 1990, Nemani et al. 1993, Austin et al. 1996, Zhu et al. 1996, Russel et al. 1997). The term digital elevation model (DEM) refers to a variety of digital forms which characterize some portion of the earth’s topography. These are used as proxies for the actual terrain surface in environmental modeling applications. The process of DEM production is one of abstraction, and is subject to error. A DEM therefore does not perfectly match the real-world terrain it represents. The precise degree of this mismatch at every point is unknown, giving rise to uncertainty about the relationship between data and actual terrain.

This chapter identifies approaches to characterize DEM uncertainty and to assess its impact upon ecological applications. A brief overview of commonly employed data structures is provided, followed by a discussion of DEM uncertainty. Two categories of DEM-related uncertainty are identified. The first source of uncertainty results from differences between the form of the data model and the actual elevation surface. Uncertainty arises regarding elevations at locations not directly sampled in the DEM; this is referred to as data model-based uncertainty. The second is concerned with the fact that data production methods do not accurately capture elevations at specified x,y locations. This error is directly measurable and is referred to as data-based uncertainty.

Uncertainty in elevation data can be addressed if it can be characterized, either through field measurements or by the adoption of assumptions about the data and the terrain. An uncertainty model is employed to characterize uncertainty in a spatial dataset. Using this model, a researcher can produce a map of the most likely elevation surface, given the available information. For many applications, a better approach may be to propagate DEM uncertainty through the analysis to identify its impact upon the results of the application. This is accomplished by producing, via Monte Carlo simulation, a set of equiprobable realizations of the DEM. The ecological application is then run upon all realizations, producing a distribution of results.

While digital elevation models are commonly employed in ecological modeling, they are also representative of a more general class of spatial data that seeks to characterize continuous surfaces. The theory and methods discussed here are therefore applicable to many other forms of spatial data employed in ecological modeling. This chapter is intended both to provide a thorough background on DEMs and uncertainty and to indicate general approaches to modeling uncertainty in data for continuous phenomena.

11.2. Production methods and data structures

DEM sources vary widely. Some scientists perform their own field measurements and generate elevation models to their own specifications (see, for example, Dubayah and Rich 1993, de Swart et al. 1994). The increasing affordability and sophistication of global positioning system (GPS) technology has enabled the construction of project-specific DEMs, particularly for small study areas or for spot elevations at specific sites. With this approach the researcher can specify the spatial resolution and extent of the survey, design the sampling campaign, and oversee the production of the final elevation model. Accuracy of both the final DEM and the GPS data can be tested if the sampling campaign is carefully designed, leading to sophisticated uncertainty models (Oliver et al. 1989).

In spite of these advantages, environmental scientists often do not collect their own elevation data, particularly when the goal is to characterize the terrain surface over a more extensive region. Calibrating the GPS, particularly in rural or wilderness areas distant from elevation benchmarks, can be difficult, time consuming, and expensive. Collecting densely sampled elevation data in a careful and systematic manner is a lengthy process, particularly if it is combined with a plan to model data uncertainty. For watershed-scale areas or larger, building a quality DEM with a handheld GPS is impracticable. Finally, interpolation schemes to generate an elevation surface from the collected point data introduce uncertainty into the final product, and the choice of method can seem arbitrary (see Lam 1983 for a description of spatial interpolation methods).

Instead, most projects use general-purpose DEMs typically produced and/or distributed by government agencies like the U.S. Geological Survey or the Ordnance Survey in the United Kingdom (see, for example, Huber and Casler 1990, Liebhold et al. 1994, Russel et al. 1997). Currently, these are produced in regular blocks at various resolutions and to various quality specifications. The USGS, for example, distributes several terrain data products at different resolutions and coverage ranges. Complete nationwide coverage is available from the 1:250,000-scale set. These are also called one-degree DEMs, a name that refers to the coverage area of each file. Elevations are stored on a regular 3 arc second grid. Distance between 3 arc second posts varies by direction and latitude but is roughly between 70 and 90 meters in the conterminous United States. A second major format is the 7.5’ DEM. The spatial organization of this product corresponds to the 7.5’ quadrangle system, and each file is named after its corresponding topographic map sheet. Elevations are stored in a regular grid with 30 meter spacing, ostensibly at a scale of 1:24,000. Quality specifications for these DEMs vary and are dependent on the production method (for more about these DEM products, see USGS 1995).

Caution must be used when employing these elevation datasets. They were doubtless not designed with any particular ecological purpose in mind, and their resolution and accuracy might be entirely inappropriate for a given application. For example, Russel et al. (1997) found systematic ‘ripples’ in the 7.5’ DEM they used to derive wetness potential. These ripples propagated through their data processing and corrupted the end product, a wetland reclamation suitability site map. Current quality specifications for agency-produced DEMs are not adequate for specifying models of spatial data uncertainty, since they use purely global accuracy measures. This topic will be treated more fully in a later section.

A third choice is to manually digitize elevation data from topographic map sources (see, for example, Gessler et al. 1993, Austin et al. 1996). This was frequently done in the United States before production DEMs became widely available (indeed, many USGS 7.5’ DEMs are derived from nondigital contour maps), and remains a viable option for regions not included in inexpensive, readily available, high resolution DEM data banks. Some researchers have preferred to manually produce elevation data in areas where existing DEMs were subject to excessive production artifact error (Garbrecht and Starks 1995).

Whether elevation data were collected by the scientist or downloaded from a digital library, the decision about how to characterize the surface for the purposes of analysis and GIS processing remains. This data modeling decision is typically limited to four choices (Weibel and Heller 1991): point model, digitized contour lines, irregular networks, and raster cells.

The data may be treated simply as a set of locations with associated elevations. The point model may consist of a collection of scattered points, or of gridded locations in regular profiles, as from a USGS DEM. This point model makes no explicit assumptions about the nature of the surface between the specified locations.

A second model is that of a set of digitized contour lines which fit the data. While digital algorithms have been developed to work on this type of data (Moore et al. 1991), the model has been criticized for overspecifying the elevation surface at contour intervals and underspecifying areas falling between the intervals. Digital contours are essentially verbatim copies from the paper world of pen-based cartography; for terrain modeling purposes, other digital models should be more effective at characterizing terrain.

Triangular irregular networks (TINs) store elevations at the vertices of triangular elements. The terrain is then modeled as a surface composed of triangular facets (Weibel and Heller 1991). These facets are typically planar, so that the elevation surface is continuous with discontinuities in slope occurring at the facet boundaries. An advantage of the TIN model is that, since the facets can be of any size, sampling density may be increased to capture short-range variation in rugged terrain and reduced in less rugged areas (Kumler 1994).

The most commonly used spatial data model employs raster cells to completely describe the elevation surface. This model partitions space into a rectangular matrix of identically sized, usually square cells. A single value is stored in each cell; all parts of the cell’s space possess the same value. The raster model’s popularity stems from its frequent implementation in geographic information systems, its utility for GIS analysis with other raster data layers, and its similarity to the gridded profiles of common production-level DEMs. It is worth noting that most GIS packages enable conversions between these models, so users should not feel constrained to one particular model for all processing and analysis.

For any of these models, two general sources of uncertainty may be specified: data-based uncertainty, due to the inaccuracy of measured elevations, and data model-based uncertainty, due to differences between the structural characteristics of the model and the landscape.

Data-based uncertainty is defined as the difference between the elevation of a location specified in the data set and the actual elevation at that location. This difference can in theory be measured by ground survey. DEM files produced by the USGS and other mapping agencies typically are assigned a root mean square error (RMSE) from a (usually small) set of locations for which the true elevation is known (USGS 1995). The actual elevations at these locations are compared with the DEM estimates; RMSE is the square root of the mean of the squared differences. Agencies engaged in DEM production use this sort of measure for their data quality reporting, and the DEM accuracy literature describes this approach in detail (Shearer 1990, Bolstad and Stowe 1994, Monckton 1994). Global measures like RMSE are inadequate by themselves for analysis of uncertainty, since they provide no information about local spatial structure. Indeed, map and data accuracy standards in general are not sufficient to characterize the spatial structure of uncertainty (Goodchild 1995, Unwin 1995).
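As a minimal sketch of the global accuracy measures described above, the following Python fragment computes mean error, the standard deviation of error, and RMSE for a set of check points. The check-point values are hypothetical, and the sign convention (DEM minus true elevation) is an assumption rather than something specified by the USGS documentation cited here.

```python
import numpy as np

def error_summary(dem_elev, true_elev):
    """Return (mean error, standard deviation of error, RMSE) in elevation units."""
    dem_elev = np.asarray(dem_elev, dtype=float)
    true_elev = np.asarray(true_elev, dtype=float)
    err = dem_elev - true_elev            # assumed convention: DEM minus truth
    mean_err = err.mean()                 # bias
    sd_err = err.std(ddof=0)              # spread around the bias
    rmse = np.sqrt(np.mean(err ** 2))     # root mean square error
    return mean_err, sd_err, rmse

# Hypothetical check points (meters)
dem = [102.0, 98.0, 105.0, 110.0]
truth = [100.0, 100.0, 100.0, 100.0]
mean_err, sd_err, rmse = error_summary(dem, truth)
print(mean_err, sd_err, rmse)
```

With the population standard deviation (ddof=0), RMSE squared equals the squared mean error plus the error variance, so a single RMSE value cannot distinguish bias from scatter, and it says nothing about where on the map the errors occur.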

Uncertainty arising from surface characterization depends very much upon the data model being used (see Goodchild 1992, for a discussion of the relationship between spatial data models and data structures). Two of the terrain data models described earlier do not explicitly represent a continuous terrain surface. In an array of points, no assumption is made about the elevation of intermediate locations. Digitized contours exhaustively capture all elevations at the contour intervals, but do not specify elevations falling between contour intervals. Strictly speaking, complete uncertainty exists regarding the elevation of any location not specified in either of these models. In practice, assumptions about intermediate values are frequently made, since it is usual and often critical to model terrain as a continuous surface, with an elevation specified at every point. Typically, contours and arrays of points are converted to data models that are used for surfaces, with defined values at every location, like rasters and TINs.

A raster DEM assigns a single elevation to every location within a cell. There is often uncertainty about what this elevation represents. Is it the elevation of the center of the cell? Is it the height of the lower left corner of the cell? Is it the mean elevation value for the area within the cell? Surface elevation is discontinuous at cell boundaries. In contrast, the elevation surface is continuous across a TIN, though the slope surface is not continuous at facet edges if TIN facets are planar. Although elevations are specified for every location in these models, discrepancies can certainly arise between the real-world terrain and the structural characteristics of the model. The elevations at all vertices of a TIN could conceivably be without error, yet the facets fail to capture actual terrain characteristics. Similarly, elevations for all cells in a raster DEM could correctly characterize the mean cell elevation, but the fidelity of the model surface (flat-topped squares, like stacks of blocks) to the real world surface is very poor.

11.3. Data model uncertainty

This section is concerned with the effect of spatial resolution on parameters derived from elevation data. This is a problem related to data model uncertainty, as defined previously, since discrepancies at known locations are not relevant to the resolution effect. The impact of DEM raster cell resolution on secondary terrain variables like slope, aspect, and curvature has long been recognized and studied, particularly in hydrology, but also in soil science and geomorphology (Carter 1990, Chang and Tsai 1991, Zhang and Montgomery 1994, Hutchinson 1996, Gao 1997, Saulnier et al. 1997). Because slope is an elevation derivative frequently of interest to both hydrologists and ecologists, the following example has been developed to demonstrate this impact on slope algorithms.

This experiment employed the USGS 7.5’ DEM Rancho Santa Fe, located north of San Diego, CA. The terrain for this region ranges from level to high relief. For this example the accuracy of the data will not be considered. The original DEM had a cell resolution of 30 meters. It was resampled to four coarser resolutions: 60 m, 90 m, 150 m, and 300 m. Each resampling was performed with each of the Arc/Info GIS interpolation methods (nearest neighbor, bilinear, and cubic convolution) to identify what effect, if any, choice of interpolator had on the resulting slope grids. For each of the five DEMs, slope (in degrees) was calculated following the method presented in Burrough (1986), which is that used in the Arc/Info geographic information system for gridded surfaces.

Table 11.1 presents information about the statistical distribution of the slope map (in degrees) for each of the cell resolutions and interpolation methods. The first row of the table, the 30 m information, represents the original 7.5’ data, prior to any resampling. Two features stand out. First, choice of interpolation method does not appear to affect slope distribution for cell sizes greater than 60 m. Second, regardless of interpolator, average slope values and maximum slope values decline as cell size increases.

Figure 11.1 presents a set of boxplots to graphically depict the effect of cell resolution on slope distribution. This plot shows only the bilinear interpolation distributions. The most profound shift with increasing cell size is the effect on larger slope values, as the distribution becomes increasingly foreshortened on the high side. Similar trends have been noted in many different studies on many DEMs and many terrain types. An important conceptual point is that the resampled elevation values themselves may not be in error, but the calculated slope will change as cell size changes. Uncertainty in the true slope measure in this case is not due to error in the elevation values, the interpolation measure, or the slope algorithm, but is rather due to the gap between the spatial data model and the real-world terrain. As cell size increases, this gap increases as well.
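The resolution effect can be illustrated with a small Python sketch on synthetic terrain. This is not the Rancho Santa Fe experiment: the surface is an invented sum of sine waves, coarsening is done by simply keeping every k-th post (a stand-in for nearest neighbor resampling), and slope is computed with NumPy's finite-difference gradient rather than the Burrough (1986) method used in Arc/Info. All of these choices are assumptions made for illustration.

```python
import numpy as np

def slope_degrees(elev, cell_size):
    """Slope in degrees from finite differences (np.gradient)."""
    dz_dy, dz_dx = np.gradient(elev, cell_size)
    return np.degrees(np.arctan(np.hypot(dz_dx, dz_dy)))

def coarsen(elev, factor):
    """Coarsen a grid by keeping every `factor`-th post in each direction."""
    return elev[::factor, ::factor]

# Hypothetical 30 m terrain: broad ridges plus short-range relief
n, base = 300, 30.0
y, x = np.mgrid[0:n, 0:n] * base
elev = (200.0 * np.sin(x / 1500.0) * np.cos(y / 1200.0)
        + 40.0 * np.sin(x / 150.0) * np.sin(y / 130.0))

results = {}
for factor in (1, 2, 3, 5, 10):           # 30, 60, 90, 150, 300 m cells
    cell = base * factor
    s = slope_degrees(coarsen(elev, factor), cell)
    results[cell] = (s.mean(), s.max())

for cell, (mean_s, max_s) in results.items():
    print(f"{cell:5.0f} m  mean slope {mean_s:5.2f}  max slope {max_s:5.2f}")
```

The sampled elevations are identical at every resolution, so any change in the slope statistics comes only from the wider spacing between posts, which is the point made in the paragraph above.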

Some lessons can be drawn from this study. First, and most obviously, data resolution can have a profound effect on terrain attributes, specifically slope distribution. Locations characterized by high slopes at fine resolution experience large reductions in slope as resolution coarsens. Researchers have attempted to use this sort of trend to extrapolate to characterize terrain at levels finer than the available data (Polidori et al. 1991), but this sort of approach is fraught with critical and largely untestable assumptions. A second point is that terrain data resolution should match the scale of the ecological process of interest. If the Mojave ringtailed chimera prefers slopes of greater than 10%, and the chimera is operating on scales smaller than 2 meters, then using a 30 meter resolution DEM to identify habitat is completely inappropriate, even if the 30 meter data are perfectly accurate. There is no reason to expect much correlation between the slope maps of an area using data with 2 meter versus 30 meter resolution.

Data model uncertainty issues clearly extend beyond this simple example. Alternative data structures like TINs, in which the only sampled locations are the triangle vertices, and contour graphs, which are poorly defined at locations not on the contour interval, may be subject to different uncertainty characteristics. The impact of elevation uncertainty on derived terrain attributes like slope and aspect will differ by data model as well.

11.4. Characterizing and modeling DEM uncertainty: the simulation and propagation framework

Research into uncertainty modeling is related to the large body of work on spatial data accuracy assessment. The first portion of this section briefly reviews approaches to characterizing the accuracy of DEMs. The relationship between this work and uncertainty modeling is then discussed, and an overview of the modeling procedure follows.

A variety of directions have been taken to assess the accuracy of either a particular data production method or a specific elevation data file. The most straightforward method has been to compare accurate field elevation measures with DEM estimates. This has been accomplished in several studies (for examples, see Shearer 1990, Bolstad and Stowe 1994, Monckton 1994). Frequently, however, survey-quality terrain measurements are not available for an area of interest. In such cases, investigators have three general options.

The first option is to make assumptions about intrinsic spatial qualities of the landscape and identify deviations from these assumptions as indications of error. For example, the potential for fractal properties of terrain surfaces has inspired research into the detection of artifacts in DEMs using a fractional Brownian motion model (Polidori et al. 1991, Brown and Bara 1994).

A second option is production-oriented; by studying problems in the methods used to generate digital elevation data, likely artifacts resulting from the methods can be identified. Robinson (1994) reviewed the generation of DEMs from contours. Common conversion routines suffer from a variety of problems relating to contour configuration. Earlier research compared several common contour interpolation algorithms to identify differences and problems (Clarke et al. 1982). Carter (1989) explored individual DEMs produced using different methods to identify artifacts. He found that particular error types tended to be symptomatic of particular production methods.

The third option is to compare independently derived terrain data for the same location. Isaacson and Ripple (1990) compared USGS 1:250,000 and 7.5’ DEMs. They performed a linear regression on a sample of collocated points from the two datasets and determined that the best fit was not significantly different from a 1:1 line with an intercept at zero. The standard error for the regression was 31 meters (but see Guth 1992 for a critique of their methodology). Greater differences between DEM sets were found in slope and aspect. They found no sign of artifacts in their 1:250,000 DEM, and concluded that the coarser DEM was adequate for their modeling purposes. Guth (1992) compared SPOT-derived 30 meter data with a USGS 7.5’ DEM. Distribution statistics were very comparable, but spatial patterns were identified for locations with large differences between the two data sets. Larger differences were found to be correlated with steep slopes and rugged terrain. Similar analysis was carried out for an area with two independently derived USGS 7.5’ DEMs. Again, areas with large differences were concentrated in high slope portions of the study area.

In the literature discussed above, accuracy characterization is concerned with description of error. In most of this literature, the link between error description and doing something with that description to improve the data’s depiction of the world is not explicit. Users are warned that benching artifacts (i.e., artificially flat regions above and below contour lines) are common in DEMs derived from contour data, or that the average error is usually not zero (i.e., error is biased), or that elevation error is spatially autocorrelated and therefore not independent. How are ecological applications employing elevation data supposed to benefit from such warnings? Uncertainty modeling attempts to answer this question by using these descriptions of discrepancy in a statistically driven manner to develop improved data characterizations of terrain. The problem becomes identifying the impact of imperfect terrain characterization upon spatial applications using this imperfect data.

Two theoretical approaches for uncertainty modeling present themselves: analytical derivation and stochastic simulation/propagation. The first, and conceptually the more straightforward, is to derive the uncertainty measure analytically. Hunter and Goodchild (1997) use the USGS level 1 DEM accuracy specification to mathematically characterize the RMSE of the orthogonal components of the gradient vector, which is a derived measure similar to slope. Their method relies on an estimation of the spatial autocorrelation of elevation errors, which is critical for any model of spatial uncertainty. Analytical solutions for uncertainty in more complex DEM applications like slope or aspect are much more difficult, however. This problem has been addressed by adopting a second, more general approach, in which error is modeled stochastically, simulations are developed, and uncertainty is propagated through an analysis, producing a distribution of results which may be assessed statistically. This framework is presented in Figure 11.2 for characterization of uncertainty in DEMs and its effects on analysis. It has been explored by a number of researchers in geography and the earth sciences; a few examples include Heuvelink et al. (1989), Openshaw (1989), Goovaerts (1997), Hunter and Goodchild (1997), and Heuvelink (1999). A more complete discussion of the general framework may be found in Chapter 9 of this volume.

On the left in Figure 11.2 are a (possibly large) number of DEM realizations of a study region. Each of these realizations is a characterization of the true elevation surface; its statistical characteristics match those specified by a model of uncertainty for the DEM. In the center of the figure is an operation which results in some (possibly spatially distributed) outcome. This operation might be as simple as a single GIS command (identify the elevation at each of the 27 nesting sites; determine the viewshed area for the top of Mount Baldy), or it might be a complex series of algorithms (execute the sage owl habitat spatial model; develop a regression model to explain hemlock growth, using elevation as one explanatory variable) incorporating cartographic modeling techniques (Berry 1993). For each realization, an answer is returned (an array of elevations, a value in square meters, a habitat map, a set of regression coefficients). Taken together, the outcomes from all of the realizations form a distribution which can be analyzed and described using graphs and statistics. Questions asked of these distributions can be subjected to statistical tests (provide a 95% confidence interval for the elevation of nest site D; find the likelihood that the viewshed is smaller than 25 square kilometers).
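The propagation loop described above can be sketched in a few lines of Python. Everything specific here is hypothetical: the tilted-plane DEM, the 700 m threshold, and especially the placeholder uncertainty model, which adds independent Gaussian noise to each cell. A realistic uncertainty model would generate spatially autocorrelated error, as discussed later in this section; the placeholder only keeps the example self-contained.

```python
import numpy as np

def propagate(realizations, operation):
    """Apply `operation` to every DEM realization and collect the outcomes."""
    return np.array([operation(r) for r in realizations])

def area_above(elev, threshold, cell_size):
    """Area (square meters) of cells whose elevation exceeds `threshold`."""
    return np.count_nonzero(elev > threshold) * cell_size ** 2

rng = np.random.default_rng(42)
cell = 85.0
yy, xx = np.mgrid[0:60, 0:60]
dem = 500.0 + 4.0 * xx + 2.0 * yy          # hypothetical tilted-plane DEM

# Placeholder uncertainty model (assumption): independent N(0, 15 m) error
realizations = [dem + rng.normal(0.0, 15.0, dem.shape) for _ in range(200)]

# One outcome per realization, then summarize the distribution of outcomes
areas = propagate(realizations, lambda r: area_above(r, 700.0, cell))
lo, hi = np.percentile(areas, [2.5, 97.5])              # 95% interval
p_small = np.mean(areas < 0.5 * dem.size * cell ** 2)   # share of runs below half the region
print(lo, hi, p_small)
```

Because `operation` is just a function of one realization, the same loop serves for a single GIS command or for a full habitat model; only the function passed in changes.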

Of critical importance for this general propagation approach is the method by which the DEM realizations are developed, the spatial uncertainty model. Development of these models has been ongoing for some time, and they have received increasing attention in geography (Heuvelink et al. 1989, Fisher 1991, Lee et al. 1992, Englund 1993, Ehlschlaeger et al. 1997). Chapter 9 in this volume discusses the development of specific uncertainty models. This chapter will deal more generally with how elevation data may be handled with uncertainty models.

The following familiar circumstance forms the backdrop for this discussion on model development. Elevation data are required for a regional study. While a gridded DEM is available for the region of interest, there is concern that infidelity between the actual terrain and this DEM will introduce uncertainty into the analysis. How can some limited information about the error in this DEM be used to build a model? A geostatistical approach to modeling DEM uncertainty begins by treating the elevation uncertainty surface as a random field. At each location on the surface, uncertainty is characterized by a probability distribution for all possible values at that location. Uncertainty model development hinges upon assumptions for characterizing the distribution of possible elevations (or elevation error) at locations where the actual elevation (or elevation error) is not known.

In some instances, elevation may be known at a set of points within the study area; for example, a global positioning system (GPS) could have been used to sample several dozen locations within a study area. Conditional methods ensure that the surface model passes through these locations, “honoring” the ground truth data (for introductory discussions of geostatistics in general and conditional methods in particular, see Isaaks and Srivastava 1989, Goovaerts 1997).

In other cases, true elevations may not be available within the area of interest, but error characteristics are known. Perhaps they are derived from data quality specifications, though these would have to include not only aspatial characteristics like RMSE, but also measures for the spatial structure of error and any correlations between error and slope, or error and absolute elevation. Alternatively, they may be assumed to match those of nearby regions for which this information is available. Unconditional methods are useful in either circumstance, and are used to build error surfaces which are then added to the elevation data. Such methods are termed ‘unconditional’ because no elevations are specifically honored; instead, elevations at all locations on the surface are perturbed (for examples, see Ehlschlaeger et al. 1997, Hunter and Goodchild 1997).

A critical issue for either the conditional or unconditional approaches is proper specification of the spatial structure of the error surface. These models rely upon the notion that knowledge of either elevation or elevation error at a particular location informs the simulation of elevation or elevation error at nearby locations. Characterizing this correctly, typically with the semivariogram or correlogram, is very important. Figure 11.3 presents a DEM cross section along with two uncertainty realizations: one has very little spatial autocorrelation, while the other is highly spatially autocorrelated. Were this DEM cross section to be added to each of the uncertainty realizations in turn, the resulting elevation surfaces would be quite different. Studies investigating DEM accuracy for a variety of data sets have found that elevation error is spatially autocorrelated, and incorporating this information in an error model is critical (Guth 1992, Monckton 1994, Ehlschlaeger et al. 1997).
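The contrast between weakly and strongly autocorrelated error can be sketched with unconditional Gaussian fields. The generator below is an assumption chosen for brevity, not the method of any study cited here: it smooths white noise with a Gaussian kernel applied through the FFT (so the field wraps around at the edges) and rescales the result to a target standard deviation. The parameter `corr_cells` is a hypothetical kernel width in cells, not a fitted variogram range.

```python
import numpy as np

def correlated_field(shape, corr_cells, sd, rng):
    """Unconditional Gaussian error field with standard deviation `sd`.

    corr_cells = 0 gives white noise; larger values give smoother,
    more strongly autocorrelated error.
    """
    noise = rng.normal(size=shape)
    if corr_cells > 0:
        ky = np.fft.fftfreq(shape[0])[:, None]
        kx = np.fft.fftfreq(shape[1])[None, :]
        # Fourier transform of a Gaussian kernel with std `corr_cells`
        kernel_ft = np.exp(-2.0 * (np.pi * corr_cells) ** 2 * (kx ** 2 + ky ** 2))
        noise = np.fft.ifft2(np.fft.fft2(noise) * kernel_ft).real
    return sd * noise / noise.std()

def mean_gradient(field, cell_size):
    """Mean gradient magnitude of a field (rise over run)."""
    gy, gx = np.gradient(field, cell_size)
    return np.hypot(gx, gy).mean()

rng = np.random.default_rng(1)
cell = 30.0
rough = correlated_field((128, 128), 0, 10.0, rng)    # little autocorrelation
smooth = correlated_field((128, 128), 8, 10.0, rng)   # strong autocorrelation

print(rough.std(), smooth.std())                      # same overall magnitude
print(mean_gradient(rough, cell), mean_gradient(smooth, cell))
```

Both fields have the same standard deviation, so a global measure like RMSE would not distinguish them, yet the smooth field perturbs local gradients far less than the rough one.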

The degree of uncertainty in slope and aspect calculations is highly dependent on the spatial structure of error in the DEM. Hunter and Goodchild (1997) determined that, as spatial autocorrelation of elevation error increased, the standard deviation of simulated slope differences decreased. Other applications using DEMs are also sensitive to spatial autocorrelation in the error structure, for example viewshed calculation (Fisher 1991), particularly when analysis is dependent upon areal estimates (the area in square meters above 600 meters) or neighborhood measures (the upslope contributing area). Therefore, any uncertainty model must account for the spatial dependence of the error surface.

11.5. Case study: a terrain-based model of bigcone spruce habitat

By way of example, a habitat model for the bigcone spruce (Pseudotsuga macrocarpa) is developed that uses elevation and elevation derivatives as its sole inputs. This model is for demonstration purposes only; the results may not be particularly good at identifying spruce habitat. The primary interest is in demonstrating how DEM uncertainty can be modeled, simulated, and propagated through an ecological application. For elevation data input we will use two sources: a portion of a USGS 1:250,000 quadrangle and a set of 250 high quality elevation sample points, such as might be collected with a GPS. The study will demonstrate how the 250 points can be used to model elevation uncertainty and propagate uncertainty to the habitat model. Decisions about the implications of uncertainty in the final product are left for other chapters (see in particular Chapter 18, this volume).

The bigcone spruce is a conifer with foliage similar to the Douglas fir. The species range extends from the Santa Barbara, California region south to San Diego County, and is found at higher elevations in the coastal ranges (Griffin and Critchfield 1972). Within Santa Barbara County, stands of bigcone spruce are found on steep north and west facing slopes in the mountains, especially at the heads of canyons in the interior (Smith 1976). This habitat description lends itself to a cartographic model using input raster GIS layers (Berry 1993), presented here in pseudo-map algebra:

HABITAT = (ASPECT == (north OR west)) AND (SLOPE > 33) AND (ELEV > 1500)

where steep slope is somewhat arbitrarily chosen to be 33 degrees and higher elevation is chosen to be greater than 1500 meters.
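The pseudo-map algebra above translates fairly directly into array operations. The Python sketch below is one possible reading, with several assumptions that the chapter does not specify: grid rows run north to south, slope and aspect come from NumPy finite differences rather than a GIS routine, and "north OR west" is taken to mean downslope azimuths from 225 degrees through north to just under 45 degrees. The function names are hypothetical.

```python
import numpy as np

def terrain_derivatives(elev, cell_size):
    """Slope (degrees) and aspect (downslope azimuth, degrees clockwise from north).

    Assumes row 0 is the northern edge and column 0 the western edge.
    """
    dz_drow, dz_dcol = np.gradient(elev, cell_size)
    dz_dx = dz_dcol            # change toward the east
    dz_dy = -dz_drow           # change toward the north
    slope = np.degrees(np.arctan(np.hypot(dz_dx, dz_dy)))
    aspect = np.degrees(np.arctan2(-dz_dx, -dz_dy)) % 360.0
    return slope, aspect

def bigcone_habitat(elev, cell_size, min_slope=33.0, min_elev=1500.0):
    """Boolean habitat grid following the pseudo-map algebra in the text."""
    slope, aspect = terrain_derivatives(elev, cell_size)
    north_or_west = (aspect >= 225.0) | (aspect < 45.0)   # assumed sector limits
    return north_or_west & (slope > min_slope) & (elev > min_elev)

# Hypothetical test surfaces: 40 degree planes facing north and south
rows = np.arange(20)[:, None] * np.ones((1, 20))
rise = 30.0 * np.tan(np.radians(40.0))
north_facing = 1600.0 + rows * rise       # elevation increases toward the south
south_facing = 2500.0 - rows * rise       # elevation decreases toward the south

print(bigcone_habitat(north_facing, 30.0).all())   # every cell qualifies
print(bigcone_habitat(south_facing, 30.0).any())   # no cell qualifies
```

Running this function once per DEM realization yields a stack of habitat maps, from which a per-cell frequency of habitat membership can be computed.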

The study area is the USGS 7.5’ quadrangle, Madulce Peak, CA, located in a rugged portion of northeastern Santa Barbara County. Elevation ranges from just over 800 meters to 1,988 meters. Highest elevations are located in the northwestern portion of the quadrangle, with canyons draining to the southeast and north, as shown in Figure 11.4a. Two elevation data sets were obtained for Madulce Peak. The first is the USGS 7.5’ DEM for the quadrangle, with a resolution of 30 meters and 176,584 elevation postings. Elevation locations were transformed to the WGS72 horizontal datum to match the second data set. The second is the USGS 1:250,000 DEM Los Angeles-west, which represents the area with elevation postings at 3 arc second intervals. This DEM was projected to Universal Transverse Mercator, resulting in a square grid resolution of 85 meters, with 22,205 cells containing elevation data; this data set will henceforth be called the 85 m DEM. Both DEMs are presented in Figure 11.4. Note the general agreement in depiction of the terrain but the relative lack of texture in the 85 m DEM. In fact, local disagreement between the two datasets is extremely large, as will be seen shortly.

This case study treats the 7.5’ DEM as ground truth. This is clearly not true but does enable us to check the quality of the model. The bigcone spruce model is run on the ground truth terrain and compared with the habitat model developed from the 85 m DEM; the results are shown in Figure 11.5. The 85 m DEM-based model differs considerably from the ground truth model because the spatial resolution is much coarser and the DEM contains errors.

Instead of the entire ground truth, the study uses a set of 250 randomly sampled points across the quadrangle. These could be collected with a GPS (for this example they are sampled from the 7.5’ DEM). Summary statistics reveal serious inaccuracies in the 85 m DEM: mean error at the 250 sample locations is just 2.36 meters, but the standard deviation is 36.7 meters. At more than half the sample points, the DEM is in error by more than 20 meters.
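Summary statistics of this kind are straightforward to compute once the DEM has been read at the sample locations. The sketch below is illustrative only: the function name error_summary is hypothetical, and the sign convention (error equals DEM elevation minus reference elevation) is an assumption, since the chapter does not state it explicitly.

```python
import numpy as np

def error_summary(dem_at_samples, reference, threshold=20.0):
    """Mean, standard deviation, and exceedance fraction of DEM error.

    Assumed convention: error = DEM elevation - reference elevation.
    """
    err = np.asarray(dem_at_samples, float) - np.asarray(reference, float)
    return {
        "mean": err.mean(),                             # systematic bias
        "sd": err.std(ddof=1),                          # sample standard deviation
        "frac_over": np.mean(np.abs(err) > threshold),  # share beyond threshold
    }

# Toy values (not the Madulce Peak data): four sample points.
print(error_summary([110.0, 90.0, 130.0, 100.0], [100.0, 100.0, 100.0, 100.0]))
```

A mean error near zero alongside a large standard deviation, as reported for the 85 m DEM, indicates little overall bias but large local disagreement.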

To account for the large discrepancy between the 85 m DEM and the actual terrain, the simulation/propagation approach discussed in the previous section is performed. A variogram model is constructed for the error data, and Gaussian simulation (a typical geostatistical simulation technique) is employed to generate realizations of the error surface, conditioned to the sample error points. The general procedure follows the discussion of stochastic simulation implementation in Chapter 9 (this volume) and Goovaerts (1997). Gstat, a public domain geostatistics package, is used to generate 50 error surface realizations (Pebesma and Wesseling 1998). Each surface is then subtracted from the 85 m DEM to produce a realization of the terrain of the Madulce quadrangle. Figure 11.6 presents one such error realization and its associated elevation realization. It is apparent that the realization is considerably rougher than the original 85 m DEM.
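The conditioning step can be illustrated without Gstat. The sketch below draws error realizations from the multivariate Gaussian distribution conditional on the sample errors, using simple kriging weights and a matrix square root. It is a stand-in for, not a reproduction of, the Gstat workflow used in the case study: the exponential covariance model, the sill and range values, the function names, and the synthetic sample points are all assumptions made for the example.

```python
import numpy as np

def exp_cov(a, b, sill, corr_range):
    """Exponential covariance between two sets of (x, y) points."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    return sill * np.exp(-d / corr_range)

def simulate_error(sample_xy, sample_err, grid_xy, sill, corr_range,
                   n_real, seed=0):
    """Return an (n_real, n_grid) array of conditional error realizations."""
    rng = np.random.default_rng(seed)
    mean = sample_err.mean()               # assumed known, constant mean
    c_dd = exp_cov(sample_xy, sample_xy, sill, corr_range)
    c_gd = exp_cov(grid_xy, sample_xy, sill, corr_range)
    c_gg = exp_cov(grid_xy, grid_xy, sill, corr_range)
    weights = np.linalg.solve(c_dd, c_gd.T).T      # simple kriging weights
    cond_mean = mean + weights @ (sample_err - mean)
    cond_cov = c_gg - weights @ c_gd.T
    # Symmetric square root; clip tiny negative eigenvalues from round-off.
    vals, vecs = np.linalg.eigh((cond_cov + cond_cov.T) / 2.0)
    root = vecs * np.sqrt(np.clip(vals, 0.0, None))
    noise = rng.standard_normal((len(grid_xy), n_real))
    return cond_mean + (root @ noise).T

# Synthetic example: 20 sample errors, a 10 x 10 grid of 85 m cells.
rng = np.random.default_rng(42)
sample_xy = rng.uniform(0.0, 850.0, size=(20, 2))
sample_err = rng.normal(0.0, 36.7, size=20)
gx, gy = np.meshgrid(np.arange(10) * 85.0, np.arange(10) * 85.0)
grid_xy = np.column_stack([gx.ravel(), gy.ravel()])
errors = simulate_error(sample_xy, sample_err, grid_xy,
                        sill=36.7 ** 2, corr_range=500.0, n_real=50)
dem_85m = np.full(len(grid_xy), 1500.0)    # placeholder DEM values
terrain = dem_85m - errors                 # one row per elevation realization
```

This dense-matrix approach only scales to a few thousand grid cells; sequential simulation, as implemented in packages such as Gstat, is the usual choice for full-size DEMs.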

The bigcone spruce habitat model is then run upon each of the 50 elevation realizations to produce 50 alternative habitat maps, one of which is depicted in Figure 11.7a. These 50 maps provide a probabilistic sense of what areas may be suitable for spruce, given the quality of the input data. Information from the 50 maps is summarized in the probability map in Figure 11.7b. Cell values in this map range from 0 (black), meaning that the cell is not identified in any realization, to 100 (white), indicating that all realizations identified the cell as suitable habitat. Note that this probabilistic map has identified many of the habitat locations in the ground truth habitat map (Figure 11.5a). The model has been partially successful at reproducing the complexity of the original habitat map, even though it had access to only 12.7% as many elevation values as the ground truth DEM, and only 1.1% of those were accurate.
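The summary step amounts to counting, cell by cell, how often each location is classed as habitat. A minimal sketch follows; the function name habitat_probability and the toy input are assumptions for illustration.

```python
import numpy as np

def habitat_probability(habitat_maps):
    """Percent of realizations (0 to 100) classing each cell as habitat.

    habitat_maps: sequence of equally shaped boolean arrays, one per realization.
    """
    stack = np.asarray(habitat_maps, dtype=bool)
    return 100.0 * stack.mean(axis=0)

# Toy example with four 2 x 2 realizations.
maps = [
    [[True, False], [False, False]],
    [[True, True], [False, False]],
    [[True, False], [False, False]],
    [[True, True], [False, False]],
]
print(habitat_probability(maps))
```

Cells with intermediate values are those whose classification depends on which plausible version of the terrain is used.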

The map also suggests, via comparison with the habitat map produced using only the 85 m DEM (Figure 11.5b), that uncertainty about the location of spruce habitat is quite high, given the input data. The white areas in Figure 11.7b were counted as habitat by at least 40% of the realizations. However, many of the gray colored cells, including the vertical linear form in the northeastern portion of the map, were identified in only 15-20% of the realizations. One might conclude that the available terrain data is adequate for identifying the largest blocks of most suitable habitat for the bigcone spruce, but that uncertainty is too high for many marginal but possibly quite viable areas in the Madulce quadrangle.

11.6. Conclusion

Topography plays an important role in many environmental processes. Scientists studying land-based ecological systems often identify elevation derivatives like slope and aspect, as well as the elevation surface itself, as important descriptive or explanatory factors. Discretized characterizations of topography, DEMs, are used to represent the terrain. Discrepancies between DEMs and the real-world surfaces they represent introduce uncertainty into analyses which employ DEM data. This chapter classifies DEM-related uncertainty into two categories. The first source of uncertainty results from differences between the form of the spatial data model and the actual elevation surface. Uncertainty arises regarding elevations at locations not directly sampled in the DEM; this is referred to as data model-based uncertainty. An example application demonstrated the sensitivity of a terrain attribute, slope, to data model resolution.

A second source of uncertainty arises when data production methods do not accurately capture elevations at specified x,y locations. This error is directly measurable and is referred to in this chapter as data-based uncertainty. Uncertainty in elevation data can be addressed if it can be characterized, either through field measurements or by the adoption of assumptions about the data and the terrain. An uncertainty model is employed to characterize uncertainty in a spatial dataset. Using a model of spatial uncertainty, a researcher can apply any of a range of interpolation techniques, including a variety of kriging methods, to produce a map of the most likely elevation surface given the available information.

Often a superior approach is to propagate DEM uncertainty through the analysis to identify its impact upon the results of the application. This is accomplished by generating a set of realizations of the DEM via stochastic simulation from an uncertainty model. The ecological application is then run upon all realizations to produce a distribution of results. This procedure was demonstrated in the case study using a cartographic modeling approach to characterize bigcone spruce habitat. The model relied upon a DEM which matched the actual terrain quite poorly. Using scattered spot height data, however, an uncertainty model was developed. This model produced, via Gaussian simulation, a set of elevation realizations. By propagating uncertainty through the habitat model, a more complete picture emerged of the uncertainty in the application due to the quality of the input elevation data.

Several chapters in this book demonstrate approaches to modeling uncertainty which are applicable for projects using digital elevation data. In each case the ecologist is able to use higher quality information to develop an uncertainty model for continuous data. Through subsequent simulation and propagation, the effect of elevation uncertainty on an application can be rigorously assessed, providing not only insight into the quality of the elevation data, but also a closer bond between the ecological model and the real world processes it attempts to characterize.

11.7. Acknowledgments

This work was financially supported by the National Center for Geographic Information and Analysis, the National Center for Ecological Analysis and Synthesis, and the National Imagery and Mapping Agency.

References

Austin, G. E., C. J. Thomas, D. C. Houston, and D. B. A. Thompson. 1996. Predicting the spatial distribution of buzzard Buteo buteo nesting sites using a geographical information system and remote sensing. Journal of Applied Ecology 33:1527-1540.

Berry, J. K. 1993. Cartographic modeling: the analytical capabilities of GIS. Pages 58-74 in M. F. Goodchild, B. O. Parks, and L. T. Steyaert, editors. Environmental Modeling with GIS. Oxford Press, New York, New York, USA.

Bolstad, P. V., and T. Stowe. 1994. An evaluation of DEM accuracy: elevation, slope, and aspect. Photogrammetric Engineering & Remote Sensing 60(11):1327-1332.

Brown, D. G., and T. G. Bara. 1994. Recognition and reduction of systematic error in elevation and derivative surfaces from 7½-minute DEMs. Photogrammetric Engineering & Remote Sensing 60(2):189-194.

Burrough, P. A. 1986. Principles of Geographic Information Systems for Land Resources Assessment, Clarendon Press, Oxford, UK.

Carter, J. R. 1989. Relative errors identified in USGS gridded DEMs. Pages 255-265 in Proceedings of Autocarto 9, ASPRS-ACSM.

Carter, J. R. 1990. Some effects of spatial resolution in the calculation of slope using the spatial derivative. Pages 43-52 in Technical Papers Vol. 1, ACSM-ASPRS Annual Convention, Denver.

Chang, K., and B. Tsai. 1991. The effect of DEM resolution on slope and aspect mapping. Cartography and Geographic Information Systems 18(1):69-77.

Clarke, A. L., A. Gruen, and J. C. Loon. 1982. The application of contour data for generating high fidelity grid digital elevation models. Pages 213-222 in Proceedings of Autocarto 5, ASPRS-ACSM.

de Swart, E. O. A. M., A. G. van der Valk, K. J. Koehler, and A. Barendregt. 1994. Experimental evaluation of realized niche models for predicting responses of plant species to a change in environmental conditions. Journal of Vegetation Science 5:541-552.

Dubayah, R., and P. M. Rich. 1993. GIS-based solar radiation modeling. Pages 129-134 in Goodchild, M.F., B. O. Parks, and L. T. Steyaert, editors. Environmental Modeling with GIS. Oxford Press, New York, New York, USA.

Ehlschlaeger, C.R., A. M. Shortridge, and M. F. Goodchild. 1997. Visualizing spatial data uncertainty using animation. Computers & Geosciences 23(4):387-395.

Englund, E.J. 1993. Spatial simulation: environmental applications. Pages 432-437 in Goodchild, M.F., B. O. Parks, and L. T. Steyaert, editors. Environmental Modeling with GIS. Oxford Press, New York, New York, USA.

Fisher, P. F. 1991. First experiments in viewshed uncertainty: the accuracy of the viewshed area. Photogrammetric Engineering & Remote Sensing 57(10):1321-1327.

Gao, J. 1997. Resolution and accuracy of terrain representation by grid DEMs at a micro-scale. International Journal of Geographical Information Science 11(2):199-212.

Garbrecht, J., and P. Starks. 1995. Note on the use of USGS level 1 7.5-minute DEM coverages for landscape drainage analyses. Photogrammetric Engineering & Remote Sensing 61(5):519-522.

Gessler, P. E., I. D. Moore, N. J. McKenzie, and P. J. Ryan. 1993. Soil-landscape modeling in southeastern Australia. Pages 53-58 in Goodchild, M.F., B. O. Parks, and L. T. Steyaert, editors. Environmental Modeling with GIS. Oxford Press, New York, New York, USA.

Goodchild, M. F. 1992. Geographical data modeling. Computers & Geosciences 18(4):401-408.

Goodchild, M. F. 1995. Attribute accuracy. Pages 59-79 in Guptill, S. C., and J. L. Morrison, editors. Elements of Spatial Data Quality. Elsevier, London, UK.

Goovaerts, P. 1997. Geostatistics for Natural Resources Evaluation. Oxford, New York, New York, USA.

Griffin, J. R., and W. B. Critchfield. 1972. The Distribution of Forest Trees in California. USDA Forest Service Research Paper PSW-82/1972.

Guth, P. L. 1992. Spatial analysis of DEM error. Pages 187-196 in ASPRS/ACSM/RT Technical Papers 2, Washington, DC, USA.

Heuvelink, G. B. M. 1999. Propagation of error in spatial modeling with GIS. Pages 207-217 in Longley, P. A., M. F. Goodchild, D. J. Maguire, and D. W. Rhind, editors. Geographical Information Systems, Volume 1: Principles and Technical Issues, 2nd Edition. Wiley, New York, New York, USA.

Heuvelink, G. B., P. A. Burrough, and A. Stein. 1989. Propagation of errors in spatial modeling with GIS. International Journal of Geographical Information Systems 3(4):303-322.

Huber, T. P., and K. E. Casler. 1990. Initial analysis of Landsat TM data for elk habitat mapping. International Journal of Remote Sensing 11(5):907-912.

Hunter, G. J., and M. F. Goodchild. 1997. Modeling the uncertainty of slope and aspect estimates derived from spatial databases. Geographical Analysis 29(1):35-49.

Hutchinson, M. F. 1996. A locally adaptive approach to the interpolation of digital elevation models. Third International Conference/Workshop on Integrating GIS and Environmental Modeling, Santa Fe. NCGIA, Santa Barbara.

Isaacson, D. L., and W. J. Ripple. 1990. Comparison of 7.5-minute and 1 degree digital elevation models. Photogrammetric Engineering & Remote Sensing 56(11):1523-1527.

Isaaks, E. H., and R. M. Srivastava. 1989. Applied Geostatistics. Oxford University Press, New York, New York, USA.

Kumler, M. P. 1994. An intensive comparison of triangulated irregular networks (TINs) and digital elevation models (DEMs). Cartographica Monograph 45 31(2):1-99.

Lam, N-S. 1983. Spatial interpolation methods: a review. The American Cartographer 10(2):129-149.

Lee, J., P. K. Snyder, and P. F. Fisher. 1992. Modeling the effect of data errors on feature extraction from digital elevation models. Photogrammetric Engineering & Remote Sensing 58(10):1461-1467.

Liebhold, A. M., G. A. Elmes, J. A. Halverson, and J. Quimby. 1994. Landscape characterization of forest susceptibility to gypsy moth defoliation. Forest Science 40(1):18-29.

Monckton, C. G. 1994. An investigation into the spatial structure of error in digital elevation data. Pages 201-210 in Worboys, M. F., editor. Innovations in GIS 1. Taylor & Francis, London, UK.

Moore, I. D., R. B. Grayson, and A. R. Ladson. 1991. Digital terrain modelling: a review of hydrological, geomorphological, and biological applications. Hydrological Processes 5:3-30.

Nemani, R., S. W. Running, L. E. Band, and D. L. Peterson. 1993. Regional hydroecological simulation system: an illustration of the integration of ecosystem models in a GIS. Pages 296-304 in Goodchild, M. F., B. O. Parks, and L. T. Steyaert, editors. Environmental Modeling with GIS. Oxford Press, New York, New York, USA.

Nisbet, R. A., and D. B. Botkin. 1993. Integrating a forest growth model with a geographic information system. Pages 265-269 in Goodchild, M. F., B. O. Parks, and L. T. Steyaert, editors. Environmental Modeling with GIS. Oxford Press, New York, New York, USA.

Oliver, M., R. Webster, and J. Gerrard. 1989. Geostatistics in physical geography. Part II: applications. Trans. Inst. British Geography 14:270-286.

Openshaw, S. 1989. Learning to live with errors in spatial databases. Pages 263-276 in Goodchild, M., and S. Gopal, editors. The Accuracy of Spatial Databases. Taylor & Francis, London, UK.

Pebesma, E. J., and C. G. Wesseling. 1998. Gstat, a program for geostatistical modelling, prediction and simulation. Computers & Geosciences 24(1):17-31.

Polidori, L., J. Chorowicz, and R. Guillande. 1991. Description of terrain as a fractal surface, and application to digital elevation model quality assessment. Photogrammetric Engineering & Remote Sensing 48(2):18-23.

Robinson, G. J. 1994. The accuracy of digital elevation models derived from digitised contour data. Photogrammetric Record 14(83):805-814.

Russel, G. D., C. P. Hawkins, and M. P. O’Neill. 1997. The role of GIS in selecting sites for riparian restoration based on hydrology and land use. Restoration Ecology 5(4s):56-68.

Saulnier, G. M., C. Obled, and K. Beven. 1997. Analytical compensation between DTM grid resolution and effective values of saturated hydraulic conductivity within the TOPMODEL framework. Hydrological Processes 11:1331-1346.

Shearer, J. W. 1990. Accuracy of digital terrain models. Pages 315-336 in Petrie, G., and T. J. M. Kennie, editors. Terrain Modelling in Surveying and Civil Engineering. Thomas Telford, London, UK.

Smith, C. F. 1976. A Flora of the Santa Barbara Region, California. Santa Barbara Museum of Natural History, Santa Barbara, CA, USA.

Unwin, D. J. 1995. Geographical information systems and the problem of ‘error and uncertainty’. Progress in Human Geography 19(4):549-558.

USGS (U.S. Geological Survey). 1995. National Mapping Program Technical Instructions. Standards for Digital Elevation Models. U.S. Dept. Interior, Washington, DC, USA.

Weibel, R., and M. Heller. 1991. Digital terrain modelling. Pages 269-297 in Maguire, D. J., M. F. Goodchild, and D. W. Rhind, editors. Geographical Information Systems: Principles and Applications (1). Longman, London, UK.

Zhang, W., and D. R. Montgomery. 1994. Digital elevation model grid size, landscape representation, and hydrologic simulations. Water Resources Research 30(4):1019-1028.

Zhu, A. X., L. E. Band, B. Dutton, and T. J. Nimlos. 1996. Automated soil inference under fuzzy logic. Ecological Modelling 90:123-145.

Table and Figure captions

Table 11.1. Characterization of slope (in degrees) for five cell resolutions and three interpolation methods.

Figure 11.1. Boxplots of slope distributions (in degrees) at five cell size resolutions for the Rancho Santa Fe 7.5’ raster DEM.

Figure 11.2. Uncertainty propagation using DEMs.

Figure 11.3. Spatial autocorrelation of the uncertainty surface.

Figure 11.4. Madulce Peak, California. 200 meter contours, 1000 meter baseline. Lighter shades indicate higher elevations in all elevation maps in Figures 11.4-11.7. North is at the top for all maps in Figures 11.4-11.7.

Figure 11.5. Bigcone spruce habitat model outcome (white indicates suitable). The 85 m DEM does not capture the range or complexity of the habitat identified by the ground truth DEM.

Figure 11.6. One (of 50) simulations. 200 meter contours, 800 meter baseline. The elevation realization is generated by adding the error realization to the 85 m DEM. Terrain complexity matches that of the ground truth DEM.

Figure 11.7. Characterizing DEM uncertainty in bigcone spruce habitat. White in 11.7a indicates suitable habitat identified in one realization. The map in 11.7b indicates the probability of each point being suitable for habitat. It was generated by summing across all 50 habitat realizations. Lighter regions indicate higher probabilities of suitable habitat.

Table 11.1

Each entry gives range, mean, median (in degrees).

Cell size (m)   Nearest Neighbor          Bilinear                  Cubic
30              0.00-45.01, 9.74, 8.10    -                         -
60              0.00-38.48, 8.45, 6.99    0.00-38.11, 8.41, 6.92    0.00-39.66, 8.69, 7.18
90              0.00-34.30, 7.63, 6.19    0.00-34.30, 7.63, 6.19    0.00-34.30, 7.63, 6.19
150             0.00-30.26, 6.32, 5.07    0.00-30.26, 6.32, 5.07    0.00-30.26, 6.32, 5.07
300             0.05-22.15, 4.60, 3.64    0.14-22.33, 4.60, 3.64    0.13-22.40, 4.63, 3.67

Figure 11.1 (boxplot graphic; axis ticks at 0, 10, 20, 30, 40)

Figure 11.2 (diagram labels: Simulated DEM realizations; analytic operation; Distribution of outcomes)

Figure 11.3

DEM cross-section

Highly autocorrelated uncertainty realization (cross-section)

Uncorrelated uncertainty realization (cross-section)

Figure 11.4

a. 30 m. DEM (“Ground Truth”)

b. 85 m. DEM

Figure 11.5

a. from 30 m. DEM (“Ground Truth”)

b. from 85 m. DEM

Figure 11.6

a. error realization (black > 2 SD)

b. elevation realization

Figure 11.7

a. single habitat realization

b. stochastic habitat map
