Accuracy and Uncertainty in GIS: Influence on (Hydrologic) Modeling

Suzanne Cox
NRS 509, Fall 2007

The introduction and advancement of spatial information technologies such as remote sensing and geographic information systems (GIS) have provided many software tools that assist in natural resource planning and management. GIS technology has been particularly cost-effective for evaluating land use/land cover (LU/LC) and the implications of different land use practices within a watershed (Crawford et al., 1998). To study and predict the processes that occur within a watershed, hydrologic models are often used. Hydrologic models are mathematical simulations or algorithms that use measurements of local rainfall, along with land use/land cover patterns, soil, topography, and drainage, to predict the streamflow or runoff, as well as the sediment loss or erosion, that might be caused by a storm of a given magnitude (Campbell, 2002). Many hydrologic modeling applications require land cover information because land cover strongly influences hydrological processes, including watershed runoff and sediment loss (Ward and Elliot, 1995). Such models provide a means of estimating or predicting the impacts of land use/land cover changes.

Because hydrologic models require detailed and accurate data, much of the cost and time involved in applying a given model to a specific area arises from the effort required to collect accurate land cover data (Campbell, 2002). Land cover information can be interpreted from aerial photographs manually or digitally, or from satellite images with the aid of computer-assisted classification software that relies on the relationship between land cover and spectral response (Lillesand and Kiefer, 2000). Manual interpretation of land cover from aerial photographs has historically been the standard procedure for deriving land cover classification data; in recent years, however, classification of digital remote sensing data with image-processing software has grown in popularity due to reduced costs, greater efficiency, and the ability to quickly generate new land cover information as changes occur (Campbell, 2002). Land cover classification maps underscore the role of GIS in the planning process by assisting in preliminary watershed characterization and in predicting the potential impacts of particular land use patterns. GIS has thus become an invaluable tool in the acquisition, manipulation, presentation, and evaluation of spatial data describing watershed land use/land cover, soils, and hydrologic conditions. Significant progress in coupling GIS data, particularly land cover, soils, and digital elevation models (DEMs), with hydrologic simulation models has provided an efficient and cost-effective spatial decision support system for assessing the implications of different watershed management and planning alternatives and for performing "what if" scenarios. GIS-based watershed delineation has likewise become a practical alternative to traditional manual delineation methods due to the increased availability and improved accuracy of DEMs and topographic databases. Particular advantages of DEM-based delineation are that the drainage divides are consistent with the elevation data and that the delineation process is more objective, repeatable, and transferable. However, the delineation process is very sensitive to the quality of the DEM, which introduces an element of uncertainty into the results (Oksanen and Sarjakoski, 2005).

Error and uncertainty associated with remotely sensed and GIS data can adversely affect the reliability of models and their output, and in turn the planning and management decisions based on them (Crawford et al., 1998). The magnitude of error propagation is directly related to model sensitivity: how strongly the model results are influenced by variations in the input data. The influence of data accuracy therefore becomes an important consideration in judging the validity of a model, based on how much of the error and uncertainty in the input data is transmitted to the results (Burrough, 1989). The question then becomes: does the transmitted error exceed an acceptable level for the proper use of the results? Clearly, the smaller the error and the more appropriate the resolution of the data associated with model inputs and parameter values, the better the results will be. Errors have many different sources and often depend on the spatial resolution and scale of the input data. Sources of error include encoding errors, positional errors, interpretation errors, variation in classification, registration differences, errors in remotely sensed classification, land cover change (temporal error), and, of course, "fuzzy creep." Given the many potential sources of error and the relative magnitude of each, it is essential that GIS users consider data accuracy, such as land cover classification, soils, or DEM accuracy, and the potential outcomes, in light of the scope and sensitivity of the project or application (Congalton and Plourde, 2002). According to Morrison (2002), digital spatial datasets are routinely used for purposes that differ from their intended use, and readily available sources of information about how to collect, interpret, and use digital spatial data do not exist. Because GIS plays such an active role in decision-making in many disciplines today, effective use of GIS becomes ever more important, and explicit knowledge of the uncertainty inherent in the data grows in importance as the potential impact of decisions based on GIS increases (Alesheikh et al., 1999). To make effective and intelligent land use decisions using GIS, and to ensure that the input data are appropriately accurate and reliable, accuracy and uncertainty assessment should be an integral component of working with GIS and remotely sensed data (Congalton and Plourde, 2002).
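To make the idea of error propagation and model sensitivity concrete, consider the following minimal Python sketch. It pushes a hypothetical uncertainty in a curve number, a runoff parameter typically derived from land cover and soils layers, through the standard SCS curve-number runoff equation; the storm depth, mean curve number, and error magnitude are invented for illustration and are not drawn from any of the cited studies.

    import random
    import statistics

    def scs_runoff(p, cn):
        """SCS curve-number runoff (inches) for a storm of depth p (inches)."""
        s = 1000.0 / cn - 10.0                 # potential maximum retention
        ia = 0.2 * s                           # initial abstraction
        return 0.0 if p <= ia else (p - ia) ** 2 / (p + 0.8 * s)

    # Hypothetical scenario: a 3-inch storm and a curve number of 75, with
    # land cover/soils uncertainty expressed as +/- 5 (one std. dev.) on CN.
    storm, cn_mean, cn_sd = 3.0, 75.0, 5.0
    runs = [scs_runoff(storm, min(98.0, max(30.0, random.gauss(cn_mean, cn_sd))))
            for _ in range(10000)]
    print("runoff mean = %.2f in, std. dev. = %.2f in"
          % (statistics.mean(runs), statistics.stdev(runs)))

Even this toy example illustrates the point: a modest uncertainty in one land-cover-derived parameter produces a measurable spread in predicted runoff, and the size of that spread is exactly what a sensitivity analysis quantifies.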

2

Additional References:

Alesheikh, A.A., J.A.R. Blais, M.A. Chapman, and H. Karimi. 1999. Rigorous Geospatial Data Uncertainty Models for GISs. In: Spatial Accuracy Assessment: Land Information Uncertainty in Natural Resources, Kim Lowell and Annick Jaton, Eds., Ann Arbor Press, Chelsea, Michigan, pp. 195-202.

Burrough, P.A. 1989. Matching spatial databases and quantitative models in land resource assessment. Soil Use and Management 5 (1): 3-8.

Campbell, James B. 2002. Introduction to Remote Sensing. The Guilford Press, New York.

Congalton, Russell G. and Lucie C. Plourde. 2002. Quality Assurance and Accuracy Assessment of Information Derived from Remotely Sensed Data. In: The Manual of Geospatial Science and Technology, John D. Bossler, Ed., Taylor and Francis, London and New York, pp. 349-363.

Crawford, I.M., U.S. Tim, D.K. Jain, and H. Liao. 1998. Managing Ecosystems in a Watershed Context: Progress Made and the Emerging Role of Integrated Spatial Information Technologies. In: GIS Technologies and their Environmental Applications, P. Pascolo and C.A. Brebbia, Eds., WIT Press, Southampton and Boston, pp. 13-22.

Lillesand, Thomas M. and Ralph W. Kiefer. 2000. Remote Sensing and Image Interpretation. John Wiley and Sons, Inc., New York.

Morrison, Joel. 2002. Spatial Data Quality. In: The Manual of Geospatial Science and Technology, John D. Bossler, Ed., Taylor and Francis, London and New York, pp. 500-516.

Oksanen, J. and T. Sarjakoski. 2005. Error propagation analysis of DEM-based drainage basin delineation. International Journal of Remote Sensing 26 (14): 3085-3102.

Ward, Andy D. and William J. Elliot. 1995. Environmental Hydrology. CRC Press, Boca Raton, Florida.

Annotated Bibliography: Accuracy and Uncertainty in GIS
Suzanne Cox
November 2007

Burrough, P.A. 1989. Matching spatial databases and quantitative models in land resource assessment. Soil Use and Management 5 (1): 3-8.

In this article, Burrough discusses the problems and dangers associated with the ad hoc linkage of simulation models and GIS. According to Burrough, models often require data with a much better spatial resolution than is usually available, so particular attention needs to be paid to the problems of error propagation when using models. For model results to carry any true meaning, careful consideration should be given to model calibration and sensitivity analysis, as well as to error propagation. The magnitude of error propagation is directly related to model sensitivity: how strongly the model results are influenced by variations in the input data. The influence of data accuracy therefore becomes an important consideration in judging the validity of a model, based on how much of the error and uncertainty in the input data is transmitted to the results. The question then becomes: does the transmitted error exceed an acceptable level for the proper use of the results? Clearly, the smaller the error and the more appropriate the resolution of the data associated with model inputs and parameter values, the better the results will be.

Carlisle, Bruce H. 2005. Modeling the Spatial Distribution of DEM Error. Transactions in GIS 9 (4): 521-540.

In the research presented in this article, Carlisle addresses the limitations of using a single root mean square error (RMSE) value to represent the uncertainty associated with a digital elevation model (DEM) by developing a new technique for creating a spatially distributed model of DEM quality: an accuracy surface. The technique is based on the hypothesis that the distribution and scale of elevation error within a DEM are at least partly related to terrain morphometry (quantitative land surface analysis). It involves generating a set of parameters to characterize terrain morphometry and then developing regression models to define the relationship between DEM error and morphometric character. Measures such as RMSE and the standard deviation of error summarize the elevation errors in a DEM as a single value. Describing DEM errors in this way is advantageous because a single value is relatively quick to calculate, easy to report, and makes DEM comparison simple. Single-value accuracy measures can also be used to model the influence of DEM error on uncertainty in DEM-based spatial modeling outcomes. However, a single RMSE value implies that error is uniform across the DEM, which is usually not the case, particularly in DEMs representing complex terrain. Using a set of morphometric terrain parameters derived from DEMs, such as elevation, gradient, aspect, and curvature, the research presented in this paper shows that the magnitude and distribution of errors in a DEM are related to the varying character of the terrain. The nature of the relationship between DEM error and the terrain parameters also varies according to the type of terrain, the resolution of the DEM, and the DEM production method. According to Carlisle's research, the accuracy surfaces derived from a spatially distributed model of DEM quality provide more detailed information about DEM accuracy than a single estimate of RMSE.
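The structure of the approach can be sketched in a few lines of Python. The checkpoint data, the two morphometric predictors, and the simple linear model below are all hypothetical stand-ins; Carlisle's actual regression models are more elaborate, but the pattern is the same: fit error against morphometry at checkpoints, then predict a per-cell error surface.

    import numpy as np

    # Hypothetical checkpoints: observed DEM error plus two morphometric
    # parameters (gradient in degrees, curvature) sampled at the same spots.
    rng = np.random.default_rng(42)
    gradient = rng.uniform(0.0, 30.0, 200)
    curvature = rng.normal(0.0, 1.0, 200)
    dem_error = (0.05 * gradient + 0.3 * np.abs(curvature)
                 + rng.normal(0.0, 0.2, 200))

    # Fit: error ~ b0 + b1*gradient + b2*|curvature| (ordinary least squares).
    X = np.column_stack([np.ones_like(gradient), gradient, np.abs(curvature)])
    coef, _, _, _ = np.linalg.lstsq(X, dem_error, rcond=None)

    def accuracy_surface(grad_grid, curv_grid):
        """Predicted error for every DEM cell: a spatially distributed
        alternative to reporting one global RMSE value."""
        return coef[0] + coef[1] * grad_grid + coef[2] * np.abs(curv_grid)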

Dehghan, Hamid and Hassan Ghassemian. 2006. Measurement of uncertainty by the entropy: application to the classification of MSS data. International Journal of Remote Sensing 27 (18): 4005-4014.

Dehghan and Ghassemian address the uncertainty that is imposed along with multispectral data acquisition in remote sensing and introduce a new entropy-based criterion. Because uncertainty grows and propagates through the processing, transmission, and classification stages, it affects the quality of the extracted information. Criteria such as RMSE are often used to evaluate classification accuracy and reliability, but no special criterion has been established for evaluating the certainty and uncertainty of classification results. This study proposes an entropy-based criterion for visualizing and evaluating the uncertainty of the results. The criterion comes from information theory, which concerns the quantification of information: information entropy measures the uncertainty associated with a random variable contained within a piece of data. Here the entropy criterion is used to express the distribution and extent of uncertainty in the classification results by summarizing the classification uncertainty in a single number per pixel, per class, or per image. Visualizing the uncertainty of the classification results provides insight into result quality and classifier performance; the certainty criteria can therefore serve as a new means of comparing classification performance.
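The underlying quantity is the standard Shannon entropy, H = -sum_i p_i log2 p_i, computed over a pixel's class membership probabilities. The snippet below is a generic illustration of that calculation, not the authors' exact criterion, and the probability values are invented.

    import numpy as np

    def pixel_entropy(probs):
        """Shannon entropy (bits) per pixel.
        probs: array of shape (n_pixels, n_classes); each row sums to 1."""
        p = np.clip(probs, 1e-12, 1.0)          # guard against log(0)
        return -(p * np.log2(p)).sum(axis=1)

    # A confident pixel (one dominant class) versus an ambiguous one.
    soft = np.array([[0.95, 0.03, 0.02],
                     [0.40, 0.35, 0.25]])
    print(pixel_entropy(soft))   # ~0.34 bits vs. ~1.56 bits

Low entropy flags pixels the classifier labels with confidence; high entropy flags pixels whose class membership is nearly ambiguous, which is exactly the per-pixel uncertainty map the criterion is meant to provide.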


Lu, D. and Q. Weng. 2007. A survey of image classification methods and techniques for improving classification performance. International Journal of Remote Sensing 28 (5): 823-870.

In this article, Lu and Weng examine current practices and problems associated with image classification and summarize some of the major advanced classification approaches and techniques used to improve classification accuracy. Three particular areas of progress in image classification are discussed: (1) the development of advanced classification algorithms, such as subpixel and knowledge-based algorithms; (2) the use of multiple remote-sensing features, such as spectral, spatial, multitemporal, and multisensor information; and (3) the incorporation of ancillary (GIS) data, such as topography, soil, road, and census data, into classification procedures. The integration of GIS and remote sensing has resulted in significant classification improvement and may also provide new insights into handling the scale issues associated with different data resolutions.

Other key considerations presented in the article include the importance of accuracy assessment in image classification and the evaluation of uncertainties caused by the use of multisource data. Uncertainty and error propagation in the image-processing chain are important influences on classification accuracy; identifying the weakest links in the chain and reducing their uncertainties are critical for improving classification accuracy.

Oksanen, J. and T. Sarjakoski. 2005. Error propagation analysis of DEM-based drainage basin delineation. International Journal of Remote Sensing 26 (14): 3085-3102.

Oksanen and Sarjakoski investigate the uncertainty in the GIS-based automatic delineation of drainage basins from DEMs. The study presents a process-convolution-based Monte Carlo simulation tool that offers a framework for investigating DEM error propagation over thousands of repetitions of a GIS analysis. (Monte Carlo methods can be described as techniques of statistical sampling, based on random numbers and probability statistics, employed to approximate solutions to quantitative problems.) GIS-based watershed delineation has become a practical alternative to traditional manual delineation methods due to the increased availability and improved accuracy of DEMs and topographic databases. Particular advantages of DEM-based delineation are that the drainage divides are consistent with the elevation data and that the delineation process is more objective, repeatable, and transferable. However, the delineation process is very sensitive to the quality of the DEM, which introduces an element of uncertainty into the results. The purpose of this study was to devise a Monte Carlo tool offering a framework for generating meaningful results from thousands of repetitions of catchment delineation, to investigate the difference made by taking DEM error propagation into account in drainage basin delineation, and to apply the Monte Carlo results to express drainage basin-specific quality measures of the DEM. The results revealed that while the Monte Carlo method was efficient as an error-propagation analysis tool, the characterization of the DEM error appeared lacking. Overall, the results showed that automatic basin delineation is very sensitive to DEM uncertainty and that modeling this uncertainty can be used to find the lower bound on the size of drainage basins that can be delineated with sufficient accuracy.
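The skeleton of such a Monte Carlo analysis can be sketched as follows. The delineate() routine is a placeholder for any flow-routing/basin-extraction function, and the smoothed-white-noise error model is only a crude stand-in for the paper's process-convolution model; the error magnitude and correlation length are invented parameters.

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def delineate(dem, outlet):
        """Placeholder: return a boolean mask of cells draining to `outlet`
        (any D8-style basin delineation routine could be plugged in here)."""
        raise NotImplementedError

    def basin_membership(dem, outlet, error_sd=1.5, corr_px=5, n=1000):
        """Per-cell probability of belonging to the basin under DEM error."""
        member = np.zeros(dem.shape)
        for _ in range(n):
            # Spatially autocorrelated error: smooth white noise, then
            # rescale its standard deviation to the assumed DEM error.
            noise = gaussian_filter(np.random.normal(size=dem.shape), corr_px)
            noise *= error_sd / noise.std()
            member += delineate(dem + noise, outlet)
        return member / n

Cells whose membership probability falls well below 1 mark the fuzzy drainage divides that make small basins unreliable to delineate, which is the basin-size lower bound the paper reports.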

Quinn, Paul. 2004. Scale appropriate modeling: representing cause-and-effect relationships in nitrate pollution at the catchment scale for the purpose of catchment scale planning. Journal of Hydrology 291: 197-217.

In this article, Quinn discusses some of the issues and complexity involved in modeling nitrate pollution at the catchment scale. He argues that the modeler must use the appropriate model type, at the appropriate scale, to best understand the nitrate losses observed at that scale, and that a detailed, complex physical model is not necessary at this scale. The article also shows, through a fully worked example, how hydrologic flow paths and nitrate pollution sources can be simulated at the catchment scale by first reflecting our understanding of the physical world and then paying full respect to catchment-scale issues and uncertainty problems. The paper demonstrates that the fluxes of flow and nitrate generated at the plot scale can be routed downstream and mixed with other flow components, such as groundwater or baseflow, to give a realistic catchment-scale estimate of nitrate pollution. According to Quinn, synchronous determinations of water and nitrate fluxes made at the point, plot, hillslope, catchment, and basin scales offer the best hope of understanding scale-dependent effects and of determining modeling strategies appropriate to specific scales of application. The paper suggests that the scaling up of cause-and-effect relationships can be achieved by combining the outputs of physically based models applied at the local (plot) scale with quasi-physical models at the hillslope/catchment scale (which can reflect buffer and wetland effects) and a simple lumped MIR (Minimum Information Requirement) model that routes and mixes flow downstream, so that simulations can be made at any catchment scale.

Van Oort, P.A.J. 2005. Improving land cover change estimates by accounting for classification errors. International Journal of Remote Sensing 26 (14): 3009-3024.

The objectives of this study were to review current land cover change detection methods and to propose other methods that improve land cover change estimates by accounting for classification errors. The results show that the error matrix can be used for more purposes than describing the quality of a single dataset; it can also be used to improve land cover change estimates. According to Van Oort, producers of vector datasets, such as topographical maps, tend to focus on quantifying positional accuracy and rarely report error matrices. Error matrices cannot be derived from standard measures of positional accuracy, since some, but not all, misregistrations result in misclassification. He asserts that error matrices associated with these datasets should also be reported, and that estimates of land cover change between two dates could be further improved by accounting for temporal correlation in classification accuracy.
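One standard way an error matrix improves on raw mapped areas is to re-estimate class proportions from the matrix rather than from the map alone. The sketch below shows a generic stratified estimator of this kind; the counts and mapped proportions are invented, and this is an illustration of the general technique rather than Van Oort's exact method.

    import numpy as np

    # Hypothetical error matrix from a sample of reference sites:
    # rows = map label, columns = reference label (water, forest, urban).
    n = np.array([[45.0,  4.0,  1.0],
                  [ 6.0, 38.0,  6.0],
                  [ 2.0,  3.0, 45.0]])

    # Mapped area proportion of each class, taken from the map itself.
    w = np.array([0.20, 0.55, 0.25])

    # Error-adjusted proportion of each class:
    # p_j = sum_i w_i * n_ij / n_i+  (row-normalized, weighted by map area).
    p_true = (w[:, None] * n / n.sum(axis=1, keepdims=True)).sum(axis=0)
    print(p_true)   # e.g. [0.256, 0.449, 0.295] for these invented counts

Applying the same correction to maps from two dates, and accounting for the temporal correlation in their errors, is the route the paper takes to better change estimates.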

Warwick, J.J. and S.J. Haness. 1994. Efficacy of Arc/INFO GIS Application to Hydrologic Modeling. Journal of Water Resources Planning and Management 120 (3): 366-381.

This article dates back to 1994, and I thought it presented an interesting perspective on the application of GIS in hydrologic modeling before it became such a widespread practice and while the science and software were still relatively young. In the article, the authors constructed a hypothetical watershed in an attempt to test the efficacy of, and to quantify specific inaccuracies associated with, the application of Arc/INFO, and to provide spatially related input for the US Army Corps of Engineers HEC-1 hydrologic model. They used Arc/INFO to delineate watersheds and to calculate basin area, runoff curve numbers, and other hydrologic model parameters. A TIN (triangulated irregular network) model was created from elevation points to produce a 3-D terrain model, and land cover, stream, and soil coverages were also incorporated into the analysis. Their overall conclusion was that Arc/INFO performed many tedious and labor-intensive tasks quite effectively; however, they encountered problems in finding the correct location of basin centroids (using Arc/INFO v.5.0.1) and in accurately estimating the average rainfall intensity over a basin. (The centroid issue has long since been addressed, probably in the next version. It is interesting, though, that accurately estimating average rainfall intensity over a basin is still a fuzzy issue today.) Ultimately, the authors decided that while some of the GIS products might look impressive, the real product (outflow hydrographs) would likely be no more accurate than that obtained by more traditional manual means, and that the cost of a GIS may not outweigh its benefits. They concluded that although the growing availability of detailed spatial information and associated analytical tools would predictably foster the continued development and implementation of more complex hydrologic models, traditional hydrologic modeling approaches would likely be utilized for some time.
