Definition of Spatial Analysis

Author: Bennett Jordan
Definition of Spatial Analysis
• A method of analysis is spatial if the results depend on the locations of the objects being analyzed
• That is, if you move the objects and the results change (the results are not invariant under relocation), spatial analysis is being applied
• Conducting a spatial analysis requires both the attributes and the locations of objects
• Conveniently, GIS has been designed to store both … we usually assemble geographic information in a GIS so that we might analyze it
David Tenenbaum – GEOG 070 – UNC-CH Spring 2005
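
The relocation test in this definition can be tried on a toy data set. The sketch below is only an illustration: the objects, their coordinates, and the two summary functions are hypothetical and do not come from the slides. It compares a summary that ignores location (the mean attribute value) with one that uses location (the mean nearest-neighbor distance).

```python
import math

def mean_attribute(objects):
    """Non-spatial summary: uses attribute values only, never the coordinates."""
    return sum(value for _x, _y, value in objects) / len(objects)

def mean_nearest_neighbor_distance(objects):
    """Spatial summary: depends on where the objects are located."""
    total = 0.0
    for i, (xi, yi, _) in enumerate(objects):
        total += min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj, _) in enumerate(objects)
            if j != i
        )
    return total / len(objects)

# Hypothetical objects stored as (x, y, attribute value)
original = [(0.0, 0.0, 10.0), (1.0, 0.0, 20.0), (0.0, 1.0, 30.0)]
# Same attributes, but two of the objects have been moved
relocated = [(0.0, 0.0, 10.0), (5.0, 0.0, 20.0), (0.0, 5.0, 30.0)]

print(mean_attribute(original), mean_attribute(relocated))
print(mean_nearest_neighbor_distance(original),
      mean_nearest_neighbor_distance(relocated))
```

The mean attribute is unchanged by the move, while the nearest-neighbor summary changes, so only the second would count as spatial under the definition above.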

Types of Spatial Analysis
• There are literally thousands of spatial analysis techniques, with new ones developed all the time
• We will consider six categories of spatial analyses, each having a distinct conceptual basis:
  1. Queries and reasoning
  2. Measurements
  3. Transformations
  4. Descriptive summaries
  5. Optimization
  6. Hypothesis testing
• Categories 1 to 3 are covered in Chapter 13; categories 4 to 6 are covered in Chapter 14

3. Transformations
• The category encompassing transformations of spatial data includes many analytical approaches that can be applied using either the vector or raster spatial data models, or a combination of both
• Transformations create new attributes and objects, based on some simple rules:
  – They involve geometric construction or calculation
  – They may also create new fields, either from existing fields or from discrete objects

Feature in Feature Transformations
• These transformations determine whether a feature lies inside or outside another feature
• The most basic of these transformations is point in polygon analysis, which can be applied in various situations:
  – The application is usually one of generalization: assign many points to their containing polygons
  – For example, this is used to assign crimes to police precincts, voters to voting districts, accidents to reporting counties, etc.

Point in Polygon Analysis
• Overlay point layer (A) with polygon layer (B)
  – In which B polygon are A points located?
    » Assign polygon attributes from B to points in A
• Example: Comparing soil mineral content at sample borehole locations (points) with land use (polygons)...
[Figure: point layer A overlaid on polygon layer B]
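
One common way to carry out the point in polygon test is ray casting: count how many polygon edges a horizontal ray from the point crosses, and treat an odd count as "inside". The sketch below assumes simple polygons stored as vertex lists; the land-use polygons, borehole names, and coordinates are hypothetical, and points lying exactly on a boundary are not handled specially.

```python
def point_in_polygon(x, y, polygon):
    """Ray casting: an odd number of edge crossings to the right means inside."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def assign_points(points, polygons):
    """Give each point the label of the first polygon that contains it (or None)."""
    result = {}
    for name, (x, y) in points.items():
        result[name] = next(
            (label for label, ring in polygons.items()
             if point_in_polygon(x, y, ring)),
            None,
        )
    return result

# Hypothetical layer B (land use polygons) and layer A (borehole points)
land_use = {
    "forest": [(0, 0), (4, 0), (4, 4), (0, 4)],
    "urban": [(4, 0), (8, 0), (8, 4), (4, 4)],
}
boreholes = {"bh1": (1, 1), "bh2": (6, 2), "bh3": (9, 9)}

assignment = assign_points(boreholes, land_use)
print(assignment)
```

Each borehole picks up the land use attribute of its containing polygon; a point outside every polygon is left unassigned.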

Line in Polygon Analysis
• Overlay line layer (A) with polygon layer (B)
  – In which B polygons are A lines located?
    » Assign polygon attributes from B to lines in A
• Example: Assign land use attributes (polygons) to streams (lines)
[Figure: line layer A overlaid on polygon layer B]

Polygon Overlay Analysis
• We can consider two different cases when comparing two layers of polygons:
  – The discrete object case, where individual polygon features make up a layer
  – The field case, where a layer of polygons consists of edge-to-edge polygons that fill the entire area of interest
• In the discrete object case, we find the polygons formed by the intersection of two polygons. There are instances where we might want to use this sort of analysis:
  – Do two polygons intersect?
  – What area is in Polygon A but not in Polygon B?

Polygon Overlay, Discrete Object Case
[Figure: Polygons A and B overlaid, with the resulting polygons numbered 1 to 9]
• In this example, the union of two polygons is taken to form nine new polygons. One is formed from both input polygons (1); four are formed by Polygon A and not Polygon B (2-5); and four are formed by Polygon B and not Polygon A (6-9)

Polygon Overlay, Field Case
• Two layers of edge-to-edge polygons are the inputs, representing two thematic descriptions of the same area, e.g. soil type and land ownership information
• The two layers are overlaid, and all intersections are computed, creating a new layer:
  – Each polygon in the new layer has both a soil type and land ownership information
  – The two attributes are said to be concatenated
• The task is often performed using the raster spatial data model, but can use vector map algebra

Polygon Overlay, Field Case
[Figure: land ownership layer (legend: Owner X, Owner Y, Public) overlaid on a soil type layer]
• A layer representing a field of land ownership (symbolized using colors) is overlaid on a layer of soil type (layers offset for emphasis). The result after overlay will be a single layer with 5 polygons, each with land ownership information and a soil type
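
When both fields are stored as rasters, concatenating the two attributes reduces to pairing the values cell by cell. The following is a minimal sketch under that assumption; the two small grids, their category names, and the helper `overlay_fields` are hypothetical and are not the layers shown in the figure.

```python
# Hypothetical categorical rasters covering the same 2 x 3 area
soil = [
    ["clay", "clay", "sand"],
    ["clay", "sand", "sand"],
]
owner = [
    ["X", "Y", "Y"],
    ["X", "X", "Public"],
]

def overlay_fields(layer_a, layer_b):
    """Concatenate two field layers: each output cell holds (a, b)."""
    return [
        [(a, b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(layer_a, layer_b)
    ]

combined = overlay_fields(soil, owner)

# Distinct attribute combinations present in the output layer
combinations_found = sorted({cell for row in combined for cell in row})

for row in combined:
    print(row)
print(len(combinations_found), "distinct soil/ownership combinations")
```

Every output cell carries both a soil type and an owner, which is the raster counterpart of the concatenated polygon attributes described above.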

Polygon Overlay Analysis
• Overlay polygon layer (A) with polygon layer (B)
  – What are the spatial polygon combinations of A and B?
    » Generate a new data layer with combined polygons
    » Attributes from both layers are included in the output
• How are polygons combined (i.e. what geometric rules are used for combination)?
  – UNION (Boolean OR)
  – INTERSECTION (Boolean AND)
  – IDENTITY
• Polygon overlay will generally result in a significant increase in the number of spatial entities in the output
  – This can result in output that is too complex to interpret

Polygon Overlay Analysis
• UNION: overlay polygons and keep areas from both layers
• INTERSECTION: overlay polygons and keep only areas in the input layer that fall within the intersection layer
• IDENTITY: overlay polygons and keep areas from the input layer

Problems with Vector Overlay Analysis (esp. Polygon)
• Overlay analysis using the vector spatial data model is highly computationally intensive:
  – Complicated input layers can tax even current processors
• There is a tradeoff between the complexity and the interpretability of results:
  – Complex input layers with many polygons can result in 100s or 1000s of resulting polygon combinations … can we make sense of all those combinations?

Problems with Vector Overlay Analysis (esp. Polygon)
• There are often spatial mismatches between input layers
  – This is a common problem of vector geospatial data sets
    » Overlay results in spurious sliver polygons
• We can “filter” out spurious slivers by querying to select all polygons with AREA less than some minimum threshold
• It is difficult to choose a threshold that avoids deleting ‘real’ polygons
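
The area filter, and the difficulty of choosing its threshold, can be sketched in a few lines. This example assumes polygons are simple rings of vertices and computes area with the shoelace formula; the polygon names, coordinates, and both threshold values are hypothetical.

```python
def polygon_area(ring):
    """Area of a simple polygon from its vertices (shoelace formula)."""
    total = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def split_slivers(polygons, min_area):
    """Separate polygons into those kept and those flagged as slivers."""
    kept = {k: r for k, r in polygons.items() if polygon_area(r) >= min_area}
    slivers = {k: r for k, r in polygons.items() if polygon_area(r) < min_area}
    return kept, slivers

# Hypothetical overlay output: one parcel, one true sliver, one small real feature
overlay_output = {
    "parcel_1": [(0, 0), (10, 0), (10, 10), (0, 10)],            # area 100
    "sliver_a": [(10, 0), (10.05, 0), (10.05, 10), (10, 10)],    # area about 0.5
    "small_pond": [(20, 20), (21, 20), (21, 21), (20, 21)],      # area 1
}

strict_kept, strict_slivers = split_slivers(overlay_output, min_area=2.0)
loose_kept, loose_slivers = split_slivers(overlay_output, min_area=0.75)

print(sorted(strict_slivers))  # threshold 2.0 also removes the real small polygon
print(sorted(loose_slivers))   # threshold 0.75 removes only the sliver
```

With the larger threshold the genuine small polygon is discarded along with the sliver, which is the risk noted in the last bullet.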

Overlay of Fields Represented as Rasters
[Figure: input rasters A and B, and the output raster]
The two input data sets are maps of (A) travel time from the urban area shown in black, and (B) county (red indicates County X, white indicates County Y). The output map identifies travel time to areas in County Y only, and might be used to compute average travel time to points in that county in a subsequent step
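
A raster overlay of this kind can be sketched as a cell-by-cell mask followed by a summary. The grids below are hypothetical stand-ins for the travel time and county layers (they are not the data in the figure), and `None` is used here as an assumed "no data" marker.

```python
# Hypothetical travel times (minutes) and county membership for the same cells
travel_time = [
    [5, 10, 15],
    [10, 15, 20],
    [15, 20, 25],
]
county = [
    ["X", "X", "Y"],
    ["X", "Y", "Y"],
    ["Y", "Y", "Y"],
]

def mask_layer(values, zones, keep_zone, nodata=None):
    """Keep a cell's value only where the zone layer equals keep_zone."""
    return [
        [v if z == keep_zone else nodata for v, z in zip(row_v, row_z)]
        for row_v, row_z in zip(values, zones)
    ]

# Output layer: travel time in County Y only
masked = mask_layer(travel_time, county, "Y")

# Subsequent step: average travel time over the cells that remain
valid = [v for row in masked for v in row if v is not None]
mean_time = sum(valid) / len(valid)

for row in masked:
    print(row)
print(round(mean_time, 2))
```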

Boolean Operations
• In both Venn probability diagrams and vector overlay analysis, we used UNION and INTERSECTION operations, corresponding to the Boolean operations OR and AND
[Figure: Venn diagrams of A and B showing A OR B (UNION) and A AND B (INTERSECTION)]
• We can apply these concepts in the raster spatial data model as well, but on a per-cell basis, with two input layers that contain true/false or 1/0 data:

Boolean Operations with Raster Layers
• The AND operation requires that the value of cells in both input layers be equal to 1 for the output to have a value of 1:

  0 1 1         0 0 0
  0 0 1   AND   1 1 1   =   [output grid shown in figure]
  1 0 1         0 0 1

• The OR operation requires that the value of a cell in either input layer be equal to 1 for the output to have a value of 1:

  0 1 1         0 0 0
  0 0 1   OR    1 1 1   =   [output grid shown in figure]
  1 0 1         0 0 1
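
The per-cell AND and OR operations can be sketched with plain nested lists. The two input layers use the cell values from this slide, read row by row as 3 × 3 grids (that arrangement is an assumption); the helper `local_op` is a hypothetical name.

```python
layer_a = [
    [0, 1, 1],
    [0, 0, 1],
    [1, 0, 1],
]
layer_b = [
    [0, 0, 0],
    [1, 1, 1],
    [0, 0, 1],
]

def local_op(a, b, op):
    """Apply a two-argument operation to corresponding cells of two layers."""
    return [
        [op(p, q) for p, q in zip(row_a, row_b)]
        for row_a, row_b in zip(a, b)
    ]

# AND: output is 1 only where both inputs are 1
and_result = local_op(layer_a, layer_b, lambda p, q: p & q)
# OR: output is 1 where either input is 1
or_result = local_op(layer_a, layer_b, lambda p, q: p | q)

for row in and_result:
    print(row)
for row in or_result:
    print(row)
```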

Algebraic Operations w/ Raster Layers
• We can extend this concept from Boolean logic to algebra
• Map algebra:
  – Treats input layers as numeric inputs to mathematical operations (each layer is a separate numeric input)
  – The result of the operation on the inputs is calculated on a cell-by-cell basis
  – This allows for complex overlay analyses that can use as many input layers and operations as necessary
• A common application of this approach is suitability analysis, where multiple input layers determine suitable sites for a desired purpose: cells in the input layers are scored according to their effect on suitability and then combined, often with the layers weighted by their importance

Simple Arithmetic Operations
• Summation
[Figure: two 3 × 3 input grids and their cell-by-cell sum]
• Multiplication
[Figure: two 3 × 3 input grids and their cell-by-cell product]
• Summation of more than two layers

  0 1 1       0 0 0       0 0 0       0 1 1
  0 0 1   +   1 1 1   +   1 1 1   =   2 2 3
  1 0 1       0 0 1       0 0 1       1 0 3
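
These arithmetic operations follow the same cell-by-cell pattern as the Boolean ones. The sketch below uses the two input grids from the three-layer summation example, read row by row (an assumed arrangement); the helper names `local_sum` and `local_product` are hypothetical.

```python
layer_a = [
    [0, 1, 1],
    [0, 0, 1],
    [1, 0, 1],
]
layer_b = [
    [0, 0, 0],
    [1, 1, 1],
    [0, 0, 1],
]

def local_sum(*layers):
    """Add any number of layers cell by cell."""
    return [
        [sum(cells) for cells in zip(*rows)]
        for rows in zip(*layers)
    ]

def local_product(a, b):
    """Multiply two layers cell by cell."""
    return [
        [p * q for p, q in zip(row_a, row_b)]
        for row_a, row_b in zip(a, b)
    ]

two_layer_sum = local_sum(layer_a, layer_b)
product = local_product(layer_a, layer_b)
# Three-layer summation as on the slide: A + B + B
three_layer_sum = local_sum(layer_a, layer_b, layer_b)

print(two_layer_sum)
print(product)
print(three_layer_sum)
```

For 1/0 layers the product equals the Boolean AND, and the sum counts how many input layers are 1 in each cell.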

Raster (Image) Difference
The difference between two layers:

  5 4 3       3 5 6        2 -1 -3
  6 5 6   -   1 4 5   =    5  1  1
  7 1 5       3 2 7        4 -1 -2

• An application of taking the differences between layers is change detection:
  – Suppose we have two raster layers that each show a map of the same phenomenon at a particular location, and each was generated at a different point in time
  – By taking the difference between the layers, we can detect changes in that phenomenon over that interval of time
• Question: How can the locations where changes have occurred be identified using the difference layer?
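
One possible approach to the question is to flag the cells whose difference is non-zero, or whose absolute difference exceeds a tolerance when small differences are treated as noise. The sketch uses the two grids from this slide, read row by row (an assumed arrangement); the function names and the tolerance of 2 are hypothetical.

```python
first_layer = [
    [5, 4, 3],
    [6, 5, 6],
    [7, 1, 5],
]
second_layer = [
    [3, 5, 6],
    [1, 4, 5],
    [3, 2, 7],
]

def local_difference(a, b):
    """Subtract layer b from layer a cell by cell."""
    return [
        [p - q for p, q in zip(row_a, row_b)]
        for row_a, row_b in zip(a, b)
    ]

def changed_cells(diff, threshold=0):
    """Return (row, col) of cells whose absolute difference exceeds threshold."""
    return [
        (r, c)
        for r, row in enumerate(diff)
        for c, value in enumerate(row)
        if abs(value) > threshold
    ]

difference = local_difference(first_layer, second_layer)
any_change = changed_cells(difference)                  # every non-zero cell
large_change = changed_cells(difference, threshold=2)   # assumed tolerance of 2

print(difference)
print(large_change)
```

The sign of each difference also indicates the direction of change, which the absolute-value test deliberately ignores.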

Raster (Image) Division
• Question: Can we perform the following operation? Are there any circumstances where we cannot perform this operation? Why or why not?
[Figure: one raster layer divided (÷) by a second raster layer]
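
One circumstance worth checking is a zero in the divisor layer, since division by zero is undefined for that cell. A minimal sketch of one way to handle it follows, assuming such cells are marked with a "no data" value (`None` here); the grids and the function name are hypothetical and are not the layers from the figure.

```python
numerator = [
    [5, 4, 3],
    [6, 5, 6],
    [7, 1, 5],
]
denominator = [
    [0, 1, 1],
    [0, 0, 1],
    [1, 0, 1],
]

def local_divide(num, den, nodata=None):
    """Divide cell by cell; cells with a zero divisor receive the nodata value."""
    return [
        [n / d if d != 0 else nodata for n, d in zip(row_n, row_d)]
        for row_n, row_d in zip(num, den)
    ]

quotient = local_divide(numerator, denominator)
for row in quotient:
    print(row)
```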

More Complex Operations
Linear Transformation:

      1 2 4           1 0 0           0 0 0
  a × 3 2 1   + b ×   5 1 1   + c ×   1 1 1   =   [output grid shown in figure]
      5 3 2           2 0 1           0 0 1

• We can multiply layers by constants (such as a, b, and c in the example above) before summation
• This could be applied in the context of computing the results of a regression model (e.g. output y = a*x1 + b*x2 + c*x3) using raster layers
• Another application is suitability analysis, where individual input layers might be various criteria, and the constants a, b, and c determine the weights associated with those criteria
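
The linear transformation y = a*x1 + b*x2 + c*x3 can be sketched as a weighted cell-by-cell sum. The three layers use the cell values from this slide, read row by row (an assumed arrangement); the slide does not give values for a, b, and c, so the weights below are hypothetical.

```python
x1 = [
    [1, 2, 4],
    [3, 2, 1],
    [5, 3, 2],
]
x2 = [
    [1, 0, 0],
    [5, 1, 1],
    [2, 0, 1],
]
x3 = [
    [0, 0, 0],
    [1, 1, 1],
    [0, 0, 1],
]

def weighted_sum(layers, weights):
    """Multiply each layer by its weight, then add the layers cell by cell."""
    return [
        [sum(w * cell for w, cell in zip(weights, cells)) for cells in zip(*rows)]
        for rows in zip(*layers)
    ]

# Hypothetical constants a, b, c (regression coefficients or suitability weights)
a, b, c = 2, 1, 3
y = weighted_sum([x1, x2, x3], [a, b, c])

for row in y:
    print(row)
```

In a suitability analysis the same function would be called with criterion scores as the layers and their relative importance as the weights.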

Spatial Interpolation
• Often, we have a geographic phenomenon that we wish to represent using a field (e.g. elevation), but the values of that field have been measured only at sample points
• There is a need to estimate the complete field from the discrete samples, in order to:
  – estimate values at points where the field was not measured
  – create a contour map by drawing isolines between the data points, or a raster digital elevation model which has a value for every cell
• Methods of spatial interpolation are designed to solve this problem

Inverse Distance Weighting (IDW)
• One method of interpolation is inverse distance weighting:
  – The unknown value of a field at a point is estimated by taking a weighted average over the known values
  – Each known value is weighted according to its distance from the point, with the greatest weight given to the nearest points (hence the name: as the distance between the estimation point and a known point becomes greater, the weight becomes smaller)
• This is an implementation of Tobler’s Law, since it is implicit in the inverse weighting scheme that points that are close together will tend to have similar values

Inverse Distance Weighting (IDW)
• Point i has known value z_i, location x_i, weight w_i, and distance d_i from the point being estimated
• The unknown value z(x) is to be interpolated at location x
• The estimate is a weighted average:
  $$z(x) = \frac{\sum_i w_i z_i}{\sum_i w_i}$$
• Weights decline with distance:
  $$w_i = \frac{1}{d_i^2}$$
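
The two formulas on this slide translate almost directly into code. The sketch below is a minimal illustration that assumes planar coordinates and straight-line distances; the sample points are hypothetical, and returning the known value when the location coincides with a sample is an added convention to avoid dividing by zero.

```python
import math

def idw(x, y, samples, power=2):
    """Inverse distance weighted estimate at (x, y).

    samples: list of (xi, yi, zi) known points.
    Weights are w_i = 1 / d_i**power (power = 2 as on the slide).
    """
    numerator = 0.0
    denominator = 0.0
    for xi, yi, zi in samples:
        d = math.hypot(x - xi, y - yi)
        if d == 0.0:
            return zi  # location coincides with a sample point
        w = 1.0 / d ** power
        numerator += w * zi
        denominator += w
    return numerator / denominator

# Hypothetical sample points (x, y, z)
samples = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0), (0.0, 2.0, 30.0)]

estimate = idw(1.0, 0.0, samples)
print(round(estimate, 3))
```

The `power` parameter generalizes the exponent in the weight formula; larger values make the estimate more strongly dominated by the nearest samples.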

Issues with IDW
• The range of interpolated values cannot exceed the range of observed values: every estimated value is a weighted average of the known values, so no estimate is ever going to be outside the range of the known values
• This is a problem if the sampled points do not include the minima and maxima of the field (e.g. the low and high points of the terrain in an elevation example)
• It is thus very important to position sample points to include the extremes of the field
• This can be difficult to accomplish when sampling
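
The range limit follows from the weighted-average form of the estimator. A short derivation, using the IDW symbols z(x), z_i, w_i, and d_i (the normalized weights \lambda_i are introduced here only for the argument):

```latex
\lambda_i = \frac{w_i}{\sum_j w_j},
\qquad w_i = \frac{1}{d_i^2} > 0
\;\Rightarrow\; \lambda_i > 0,
\quad \sum_i \lambda_i = 1

z(x) = \frac{\sum_i w_i z_i}{\sum_i w_i} = \sum_i \lambda_i z_i
\;\Rightarrow\;
\min_i z_i \;\le\; z(x) \;\le\; \max_i z_i
```

Because the normalized weights are positive and sum to one, the estimate is a convex combination of the known values and cannot fall outside their range.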

Issues with IDW
• This set of six data points clearly suggests a hill profile (dashed line). But in areas where there is little or no data, the interpolator will move towards the overall mean (solid line)
• There are other interpolation methods that can do better in this situation …
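
This behavior can be reproduced with a one-dimensional profile. The sketch assumes six hypothetical elevation samples on the two flanks of a hill whose summit was not sampled (these are not the values from the figure) and uses weights of 1/d².

```python
def idw_1d(x, samples, power=2):
    """IDW estimate at position x from (position, value) samples on a profile."""
    numerator = 0.0
    denominator = 0.0
    for xi, zi in samples:
        d = abs(x - xi)
        if d == 0:
            return zi  # exactly at a sample point
        w = 1.0 / d ** power
        numerator += w * zi
        denominator += w
    return numerator / denominator

# Hypothetical samples on both flanks; the summit near x = 4 was not measured
profile = [(0, 1.0), (1, 2.0), (2, 3.0), (6, 3.0), (7, 2.0), (8, 1.0)]
overall_mean = sum(z for _, z in profile) / len(profile)

in_gap = idw_1d(4, profile)        # where the hill would be expected to peak
far_away = idw_1d(1000, profile)   # far from every sample

print(round(in_gap, 3), round(far_away, 3), overall_mean)
```

In the unsampled gap the estimate dips below the highest measured value instead of rising to a summit, and far from all samples it is close to the overall mean.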

Kriging
• Kriging is a technique of spatial interpolation that is firmly grounded in geostatistical theory
• It extends the concept of distance-decay (as used in inverse distance weighting interpolation) by analyzing how the value of a variable changes with respect to distance, using a scatter plot called a semivariogram
• Again, we begin with sampled values from a field at known points
• Each possible pair of points is compared, producing:
  – A distance between the two points
  – A measure that describes the difference in values at the two points, called semivariance
• These values are then plotted in the semivariogram

•A semivariogram: Each cross represents a pair of points. The solid circles are obtained by averaging within the ranges or bins of the distance axis. The solid line represents the best fit to these five points, using one of a small number of standard mathematical functions.
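
The construction of the semivariogram points can be sketched as follows. The slides do not give a formula for semivariance, so this example assumes the common definition of half the squared difference between the two values; the sample points and the bin width are hypothetical. Fitting a standard function to the binned averages is not attempted here.

```python
import math
from itertools import combinations

def semivariogram_cloud(samples):
    """Return (distance, semivariance) for every pair of sample points.

    Semivariance of a pair is assumed to be 0.5 * (z1 - z2) ** 2.
    """
    cloud = []
    for (x1, y1, z1), (x2, y2, z2) in combinations(samples, 2):
        distance = math.hypot(x1 - x2, y1 - y2)
        cloud.append((distance, 0.5 * (z1 - z2) ** 2))
    return cloud

def bin_cloud(cloud, bin_width):
    """Average the semivariance of the pairs falling in each distance bin."""
    sums = {}
    for distance, gamma in cloud:
        b = int(distance // bin_width)
        total, count = sums.get(b, (0.0, 0))
        sums[b] = (total + gamma, count + 1)
    return {b: total / count for b, (total, count) in sorted(sums.items())}

# Hypothetical samples (x, y, z) along a line
samples = [(0, 0, 1.0), (1, 0, 2.0), (2, 0, 4.0), (3, 0, 7.0)]

cloud = semivariogram_cloud(samples)        # the crosses
binned = bin_cloud(cloud, bin_width=1.0)    # the solid circles

print(len(cloud), "pairs")
print(binned)
```

In this toy data set the binned semivariance rises with distance, the pattern a semivariogram is meant to reveal.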

Kriging
• The semivariogram reflects Tobler’s Law:
  – differences in value between nearby points are likely to be small
  – differences rise with distance, and how they rise determines the shape of the curve
• Interpolation using kriging:
  – Analyze observed data to estimate a semivariogram
  – Estimate values at unknown points as weighted averages (as we did in inverse distance weighting), obtaining weights based on the semivariogram
  – The interpolated surface replicates statistical properties of the semivariogram