Effective Geographic Sample Size in the Presence of Spatial Autocorrelation

Author: Aubrie Anderson
Daniel A. Griffith
Ashbel Smith Professor, School of Social Sciences, University of Texas at Dallas

As spatial autocorrelation latent in georeferenced data increases, the amount of duplicate information contained in these data also increases. This property suggests the research question asking what the number of independent observations, say n*, is that is equivalent to the sample size, n, of a data set. This is the notion of effective sample size. Intuitively speaking, when zero spatial autocorrelation prevails, n* = n; when perfect positive spatial autocorrelation prevails in a univariate regional mean problem, n* = 1. Equations are presented for estimating n* based on the sampling distribution of a sample mean or sample correlation coefficient with the goal of obtaining some predetermined level of precision, using the following spatial statistical model specifications: (1) simultaneous autoregressive, (2) geostatistical semivariogram, and (3) spatial filter. These equations are evaluated with simulation experiments and are illustrated with selected empirical examples found in the literature. Key Words: geographic sample, geostatistics, redundant information, spatial autocorrelation, spatial autoregression.

Sampling for data-gathering purposes must address questions asking how and what to sample (Levy and Lemeshow 1991), and it is the foundation of much empirical social science research, whether quantitative or qualitative methodologies are employed. One distinction between these two methodological approaches is that quantitative research frequently requires relatively large sample sizes to collect somewhat superficial, albeit important, attribute information that is generalizable to a population, while qualitative research often is restricted to relatively small sample sizes in order to collect large quantities of in-depth, detailed information from subjects or case studies. Quantitative analysis generalization is achieved through a sound random-sampling design (i.e., how to sample); qualitative analysis generalization, if desired, may be achieved through such techniques as triangulation. Considerable effort has been devoted to geographic sampling designs for quantitative investigations (e.g., Stehman and Overton 1996), which translate the what into where to sample and exploit random sampling error. Impacts of spatial autocorrelation in this context are partially understood and are the topic of this article. One of an array of purposive sampling strategies (i.e., how to sample) can be employed in qualitative research (see Marshall and Rossman 1999, 78). The goal often is representativeness and adherence to selected theoretical considerations, as well as convenience. Impacts of spatial autocorrelation in this latter context are almost completely unknown, although a spatial researcher should realize that it still will come into play. For example, a snowball sampling strategy will be impacted by spatial autocorrelation if subjects are from nearby locations and because of the way the social network is generated. And an extreme-cases sample strategy could be impacted by the existence of geographically nonrandom "hot spots" or "cold spots," which arise because of spatial autocorrelation. Findings reported in this article for quantitative methodologies offer at least some speculative insights into qualitative sample sizes, too.

Important Sample Properties

Sample size determination, often in terms of statistical power calculations, frequently is a valuable step in planning a sample-based, quantitative study. Most introductory statistics textbooks discuss hypothesis testing in the context of appropriate sample size determination, with or without statistical power specification. The popularity and cumbersomeness of these calculations have resulted in web-based interactive calculators to execute the necessary computations for researchers (e.g., http://calculators.stat.ucla.edu/powercalc/). For the case of independent observations, Flores, Martinez, and Ferrer (2003) furnish some insights into sample-size determination for arithmetic means of georeferenced data, but for systematic sampling designs rather than the tessellated random sampling design promoted in this article. As this literature illustrates, calculating an appropriate sample size unavoidably involves mathematical notation, which accordingly appears in the ensuing discussion.
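The kind of calculation such textbooks and calculators perform can be sketched as follows. This is a minimal illustration, not taken from the article: it assumes a two-sided z-test for a single mean with independent observations and a known standard deviation, and the function name `sample_size_for_mean` and its parameters are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(sigma, delta, alpha=0.05, power=0.80):
    """Independent observations needed to detect a mean difference `delta`.

    Assumes a two-sided z-test with known standard deviation `sigma`
    (a textbook simplification, not the article's spatial setting).
    """
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # critical value for the significance level
    z_beta = NormalDist().inv_cdf(power)               # quantile matching the requested power
    return math.ceil(((z_alpha + z_beta) * sigma / delta) ** 2)


# Detect a shift of half a standard deviation with 80% power at the 5% level.
required_n = sample_size_for_mean(sigma=1.0, delta=0.5)
```

The result counts independent observations, so in the terms of this article it is best read as a target for n* rather than for n when the data are spatially autocorrelated.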

Annals of the Association of American Geographers, 95(4), 2005, pp. 740-760. © 2005 by Association of American Geographers. Initial submission, April 2004; revised submission, December 2004; final acceptance, February 2005. Published by Blackwell Publishing, 350 Main Street, Malden, MA 02148, and 9600 Garsington Road, Oxford OX4 2DQ, U.K.

Statistical power (Tietjen 1986, 38) is the probability that a test will reject a false null hypothesis (i.e., the complement of a Type II error). It frequently is denoted by 1 − β, where β is the probability of failing to reject the null hypothesis when the alternative hypothesis is true (i.e., a Type II error). The higher the power, the greater the chance of obtaining a statistically significant result when a null hypothesis is false. The power of all statistical tests is dependent on the following design parameters: significance level selected for a statistical test; sample size; the tolerable magnitude of difference between a sample statistic and its corresponding population parameter; and natural variability for the phenomenon under study.

Spatial autocorrelation, which may arise from common variables associated with locations or from direct interaction between locations (see Griffith 1992), has an impact on significance levels, detectable differences in attribute measures for a population, and measures of attribute variability (see, e.g., Arbia, Griffith, and Haining 1998, 1999). These impacts motivated Clifford, Richardson, and Hémon (1989) to apply the phrase "effective degrees of freedom"¹ to analyses in which these spatial autocorrelation effects are adjusted for in the case of correlation coefficients. The phrase refers to the equivalent number of degrees of freedom for spatially unautocorrelated (i.e., independent) observations, exploiting redundant or duplicated information contained in georeferenced data due to the relative locations of observations (i.e., spatial autocorrelation). The duplicate information in question may arise from geographic trends induced by common variables or from information sharing resulting from spatial interaction (e.g., geographic diffusion). This article highlights the nearly equivalent notion of effective sample size²: the number of independent observations, say n*, that is equivalent to a spatially autocorrelated data set's sample size, n.

Intuitively speaking, when zero spatial autocorrelation prevails and a regional mean is being estimated, n* = n; when perfect positive spatial autocorrelation prevails, n* = 1. The importance of correcting n to n* may be illustrated by analysis of remotely sensed data for the High Peak district of England, for which n = 900 pixels containing markedly high positive spatial autocorrelation is equivalent to n* ≈ 5 independent pixels (see the ensuing discussion for details). As an aside, Getis and Ord (2000) furnish a similar type of analysis for the multiple testing of local indices of spatial autocorrelation, which themselves are highly spatially autocorrelated by construction. Of note is that establishing effective sample size unavoidably requires mathematical derivations; basic ones are outlined in the body of this article in order to establish the soundness of results. The validity of reported findings is further bolstered with simulation experiment results.

Important Considerations When Designing a Sampling Network

Random sampling in a geographic landscape requires considerations much like those used when designing a conventional stratified random sample. Geographic representativeness needs to be cast in terms of spatial coverage rather than in terms of, say, stratification to achieve good socioeconomic/demographic coverage. Designing a geographic sampling network also needs to protect against sample locations being correlated with the geographic distribution to be studied; this specific concern is why purely systematic sampling often is not used. Geographic sampling networks enable regional means to be estimated, either for predefined sets of aggregate areal units (choropleth maps) or as interpolations of continuous surfaces (contour maps). Geographic sampling networks, designed for efficient estimation of parameters describing the geographic distribution of interest, need to guard against grossly inefficient spatial prediction (Martin 2001; Miller 2001; Diggle and Lophaven 2004) and vice versa. This trade-off results in a compromise between a systematic sample, comprising regularly spaced sampling locations in order to achieve good geographic coverage and hence good interpolation accuracy, and irregularly spaced sampling locations in order to achieve better estimation of parameters for the geographic distribution of interest.

A sampling network can be devised in various ways to satisfy the condition of containing both regularly and irregularly spaced sampling locations. One way is to position n/2 of the locations systematically (e.g., on a regular square tessellation) and the remaining n/2 locations in a random fashion (i.e., randomly select east-west and, independently, north-south coordinates). This is the type of design associated with the GEOEAS data example (Englund and Sparks 1991; see http://www.websl1.uidaho.edu/geoe428/data_files.htm). A second method proposed by Diggle and Lophaven (2004) involves positioning some sampling locations on a regular square tessellation grid and the remaining locations on more finely spaced regular square tessellation grids within a randomly selected subset of cells demarcating the coarser grid: the lattice plus in-fill design. Diggle and Lophaven also propose a third design, which they prefer, that involves positioning some sampling locations on a regular square tessellation grid, with the remaining locations being randomly selected from constant radius circular buffer zones circumscribing a random subset of the systematically positioned locations: the lattice plus close pairs design. Unfortunately, the software currently available to support their designs "would encounter serious difficulties ... with numbers of locations larger than a few hundred" (Diggle and Lophaven 2004, 8).

Yet a fourth design is the one employed for this article, which is based on hexagonal-tessellation, stratified, random sampling (Stehman and Overton 1996). A regular hexagonal tessellation containing n cells is superimposed on a region: the systematic component. Then a single location is randomly selected from within each hexagon: the random component. This design shares many similarities with the lattice plus close pairs design. Of note is that these network layout issues are central to debates about geostatistical sampling designs. Cressie (1991, §5.6) furnishes a useful overview of numerous spatial sampling designs.

Mixing regularly and irregularly spaced sampling locations highlights another important feature of spatial analysis, namely design-based and model-based inference. The preceding sampling designs support design-based inference, which assumes that a given location has a unique fixed but unknown value for the geographic distribution of interest. The reference sampling distribution is constructed, conceptually, by repeatedly sampling from a geographic landscape and using the same design and calculating parameter estimates with each sample. Initially, spatial scientists believed that this strategy could not be legitimately used when data contain non-zero spatial autocorrelation (Brus and de Gruijter 1993). An alternative strategy is to let the value for some geographic distribution at a given location vary. In other words, the joint distribution of data values forming a map is one of an infinite number of possible realizations of some stochastic process; the total set of possible maps is called a superpopulation. Hence, the essential tool for describing a map is a model, resulting in this inferential basis being labeled model-based. A severe shortcoming of this latter approach is the difficulty in knowing whether or not model assumptions are valid, necessitating diagnostic analyses. But it furnishes an indispensable analytical tool for understanding nonrandomly sampled data such as remotely sensed data and for enabling spatial autocorrelation to be accounted for when devising a sampling design: the model-informed, design-based perspective outlined in this article.
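The hexagonal-tessellation, stratified, random design described above can be sketched in a few lines. This is an illustrative sketch only, under stated assumptions: the hexagons are pointy-top and tile a rectangular block rather than an arbitrary region, locations are drawn by rejection sampling, and the function names (`hex_centers`, `in_hexagon`, `sample_one_per_hexagon`) are hypothetical rather than taken from Stehman and Overton (1996).

```python
import math
import random


def hex_centers(n_cols, n_rows, r):
    """Systematic component: centers of pointy-top hexagons with circumradius r."""
    dx = math.sqrt(3.0) * r   # spacing between centers within a row
    dy = 1.5 * r              # spacing between successive rows
    centers = []
    for row in range(n_rows):
        shift = dx / 2.0 if row % 2 else 0.0   # odd rows are offset by half a cell
        for col in range(n_cols):
            centers.append((col * dx + shift, row * dy))
    return centers


def in_hexagon(px, py, r):
    """Test whether an offset (px, py) from a hexagon center lies inside that hexagon."""
    ax, ay = abs(px), abs(py)
    return ax <= math.sqrt(3.0) / 2.0 * r and ay <= r - ax / math.sqrt(3.0)


def sample_one_per_hexagon(centers, r, rng):
    """Random component: draw one uniformly distributed location inside each hexagon."""
    half_width = math.sqrt(3.0) / 2.0 * r
    points = []
    for cx, cy in centers:
        while True:  # rejection sampling from the hexagon's bounding box
            px = rng.uniform(-half_width, half_width)
            py = rng.uniform(-r, r)
            if in_hexagon(px, py, r):
                points.append((cx + px, cy + py))
                break
    return points


rng = random.Random(42)
centers = hex_centers(n_cols=5, n_rows=4, r=1.0)
sample = sample_one_per_hexagon(centers, r=1.0, rng=rng)
```

Each hexagon acts as one stratum, so the sample keeps the even coverage of a systematic design while the within-cell randomization guards against sample locations aligning with the geographic distribution being studied.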

A Conceptual Framework: The Effective Size of a Geographic Sample

A basis for establishing effective sample size for normally distributed georeferenced data is presented in terms of the sampling distribution of a single sample mean; extensions exploiting multiple sample means or the sample correlation coefficient are presented in Appendices A, B, and C. This approach, for which the assumption of a bell-shaped curve is critical, is directly analogous to that reported for time series (e.g., see the R Documentation) and is indirectly analogous to what is reported for survey weights with superpopulation models (Potthoff, Woodbury, and Manton 1992), whereby applying weights to sample results alters the value of n.

Measuring natural variability for some georeferenced phenomenon results in an inflated variance estimate when spatial autocorrelation is overlooked (see Haining 2003, §8.1). Suppose the n × n matrix $\mathbf{V}$ contains the covariation structure among n georeferenced observations (more precisely, matrix $\sigma^2\mathbf{V}^{-1}$ is the covariance matrix), such that $\mathbf{Y} = \mu\mathbf{1} + \mathbf{e} = \mu\mathbf{1} + \mathbf{V}^{-1/2}\mathbf{e}^*$, where $\mathbf{Y}$ denotes an n × 1 vector of georeferenced attribute values, $\mu$ denotes the population mean of variable Y, $\mathbf{1}$ denotes an n × 1 vector of ones, and $\mathbf{e}$ and $\mathbf{e}^*$, respectively, denote n × 1 vectors of spatially autocorrelated and unautocorrelated errors. Suppose $\mathbf{e}^*$ is independent and identically distributed $N(0, \sigma^2_{\varepsilon^*})$, where N denotes the normal distribution, and $\sigma^2_{\varepsilon^*}$ denotes the population variance for variate $\mathbf{e}^*$. If $\mathbf{V} = \mathbf{I}$, the n × n identity matrix, then the n observations are uncorrelated. Using matrix notation, the population variance estimate based upon a sample, and ignoring spatial autocorrelation, is given by

$$E(\hat{\sigma}^2) = E\left[(\mathbf{Y} - \mu\mathbf{1})^T(\mathbf{Y} - \mu\mathbf{1})/n\right] = \frac{\mathrm{TR}(\mathbf{V}^{-1})}{n}\,\sigma^2_{\varepsilon^*}, \tag{1A}$$

where $\hat{\sigma}^2$ denotes the estimate of $\sigma^2$, the variance of attribute variable Y, E denotes the calculus of expectations operator, and T and TR respectively denote the matrix transpose and trace operators. The quantity $\mathrm{TR}(\mathbf{V}^{-1})/n$ is a variance inflation factor (VIF), similar to the VIF generated by multicollinearity in conventional multiple linear regression analysis; it expresses the degree to which collinearity among georeferenced observations degrades the precision of Y relative to similarly dispersed spatially uncorrelated values.

Popular versions of matrix $\mathbf{V}$ include, for spatial autoregressive parameter $\rho$ and binary geographic weights matrix $\mathbf{C}$: $(\mathbf{I} - \rho\mathbf{C})$ for the conditional autoregressive (CAR) model; and $[(\mathbf{I} - \rho\mathbf{W})^T(\mathbf{I} - \rho\mathbf{W})]$ for the autoregressive response (AR) and simultaneous autoregressive (SAR) models, where matrix $\mathbf{W}$ is the row-standardized version of matrix $\mathbf{C}$.³ Cliff and Ord (1981, Ch. 7), Anselin (1988), Griffith (1988), Haining (1990), and Cressie (1991, Ch. 1), among others, furnish additional details about these models.

Again using matrix notation, the variance of the sample mean of variable Y, $\bar{y}$, ignoring spatial autocorrelation, is given by

$$E(\hat{\sigma}^2_{\bar{y}}) = \frac{\mathbf{1}^T\mathbf{V}^{-1}\mathbf{1}}{n^2}\,\sigma^2_{\varepsilon^*}. \tag{1B}$$

Rearranging the terms on the right-hand side of this equation and making the necessary algebraic manipulations yields

$$E(\hat{\sigma}^2_{\bar{y}}) = \frac{\dfrac{\mathrm{TR}(\mathbf{V}^{-1})}{n}\,\sigma^2_{\varepsilon^*}}{\dfrac{\mathrm{TR}(\mathbf{V}^{-1})}{\mathbf{1}^T\mathbf{V}^{-1}\mathbf{1}}\,n}. \tag{1C}$$

The denominator on the right-hand side of this equation furnishes the formula for effective sample size, namely,

$$n^* = \frac{\mathrm{TR}(\mathbf{V}^{-1})}{\mathbf{1}^T\mathbf{V}^{-1}\mathbf{1}}\,n. \tag{2}$$

If the n observations are independent, and hence $\mathbf{V} = \mathbf{I}$, then n* = n, and the VIF becomes $\mathrm{TR}(\mathbf{V}^{-1})/n = 1$. If perfect positive spatial autocorrelation prevails, then, conceptually, $\mathbf{V}^{-1} = k\mathbf{1}\mathbf{1}^T$, with $k \to \infty$ as positive spatial autocorrelation increases, and n* = 1.

In addition to the mathematical statistical theory derivation of Equation (2), its validity can be assessed through simulation experimentation. A simple exploratory experiment (100 replications) was conducted for selected cases in which variable Y was distributed N(0, 1), spatial autocorrelation was embedded with an SAR model, and n, the level of positive spatial autocorrelation $\rho$, and geographic connectivity were varied. Figure 1A portrays a scatterplot of the simulated standard error versus a standard error computed with the VIF and Equation (2). The goodness-of-fit of the regression line appearing in this graph highlights the soundness of Equation (2), with a noticeable deviation being attributable only to simulation variability.
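A small numerical sketch can make the VIF, Equation (2), and the simulation check concrete. It is not the article's experiment: it assumes a regular square lattice with rook adjacency, unit error variance, and NumPy; the helper names (`rook_weights`, `effective_sample_size`, `simulated_se_of_mean`) are hypothetical, and more replications are used than the 100 reported above so that the comparison is less noisy.

```python
import numpy as np


def rook_weights(rows, cols):
    """Row-standardized rook-adjacency matrix W for a rows x cols lattice."""
    n = rows * cols
    C = np.zeros((n, n))
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if r + 1 < rows:                       # neighbor in the next row
                C[i, i + cols] = C[i + cols, i] = 1.0
            if c + 1 < cols:                       # neighbor in the next column
                C[i, i + 1] = C[i + 1, i] = 1.0
    return C / C.sum(axis=1, keepdims=True)


def effective_sample_size(W, rho):
    """VIF = TR(V^-1)/n and n* from Equation (2), with V = (I - rho W)^T (I - rho W)."""
    n = W.shape[0]
    A = np.eye(n) - rho * W
    V_inv = np.linalg.inv(A.T @ A)
    ones = np.ones(n)
    vif = np.trace(V_inv) / n
    n_star = np.trace(V_inv) * n / (ones @ V_inv @ ones)
    return vif, n_star


def simulated_se_of_mean(W, rho, reps, rng):
    """Standard deviation of the sample mean across simulated SAR maps."""
    n = W.shape[0]
    A_inv = np.linalg.inv(np.eye(n) - rho * W)
    e_star = rng.standard_normal((reps, n))        # iid N(0, 1) errors, one map per row
    Y = e_star @ A_inv.T                           # Y = (I - rho W)^-1 e* for each map
    return Y.mean(axis=1).std(ddof=1)


W = rook_weights(6, 6)
vif, n_star = effective_sample_size(W, rho=0.8)
se_formula = np.sqrt(vif / n_star)                 # sqrt(VIF * sigma^2 / n*) with sigma^2 = 1
se_simulated = simulated_se_of_mean(W, 0.8, reps=4000, rng=np.random.default_rng(0))
```

With `rho = 0` the function returns a VIF of 1 and n* = n, matching the independence case; for positive `rho` the VIF exceeds 1 and n* falls between 1 and n, and the simulated and formula-based standard errors should agree up to simulation variability.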

Mean-Based Results for a Spatial Autoregression Model Specification: The SAR Model

Findings based on a SAR model are illuminating here because a SAR model can be rewritten as a CAR model (see Cliff and Ord 1981), whereas semivariogram models can be directly related to it (see Griffith and Layne 1997).

The Case of a Single Geographic Mean

Griffith (2003) reports findings for Equation (2) and its extension to expression (A1) in Appendix A, including the following conjecture, which is a slight improvement on the result reported by Griffith and Zhang (1999):

If georeferenced attribute variable Y is normally distributed, or approximately so, and $\hat{\rho}$ is the spatial autocorrelation parameter estimate for a SAR model specification, then the effective sample size is given by

$$\hat{n}^* = n\left[1 - \frac{1}{1 - e^{-1.92369}}\,\frac{n-1}{n}\left(1 - e^{-2.12373\hat{\rho} + 0.20024\sqrt{\hat{\rho}}}\right)\right]. \tag{3}$$

Of note, again, is that the normality assumption is critical here. In addition to a nonlinear regression analysis of empirical cases used to calibrate Equation (3), its validity can be assessed through simulation experimentation. The experiment used to validate Equation (2) also was used to validate Equation (3). Figure 1B portrays a scatterplot of n* computed with Equation (2) versus $\hat{n}^*$ computed with Equation (3). The goodness-of-fit of the regression line appearing in this graph highlights the soundness of Equation (3). Of note is that altering the geographic connectivity definition results in slight but perceptible variation about values calculated with Equation (3). Cressie (1991, 15) reports a comparable effective sample size formula, which also was used to predict n* for the simulation experiment (see Figure 1B). Equation (3) outperforms Cressie's equation, yielding a mean squared error (MSE) of 55 that is substantially less than the MSE of 334 for his equation.

Figure 1. (A): a scatterplot of the simulated standard error (100 replications) versus a standard error computed with the variance inflation factor (VIF) and Equation (2), denoted by solid circles (●); the superimposed solid straight gray line denotes predicted values produced by the estimated regression equation. (B): a scatterplot of n* computed with Equation (2) versus $\hat{n}^*$ computed with approximation Equation (3), denoted by asterisks (*), and with Cressie's (1991, 15) equation, denoted by open circles (o); the superimposed solid straight gray line denotes predicted values produced by the estimated regression equation based upon Equation (3) results, and the broken straight gray line denotes predicted values produced by the estimated regression equation based upon Cressie's equation's results.
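Equation (3) is simple enough to evaluate directly. The following is a minimal sketch, assuming the coefficients as read from Equation (3) (1.92369, 2.12373, and 0.20024) and a SAR parameter estimate between 0 and 1; the function name `n_star_sar` is hypothetical.

```python
import math


def n_star_sar(n, rho_hat):
    """Approximate effective sample size for a single mean, following Equation (3).

    n       : number of georeferenced observations
    rho_hat : SAR spatial autocorrelation parameter estimate, assumed in [0, 1]
    """
    scale = 1.0 / (1.0 - math.exp(-1.92369))
    decay = 1.0 - math.exp(-2.12373 * rho_hat + 0.20024 * math.sqrt(rho_hat))
    return n * (1.0 - scale * (n - 1) / n * decay)


# Murray smelter site arsenic entries from Table 1A: n = 253, estimate 0.53180.
n_star_arsenic = n_star_sar(253, 0.53180)
```

For the Table 1A arsenic entries the function returns approximately 68.3, close to the tabulated n* of 68.24. It returns n exactly when the estimate is zero and a value just above 1 when the estimate equals 1, which matches the two limiting cases described for Equation (2).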

The Case of Two Geographic Means

Following the same logic that establishes Equation (2), Griffith also reports, for the bivariate weighted mean case (see additional details in Appendix A) specified in terms of a SAR model, the conventional variance term in expression (A1), namely

$$\frac{w^2\sigma_X^2 + (1-w)^2\sigma_Y^2 + 2w(1-w)\rho_{XY}\sigma_X\sigma_Y}{n}, \tag{4A}$$

where $\sigma_X^2$ and $\sigma_Y^2$ respectively denote the variance of variables X and Y, $\rho_{XY}$ denotes the correlation between attribute variables X and Y, w (0 < w < 1) is the weight applied to the mean of variable X, and the term $2w(1-w)\rho_{XY}\sigma_X\sigma_Y$ adjusts for the presence of redundant attribute information in a bivariate georeferenced dataset. This expression is multiplied by the VIF appearing in expression (A1), namely,

$$\frac{w^2\hat{\sigma}_X^2\,\mathrm{TR}\{[(\mathbf{I}-\hat{\rho}_X\mathbf{W})^T(\mathbf{I}-\hat{\rho}_X\mathbf{W})]^{-1}\} + (1-w)^2\hat{\sigma}_Y^2\,\mathrm{TR}\{[(\mathbf{I}-\hat{\rho}_Y\mathbf{W})^T(\mathbf{I}-\hat{\rho}_Y\mathbf{W})]^{-1}\}}{n}, \tag{4B}$$

where $\hat{\rho}_X$ and $\hat{\rho}_Y$, respectively, are the spatial autocorrelation parameter estimates for variables X and Y. The resulting product then is divided by the term denoting sampling variability of means in the presence of non-zero spatial autocorrelation, also appearing in expression (A1), namely,

$$\frac{w^2\hat{\sigma}_X^2\,\mathbf{1}^T[(\mathbf{I}-\hat{\rho}_X\mathbf{W})^T(\mathbf{I}-\hat{\rho}_X\mathbf{W})]^{-1}\mathbf{1} + (1-w)^2\hat{\sigma}_Y^2\,\mathbf{1}^T[(\mathbf{I}-\hat{\rho}_Y\mathbf{W})^T(\mathbf{I}-\hat{\rho}_Y\mathbf{W})]^{-1}\mathbf{1} + 2w(1-w)\rho_{XY}\hat{\sigma}_X\hat{\sigma}_Y\,\mathbf{1}^T[(\mathbf{I}-\hat{\rho}_X\mathbf{W})^T(\mathbf{I}-\hat{\rho}_Y\mathbf{W})]^{-1}\mathbf{1}}{n^2}. \tag{4C}$$

A closer inspection of this conventional variance expression reveals that it contains the individual, weighted, standard error terms $w^2\sigma_X^2/n$ and $(1-w)^2\sigma_Y^2/n$. A closer inspection of this VIF term reveals that it contains the weighted, individual VIF terms $w^2\hat{\sigma}_X^2\,\mathrm{TR}\{[(\mathbf{I}-\hat{\rho}_X\mathbf{W})^T(\mathbf{I}-\hat{\rho}_X\mathbf{W})]^{-1}\}/n$ and $(1-w)^2\hat{\sigma}_Y^2\,\mathrm{TR}\{[(\mathbf{I}-\hat{\rho}_Y\mathbf{W})^T(\mathbf{I}-\hat{\rho}_Y\mathbf{W})]^{-1}\}/n$, which can be seen in Equation (2). One surprise here is that this VIF expression does not include the cross-product term involving $[(\mathbf{I}-\hat{\rho}_X\mathbf{W})^T(\mathbf{I}-\hat{\rho}_Y\mathbf{W})]^{-1}$. And a closer inspection of this means variability term reveals that it contains the prorating factor for each individual variable n*, as well as a cross-products term.

A simulation experiment based on n = 625, $\sigma_X = \sigma_Y = 1$, and 100 replications was conducted to establish the validity of expression (A1) across the range of w and $\rho_{XY}$ values. Figure 2 portrays a scatterplot of the simulated standard error versus a standard error computed with expression (A1). The goodness-of-fit of the regression line appearing in this graph highlights the soundness of expression (A1), with noticeable deviations being attributable only to simulation variability.

Figure 2. A scatterplot of the simulated standard error versus a standard error computed with expression (A1), denoted by solid circles (●); the superimposed solid straight gray line denotes predicted values produced by the estimated regression equation.

Graphs for two illustrative empirical case studies that portray the curve described by expression (A1) when $\sigma_X \neq \sigma_Y \neq 1$ appear in Figure 3. Relevant statistics for these two examples used to construct these graphs appear in Table 1A. The curves portrayed in Figure 3 may be approximated with the following equations, which are equivalent to but simpler in form than the one reported in Griffith (2003, 85) and demonstrate that the joint n* value essentially is a weighted function of the two effective sample sizes that can be computed separately for the individual means:

[Equation (5A), Puerto Rico: the expression is garbled in the source; the surviving elements are the weights w and (1 − w) raised to the powers 1.5899 and 1.5968, and pseudo-R² = 0.9998.]

and

[Equation (5B), Murray smelter site: the expression is garbled in the source; the surviving elements are the individual As and Pb effective sample sizes, the weights w and (1 − w) raised to the powers 2.0171 and 1.3475, and pseudo-R² = 0.9994.]

where pseudo-R² is the squared multiple correlation coefficient (R²) between n* and $\hat{n}^*$. Because of the role played by variance terms, which are specific to cases, only the general form of the equation can be established at this time. This empirical analysis further corroborates the validity of expression (A1). As an aside, the extension of this two-means result to a multiple-means result is summarized in Appendix B.

Figure 3. Plots of the bivariate curve for the illustrative examples; the approximation curve is denoted by the dotted line (...), and the scatterplot of the exact n* values is denoted by solid circles (●). (A): the Puerto Rico digital elevation model (DEM). (B): the Murray smelter site. [Horizontal axis in both panels: w, from 0.0 to 1.0; curve endpoints labeled mean elevation and elevation standard deviation in (A), and As and Pb in (B).]

The Case of a Pearson Product-Moment Correlation Coefficient

Spatial scientists are often interested in measuring relationships between, rather than means of, geographically distributed variables. Details for computing effective sample size in this context, again assuming normally distributed variables, appear in Appendix C. The following approximate relationship holds between the individual mean-based univariate effective sample sizes and their corresponding bivariate correlation-based effective sample size:

[Equation (6): the expression for $\hat{n}^*_{XY}$ is garbled in the source; the surviving elements include a term in $(\hat{\rho}_X + \hat{\rho}_Y)$ raised to the power 1.161 and a coefficient of 0.04.]

This approximation results in: $\hat{n}^*_{XY} = n$ when spatial autocorrelation is absent; $\hat{n}^*_{XY}$ asymptotically converging on 2 from below when $\rho_X = \rho_Y \to 1$ and $\rho_{XY} = 1$; and $\hat{n}^*_{XY}$ asymptotically converging on roughly 5 when $\rho_X = \rho_Y \to 1$ and $\rho_{XY} = 0$, highlighting that it is an approximation; theoretically, desired values here are 1 and 2. Results for five Thiessen polygon-based empirical examples appear in Table 1B; these results demonstrate that Equation (6) furnishes a good approximation for Equation (C1).

These findings highlight the implication that impacts of spatial autocorrelation can be mitigated to some extent by incorporating redundant georeferenced attribute information into an analysis, a natural form of which arises in space-time data series. Lahiri (1996, 2003) notes that this is one way of regaining estimator consistency when employing infill asymptotics (i.e., the sample size increases by keeping the study area size constant and increasing the sampling intensity).

Table 1A. Selected descriptive statistics for the Puerto Rico digital elevation model (DEM) and Murray smelter site soil contaminants

Landscape                      Variable                                 Standard deviation   ρ̂         Bivariate correlation   n*
Puerto Rico (n = 73)           DEM mean elevation (e)                   0.80270              0.83139   0.68102                 6.24
                               DEM elevation standard deviation (se)    3.54528              0.67638                           12.69
Murray smelter site (n = 253)  Arsenic (As)                             3.46316              0.53180   0.74775                 68.24
                               Lead (Pb)                                7.69417              0.49363                           76.95

NOTE: Griffith (2003) provides descriptions of these two datasets. The Murray, Utah, landscape is a superfund site. DEM denotes digital elevation model.

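[Equation fragment, garbled and truncated in the source: an effective sample size expression labeled $n^*_{\text{exponential}}$, involving double sums over i, j = 1, ..., n of terms in the distances $d_{ij}$ and a parameter r.]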
