Spatial Analysis I Lex Comber University of Leicester
Aims • Basis of Spatial Analysis in Health Sciences • GIS Operations – Generating area counts – Linking to other data – eg demographics
• Categories, Classification • Visualisation, including the components of maps
Spatial Analysis • What kinds of data are analysed in health sciences using GIS? – Point data: incidence of disease – Area data: some explanatory variable relating to the characteristics of a particular area – Surfaces: describing some trend
• What kinds of analyses are used to investigate this data? • This depends on the question or hypothesis
Spatial Analysis • In many cases it is to identify either – Explanations (causes, looking back) OR – Implications (consequences, looking forward)
• Example – Impacts of existing or new coal power station on downstream resident health – Association between socio-economic deprivation and public health choices (eg smoking) or impacts (eg infant mortality)
Spatial Analysis • Health data may often be at the individual level – Know something about each case, but maybe not everything – Causes / consequences information may not be available for each case
• Many different types of socio-economic variables are available for • Areas – eg census data – no. of people, age profiles, profession, income, religion, ethnicity, car ownership, travel to work, etc, • even health status!
– Such ‘demographic’ data is very useful – BUT it only tells us things about the area
• Surfaces – eg air quality, flood risk, ozone,
Spatial Analysis • We use a GIS to link the individual to the area and the characteristics associated with the area
• Diagram: Event or Process (pollution or poverty), GIS, Observed condition (public health)
Spatial Analysis - relationships between process and observed pattern - SPATIAL relationships
Spatial Analysis • There are many applications for GIS in health research: – Examine disease rates over space – Identify disease clusters – Identify variables which cause disease – Examine variation in health and uptake of health services over space (and time)
• Link population, environmental conditions and health care
Spatial Analysis • Epidemiology is concerned with the causes and prevention of disease... • Most basic function of GIS is for mapping the spatial pattern of disease • Q: Where are the incidences of a disease located and how does this vary? – Lyme Disease – Bacterium (Borrelia burgdorferi) – Vector is ticks
Spatial Analysis • Reported incidence of Lyme Disease
Spatial Analysis • Risk map, 4 categories of risk
Spatial Analysis • Service access is another important aspect of public health • GIS can be used to determine travel times to health facilities
Spatial Analysis • Summary – Many applications in public health – A GIS allows us to query: • examine what is happening where • link spatial pattern and process
– Locations (eg of disease incidence) should be combined with other data in the GIS to attempt to explain the location/ spread of the disease
GIS Operations • So GIS allows us – to link pattern and process – to identify what happens where
• Reminder: including the location or the geography – More informative answers – More ‘nuanced’ answers
• Illustrate this with a case study – The spatial distribution of EMS cases
GIS Operations • EMS case study – based on some of the practical data to illustrate the way that GIS can be used – Link observed spatial pattern to underlying process.
• In this case – to examine the relationships (and the spatial relationships) between critical Emergency (EMS) cases and population
GIS Operations • 1107 serious EMS cases • 2076 census areas
GIS Operations • Can use the GIS to generate counts of EMS cases in each census area – (you will do something similar in the practical later!)
• Mapping the proportion of EMS cases against other variables shows the spatial variation of the phenomenon
GIS Operations • We can see that there is much variation in the numbers of EMS cases in each census area • BUT • Areas with high populations may be expected to have high counts of EMS cases
GIS Operations • Normalize and perform a simple statistical analysis
– EMS / Total Population: 1,107 / 813,847, or 1.4 cases per 1000 pop
– EMS / Old Population: 1,107 / 166,593, or 6.6 cases per 1000 old people
• BUT • A global statistic may hide much local detail • GIS allows us to calculate these statistics in each census area
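The normalisation above can be sketched in a few lines of Python. The global figures use the totals quoted on the slide; the per-area records (`areas`, with ids A, B, C) are made-up values for illustration only and are not the study data.

```python
def rate_per_1000(cases, population):
    """Return cases per 1000 people, or None where there is no population."""
    if population <= 0:
        return None
    return 1000 * cases / population

# Global rates, using the totals quoted on the slide.
global_all = rate_per_1000(1107, 813847)
global_old = rate_per_1000(1107, 166593)
print(round(global_all, 1), round(global_old, 1))  # 1.4 6.6

# Hypothetical census areas (invented numbers, NOT the study data).
areas = [
    {"id": "A", "ems": 3, "pop": 400, "old_pop": 60},
    {"id": "B", "ems": 0, "pop": 250, "old_pop": 90},
    {"id": "C", "ems": 1, "pop": 0, "old_pop": 0},
]

# Local rates: the same statistic, calculated separately in each area.
local_rates = {a["id"]: rate_per_1000(a["ems"], a["pop"]) for a in areas}
local_old_rates = {a["id"]: rate_per_1000(a["ems"], a["old_pop"]) for a in areas}
```

Returning `None` for an area with no population is one possible design choice; it keeps empty areas from producing a division error or a misleading rate of zero.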
GIS Operations • Can see that the proportion of EMS cases amongst all people and amongst old people is not evenly distributed across areas • amongst the general population – 1 or 2 ‘hotspots’
• amongst the older population – Much more spatial variation
GIS Operations • Integrating health data (in this case points) with area data (in this case census data) – Allows EMS cases to be related to other variables associated with those areas – Population, demographics, socio-economic status etc – Allows areas or regions with surprising (high or low) values to be identified – Clusters
• Can be used to direct further investigation –relating to the causes or location specific features (eg environment, demographics, etc)
GIS Operations • So what kinds of processes have been used here? – If you think about the data and the outputs can you guess?
• Data: Points, Areas • ‘Did some GIS’ • Outputs: Proportions
GIS Operations • Because we have data that has an explicit location, spatial overlays are possible • Do you remember the Set Theory? • This is where it comes in – Although you are usually not aware of it!
• Intersect and Union are very common overlay procedures
• Set operations: AND, NOT, OR, XOR
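The four set operations can be tried out with Python's built-in sets. This is only an analogy: here each layer is represented by the set of grid cells it covers, and the cell ids are invented for illustration. A real GIS overlay works on geometries rather than cell ids.

```python
# Each layer is represented by the set of grid cells it covers
# (hypothetical cell ids, purely for illustration).
layer_a = {1, 2, 3, 4}
layer_b = {3, 4, 5, 6}

intersect = layer_a & layer_b   # AND: cells in both layers (Intersect)
union = layer_a | layer_b       # OR: cells in either layer (Union)
a_not_b = layer_a - layer_b     # NOT: cells in A but not in B
sym_diff = layer_a ^ layer_b    # XOR: cells in exactly one of the layers

print(sorted(intersect), sorted(union), sorted(a_not_b), sorted(sym_diff))
```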
GIS Operations • In the above example we used an Intersect operation • Counts the number of EMS points in each Census area • Stores the value in the Attribute Table of the layer that is created
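A minimal sketch of what such a count involves, assuming simple polygons and a standard ray-casting point-in-polygon test. This is not the GIS software's own implementation; the area names and coordinates are invented, and points lying exactly on a boundary are not handled specially.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon.

    polygon is a list of (x, y) vertices. Points exactly on an edge
    are not treated specially in this sketch.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

# Hypothetical census areas (two adjacent squares) and EMS points.
areas = {
    "area_1": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "area_2": [(2, 0), (4, 0), (4, 2), (2, 2)],
}
cases = [(0.5, 0.5), (1.5, 1.0), (3.0, 1.0), (5.0, 5.0)]

# The count per area plays the role of the new attribute-table column.
counts = {name: 0 for name in areas}
for x, y in cases:
    for name, polygon in areas.items():
        if point_in_polygon(x, y, polygon):
            counts[name] += 1
            break  # each point is counted in at most one area
```

The last point falls outside both areas, so it is not counted anywhere.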
GIS Operations • Summary – Space and location are crucial components of GIS operations – Combining layers and spatially interrogating one layer using another allow new information to be generated as a result of analysis – This is one of the key aspects of GIS that make it different from standard statistics
• In the practical you apply some of these techniques – Overlaying data – Generating area counts – Selecting features based on their distance to other features
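Selecting features by distance can be sketched as below, assuming point features in a projected coordinate system where straight-line distance is meaningful (for example metres). The case names, the hospital location and the 750 unit threshold are all invented for illustration.

```python
import math

def within_distance(features, target, max_dist):
    """Return names of point features lying within max_dist of target."""
    tx, ty = target
    return [
        name
        for name, (x, y) in features.items()
        if math.hypot(x - tx, y - ty) <= max_dist
    ]

# Hypothetical EMS cases and a hypothetical hospital location.
cases = {
    "case_1": (100.0, 200.0),
    "case_2": (900.0, 950.0),
    "case_3": (400.0, 600.0),
}
hospital = (0.0, 0.0)

near = within_distance(cases, hospital, 750.0)
print(near)
```

Straight-line distance is only a first approximation; travel time along a road network would need a different method.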
GIS Operations • NOTE – I have introduced a small component of a wider study – just to give you an example – This is real data from a real analysis (I have spatially anonymised the EMS data to protect patient confidentiality) – This work is reported in full in Sasaki et al (2010), which seeks to optimise EMS response times – It uses current EMS data and census data to determine the relationships between demographic variables and EMS cases – It combines this model with future projected population changes – It shows how the ‘best’ places for ambulances (to minimise response times) change between 2010, 2030 and 2050 – Allows for more informed spatial planning and health planning
Categories and Classification • Classification is a way of grouping similar features • In a GIS context it involves – Selecting what is to be included in any class • What goes into the class of “high”? • Choice of thresholds for each class or group
– The number of classes
• These aspects are critical • A map is a very powerful way of communicating a large amount of information in a very efficient way
Categories and Classification • Consider the following map – What can you tell me about it?
• Some hotspots • Generally the same trend
Categories and Classification • What about this map? • It is the same data • The difference: the number of classes
Categories and Classification • Ok so the number of classes is important • The other major thing to be aware of is HOW you decide the boundaries between classes – There are many ways of dividing up the distribution of values – Like mean vs mode vs median for averages
Categories and Classification • This is defined by ‘equal interval’ – Eg in a range of 0 to 1 – Breaks at 0.2, 0.4, 0.6, 0.8 – Class intervals of 0-0.2, 0.2-0.4, etc
Categories and Classification • Classes in this example are based on 5 quantiles – Breaks at 20th percentile, 40th percentile, 60th percentile, 80th percentile – Based on the distribution of the data values
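The two ways of setting class boundaries can be compared with a short sketch. The values below are invented and deliberately skewed; `equal_interval_breaks` and `quantile_breaks` are hypothetical helper names, and the quantile calculation relies on `statistics.quantiles` from the Python standard library. GIS packages may interpolate percentiles slightly differently.

```python
import statistics

def equal_interval_breaks(values, n_classes):
    """Split the data range into n_classes intervals of equal width."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_classes
    return [lo + width * i for i in range(1, n_classes)]

def quantile_breaks(values, n_classes):
    """Breaks chosen so each class holds roughly the same number of values."""
    return statistics.quantiles(values, n=n_classes, method="inclusive")

# Hypothetical, skewed values in the range 0 to 1 (not the study data).
values = [0.0, 0.01, 0.02, 0.03, 0.05, 0.08, 0.1, 0.2, 0.6, 1.0]

ei = equal_interval_breaks(values, 5)   # roughly 0.2, 0.4, 0.6, 0.8
q = quantile_breaks(values, 5)          # 20th, 40th, 60th, 80th percentiles

print("equal interval:", [round(b, 3) for b in ei])
print("quantile:      ", [round(b, 3) for b in q])
```

With skewed data like this, most values fall into the lowest equal-interval class, while the quantile breaks sit much lower and spread the values across the classes. The same data therefore produce very different looking maps.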
Categories and Classification • How you present spatial data for visualisation is important • I could do a whole course on these issues – If you did the MSc in GIS at my university you would!
• You need to be aware of how easy it is to make very different looking maps from the same data – This might be misleading – And often it is!
Visualisation • Finally, I want to talk very briefly about visualisation – Again this could be part of a whole course on visualisation
• But here I want to talk about what needs to be included in a map – And I am hoping that you might know!
• What do you think needs to be included in a map? – Perhaps cartographic convention
Visualisation 1) A Legend – Describes the data layers and the class ranges
• What is wrong with this one? – Has the layer data name - not very user friendly – 6 significant figures!! Is that precision important?
Visualisation 2) A scale bar to show the study area size, the extent of the problem – needs units 3) A North arrow to show the orientation of the map
Summary • This has covered a lot of ground • All of the content will be applied by you in the practical session • Key Points – GIS links ‘pattern & process’, tells us ‘what happens where’ – It can identify Explanations (causes – looking back) for events or describe their Implications (consequences – looking forward) – The analytical tools in a GIS relate to geographic measures (eg distance, area counts etc) – Visualisation is so important • ‘How to Lie with Maps’
References • Ambulances in Niigata paper – Sasaki S, Comber AJ, Suzuki H, Brunsdon C. Using genetic algorithms to optimise current and future health planning - the example of ambulance locations. Int J Health Geogr. (2010); 9: 4. doi:10.1186/1476-072X-9-4 – Comber A, Sasaki S, Suzuki H, Brunsdon C, (in press). A modified grouping genetic algorithm to select ambulance site locations. Int J Geog Inf Sci
• Reviews of use of GIS in Health Sciences – Higgs, G, The role of GIS for health utilization studies: literature review, Health Serv Outcomes Res Method (2009) 9:84–99
• Visualisation – Edward Tufte, (2001). The visual display of quantitative information. Graphics Press. – Mark Monmonier, (1996). How to Lie with Maps, University Of Chicago Press