A place in history: a guide to using GIS in historical research


CHAPTER 7: SPATIAL ANALYSES OF STATISTICAL DATA IN GIS

 


7.4 Spatial analysis in historical GIS

A variety of authors have made use of spatial analysis techniques to analyse historical data. Bartley and Campbell wanted to produce a multivariate land-use classification for medieval England from a single source, the Inquisitions Post Mortem (IPMs), which gave a detailed breakdown of landowners' estates at their death (Bartley and Campbell 1997). To give as comprehensive a coverage of the country as possible they used 6,000 IPMs covering a 50-year period. Each IPM recorded a place name that allowed it to be allocated to a point on the map. These points represent large areas with a variety of land uses, and Bartley and Campbell wanted to create a more realistic and flexible representation of medieval land use. To do this they reallocated the point data onto a raster surface, arguing that this offered a more valid representation of continuous data than either points or polygons. This was done using a kernel method that calculated the value of each pixel from all eligible IPM points within 250 square miles of the cell, with nearer IPMs given more weight than more distant ones. They then used cluster analysis to allocate each cell to one of six land-use classes. To take account of the uncertainty in their model, the technique they used allowed cells that were hard to classify to be given second-choice alternatives. Through this imaginative and sophisticated use of spatial statistics they were able to create a detailed land-use classification of pre-Black Death England that will, in turn, provide a framework within which studies of individual localities can be interpreted.
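The core of such a kernel method, reallocating distance-weighted point values onto a raster, can be sketched as below. The linear distance-decay weighting, the function name, and the figures are illustrative assumptions only, not Bartley and Campbell's actual kernel:

```python
import numpy as np

def kernel_surface(points, values, grid_x, grid_y, radius):
    """Interpolate point observations onto a raster. Each cell takes the
    weighted mean of all points within `radius` of it, with nearer points
    weighted more heavily (a simple linear decay is assumed here)."""
    surface = np.full((len(grid_y), len(grid_x)), np.nan)
    for i, y in enumerate(grid_y):
        for j, x in enumerate(grid_x):
            d = np.hypot(points[:, 0] - x, points[:, 1] - y)
            inside = d < radius          # only points within the search radius
            if inside.any():
                w = 1.0 - d[inside] / radius   # linear distance decay
                surface[i, j] = np.average(values[inside], weights=w)
    return surface

# Two hypothetical points carrying land-use scores (invented numbers):
pts = np.array([[0.0, 0.0], [10.0, 0.0]])
vals = np.array([1.0, 9.0])
surface = kernel_surface(pts, vals, [0.0, 10.0], [0.0], 5.0)
# each cell here sees only its nearby point, so surface == [[1., 9.]]
```

In practice the search radius and decay function control how smooth the resulting surface is; the cluster analysis that follows would then operate on the per-cell values.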

Where Bartley and Campbell use sophisticated handling of spatial data and statistical techniques to create a complex surface from a point layer, Cliff and Haggett use basic classifications, overlay, and exploration through bar charts and simple statistics to integrate data from a variety of sources in their study of the cholera epidemic in London in 1849 (Cliff and Haggett 1996). They believed that two main factors would cause the severity of the epidemic to vary across London: drainage and the source of the water supply. To model poor drainage they created a polygon layer that distinguished high- and low-risk areas based on height above the level of the River Thames. London's water supply came either from reservoirs and wells or directly from the Thames itself; areas supplied by water extracted directly from the Thames were believed to be at higher risk than those supplied from wells and reservoirs. Sub-dividing London in this way provides a second polygon layer. Overlaying the two layers divides London into four types of area: those at risk from both poor drainage and a polluted water supply, those at risk from poor drainage alone, those at risk from a polluted water supply alone, and those with neither risk factor. Cliff and Haggett then calculated the death rate from cholera in each type of area. Even simply graphing the data shows a clear pattern: areas with both risk factors had a death rate nearly twice the metropolitan average, those with defective drainage alone were also above average, those with polluted water alone were around average, and those with neither factor were well below it. Statistical analysis using analysis of variance (ANOVA) confirms this pattern.
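The overlay step amounts to combining two binary risk layers into four area types and summarising an attribute within each. A minimal sketch, using invented district records and death rates purely for illustration (not Cliff and Haggett's data), might look like this:

```python
from statistics import mean

# Hypothetical district records: (poor_drainage, thames_water, death_rate).
# All figures are invented for illustration only.
districts = [
    (True,  True,  120), (True,  True,  130),
    (True,  False,  80), (True,  False,  90),
    (False, True,   60), (False, True,   65),
    (False, False,  30), (False, False,  35),
]

def area_type(drainage, water):
    """Overlaying two binary polygon layers yields four area types."""
    return {
        (True, True):   "both risks",
        (True, False):  "drainage only",
        (False, True):  "water only",
        (False, False): "neither risk",
    }[(drainage, water)]

# Group the death rates by area type, then take the mean of each group.
rates = {}
for drainage, water, rate in districts:
    rates.setdefault(area_type(drainage, water), []).append(rate)
group_means = {label: mean(rs) for label, rs in rates.items()}
```

Whether the differences between the four group means are larger than the variation within groups is exactly what a one-way ANOVA (for example `scipy.stats.f_oneway`) would then test.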

Gregory performs an analysis of census and vital registration data to show how manipulating the spatial and temporal components of the data can increase the information available about its attributes (Gregory 2000). He does this by working on net migration over the 50-year period from 1881 to 1931. The census provides population figures sub-divided into five-year age bands and by sex, while the Registrar General's Decennial Supplements provide the number of deaths sub-divided by age and by sex. In theory this gives enough information to calculate net migration in ten-year age bands and by sex. Taking, for example, the number of women aged 5 to 14 at the start of a decade, subtracting the deaths recorded for this cohort over the decade, and comparing the result with the number of women aged 15 to 24 at the decade's end should give the cohort's net migration. Boundary changes make this calculation highly error prone, as any population change caused by a boundary change will appear to be net migration. By areally interpolating all of the data onto a single set of districts, age- and sex-specific net migration rates can be calculated at district level over the long term. This analysis therefore takes very basic demographic data and uses it to provide a long time series of new data with far more spatial and attribute detail than could have been created using manual methods.
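The cohort calculation follows the standard demographic accounting equation: in a closed population the end-of-decade cohort equals the start-of-decade cohort minus deaths, so any shortfall or surplus must be net migration. A minimal sketch (the function name and example figures are assumptions for illustration):

```python
def cohort_net_migration(pop_start, deaths, pop_end):
    """Net migration for a cohort over a decade.

    Without migration, the cohort would number pop_start - deaths at the
    end of the decade; the difference between the enumerated end-of-decade
    population and these expected survivors is net migration."""
    expected_survivors = pop_start - deaths
    return pop_end - expected_survivors

# e.g. 10,000 women aged 5-14 at the 1881 census, 400 cohort deaths over
# the decade, 9,200 women aged 15-24 enumerated in 1891 (invented figures):
# net migration = 9200 - (10000 - 400) = -400, i.e. net out-migration.
```

A boundary change that moved, say, 500 people into a district between the two censuses would inflate `pop_end` and appear as 500 spurious net in-migrants, which is why all data must first be interpolated onto a single set of district boundaries.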

Martin compares the 1981 and 1991 censuses (Martin 1996a). He interpolated data from the two censuses he was interested in onto a raster grid consisting of square pixels with 200m sides. In this way he was able to compare data between the two censuses with far more spatial detail than Gregory, but was unable to include earlier censuses as they did not provide sufficient spatial detail to allow accurate interpolation.
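Once both censuses sit on a common raster they can be compared cell by cell. The simplest way to put zone-based counts onto a grid is to spread each zone's total evenly over the cells it covers; this area-weighted sketch is an assumption for illustration, and Martin's actual surface-modelling method is considerably more sophisticated:

```python
import numpy as np

def zones_to_grid(zone_of_cell, zone_totals):
    """Reallocate zone totals onto a raster by spreading each zone's count
    evenly over the cells that zone covers.

    `zone_of_cell` is a 2-D array giving the zone id of each cell;
    `zone_totals` maps zone id -> population count."""
    grid = np.zeros(zone_of_cell.shape)
    for zone, total in zone_totals.items():
        mask = zone_of_cell == zone
        if mask.any():                       # skip zones with no cells
            grid[mask] = total / mask.sum()  # equal share per cell
    return grid

# A toy 2x2 grid where zone 1 covers the top row and zone 2 the bottom:
zone_map = np.array([[1, 1],
                     [2, 2]])
grid = zones_to_grid(zone_map, {1: 10, 2: 20})
# grid == [[5., 5.], [10., 10.]]
```

Gridding both censuses this way (with the 1981 and 1991 zone maps) makes `grid_1991 - grid_1981` a direct cell-by-cell map of population change, independent of any boundary changes between the two dates.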

 

 


© Ian Gregory 2002

The right of Ian Gregory to be identified as the Author of this Work has been asserted by him in accordance with the Copyright, Designs and Patents Act 1988.


