A place in history: a guide to using GIS in historical research
CHAPTER 4: BASIC GIS FUNCTIONALITY: QUERYING, INTEGRATING AND MANIPULATING SPATIAL DATA
4.7 Integrating incompatible polygon data through areal interpolation

The final issue discussed in this chapter is areal interpolation. The need for it commonly arises where a user wants to integrate two or more polygon layers of socio-economic data in which the polygons represent administrative units. Where the two sets of polygons nest perfectly because the administrative units used were identical, this is a simple operation. Where they do not, for example when comparing census data published for registration districts with election data published for constituencies, overlay alone does not provide a complete answer, because it is unclear how to allocate data to the resulting polygons. The traditional response is aggregation, which results in a loss of spatial detail, something that a GIS approach should attempt to avoid (see also Chapter 7).
Figure 4.10: Areal interpolation

Areal interpolation can be used instead. First we overlay the layer containing the input data onto the layer for which we want to estimate populations. These are termed the source and target layers respectively. The overlay generates the 'zones of intersection' between the two layers. The problem is then to estimate what proportion of the data from each source polygon to allocate to each zone of intersection.

The simplest method of estimation is termed 'areal weighting'. This is shown in Figure 4.10. The source and target units are overlaid to form the zones of intersection M, N, O and P. These polygons have all the attributes of the source and target polygons, plus the areas of the new polygons as calculated by the overlay. The final column of attribute data, 'Est Pop', is added by the user. Its values are estimated from the area of the zone of intersection as a proportion of the area of its source polygon. Polygon N has an area of 30, while its source polygon, 1, had an area of 100. As a result, 30% of polygon 1's population is allocated to polygon N, giving 15 people. The final stage is to aggregate the newly estimated data to target zone level, so we estimate that target polygon 2 has a population of 15+33=48.

Areal weighting assumes that the population is evenly distributed within each source polygon. This assumption is obviously extremely unrealistic. For example, registration districts in England and Wales usually consisted of a market town and its hinterland, hardly a likely candidate for an even population density. Various techniques have been devised to move away from this assumption, usually by using further knowledge that allows us to estimate where within the source zones the population is likely to be concentrated. This type of functionality is rarely properly incorporated into GIS software, and where it is, the technique and its limitations are likely to be hidden from the user. This means that users are likely to have to implement the appropriate procedure for themselves.
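The areal weighting calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used by any particular GIS package. It assumes the overlay has already been run and that each zone of intersection records its source polygon, target polygon and area. Only the figures for zone N (area 30, source area 100, estimate 15) and the value 33 contributing to target polygon 2 come from the text. Polygon 1's population of 50 is implied by the text (30% of it is 15). Everything else, including the allocation of zones M, O and P to source and target polygons and the function name `areal_weighting`, is hypothetical and chosen only so that the totals match.

```python
from collections import defaultdict


def areal_weighting(source_units, zones):
    """Estimate target populations, assuming even density within each source unit.

    source_units: {source_id: (area, population)}
    zones:        {zone_id: (source_id, target_id, zone_area)}
    """
    zone_estimates = {}
    target_totals = defaultdict(float)
    for zone_id, (source_id, target_id, zone_area) in zones.items():
        source_area, source_pop = source_units[source_id]
        # Allocate population in proportion to the share of the source area.
        est_pop = source_pop * zone_area / source_area
        zone_estimates[zone_id] = est_pop
        # Aggregate the zone estimates up to target polygon level.
        target_totals[target_id] += est_pop
    return zone_estimates, dict(target_totals)


# Source polygon 1 follows the text; source polygon 2 is hypothetical.
source_units = {
    1: (100.0, 50.0),
    2: (200.0, 110.0),  # assumed values, not taken from Figure 4.10
}

# Zone N follows the text; M, O and P are assumed for illustration.
zones = {
    "M": (1, 1, 70.0),
    "N": (1, 2, 30.0),
    "O": (2, 1, 140.0),
    "P": (2, 2, 60.0),
}

zone_estimates, target_totals = areal_weighting(source_units, zones)
print(zone_estimates)
print(target_totals)
```

Because every source polygon is fully covered by its zones of intersection in this sketch, the estimated target totals sum to the original source total. Methods that move away from the even-density assumption would replace the simple area ratio with a weight derived from additional knowledge about where the population is concentrated.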
An example of the use of areal interpolation is provided by Gregory et al. (2000). They wanted to compare three quantitative indicators of poverty (infant mortality, overcrowded housing and unemployment) as they changed in England and Wales from the late 19th century to the late 20th. To do this they compared data from four time periods: the late 19th century, the 1930s, the 1950s and the 1990s. All the data were available as polygon data from the census or the Registrar General's Decennial Supplements, but used significantly different reporting geographies. The late 19th century data were published using approximately 630 registration districts, while the two dates from the mid-20th century used approximately 1,500 local government districts, although even these were difficult to compare as the system was extensively reformed between the two dates. Modern data were available at much more spatially detailed levels, with as many as 100,000 units. To allow direct comparison, all the data were interpolated onto the least spatially detailed units, the 630 registration districts. This resulted in the loss of significant amounts of spatial detail from the later data and also introduced some error into the results, but it enabled the authors to generate consistent time series with which to compare the changing patterns of inequality over time at a geographically consistent scale.
© Ian Gregory 2002 The right of Ian Gregory to be identified as the Author of this Work has been asserted by him in accordance with the Copyright, Designs and Patents Act 1988. All material supplied via the Arts and Humanities Data Service is protected by copyright, and duplication or sale of all or any part of it is not permitted, except that material may be duplicated by you for your personal research use or educational purposes in electronic or print form. Permission for any other use must be obtained from the Arts and Humanities Data Service. Electronic or print copies may not be offered, whether for sale or otherwise, to any third party.