A place in history: a guide to using GIS in historical research
CHAPTER 4: BASIC GIS FUNCTIONALITY: QUERYING, INTEGRATING AND MANIPULATING SPATIAL DATA
4.6 Formally integrating data through overlay

In addition to simply combining layers, querying them and comparing them, layers can be combined to produce new layers through geometric intersections. This is called overlay. Any of the three types of vector data can be overlaid with any of the others, as is shown in Figure 4.5. An overlay operation combines not only the spatial data but also the attribute data.

This has many potential uses. For example, suppose a user has a polygon layer containing data about administrative units such as Irish baronies, and a second polygon layer showing lakes, and wants to find out what proportion of each barony was covered by water. An overlay operation would produce a new polygon layer that combines the attributes of both polygon layers, as shown in Figure 4.6, so that each new polygon has both the barony attributes and the attributes from the water layer.

It is also possible to combine point or line layers with polygon layers using overlay. For example, the inputs might be a point layer representing towns and a polygon layer representing baronies, and the aim might be to determine which towns lay in which barony. Overlaying the two layers would produce a point layer with the barony polygon attributes added to each point, giving the required information.
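The point-on-polygon case can be illustrated with a minimal sketch in plain Python. It is not how any particular GIS package implements overlay: the layer structures, the function names `point_in_polygon` and `overlay_points_on_polygons`, and all coordinates and place names are assumptions invented for illustration. Each town is tested against each barony with a simple ray-casting test, and the barony attributes are copied onto the town when it falls inside.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test: True if (x, y) lies inside the ring of (x, y) vertices."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            # x-coordinate at which the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def overlay_points_on_polygons(points, polygons):
    """Return a new point layer with the attributes of the containing polygon added."""
    output = []
    for pt in points:
        for poly in polygons:
            if point_in_polygon(pt["x"], pt["y"], poly["ring"]):
                merged = dict(pt)             # keep the point's own attributes
                merged.update(poly["attrs"])  # add the polygon's attributes
                output.append(merged)
                break
    return output


# Hypothetical layers: two square "baronies" side by side and two "towns".
baronies = [
    {"ring": [(0, 0), (10, 0), (10, 10), (0, 10)], "attrs": {"barony": "Barony A"}},
    {"ring": [(10, 0), (20, 0), (20, 10), (10, 10)], "attrs": {"barony": "Barony B"}},
]
towns = [
    {"town": "Town 1", "x": 3, "y": 4},
    {"town": "Town 2", "x": 15, "y": 5},
]

result = overlay_points_on_polygons(towns, baronies)
for row in result:
    print(row["town"], "lies in", row["barony"])
```

A town lying exactly on a shared boundary is handled arbitrarily by this simple test, so the sketch is only dependable for points clearly inside one polygon.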
Figure 4.5: Different types of overlay operations
Figure 4.6: Spatial and attribute data being combined using an overlay operation

Combining buffering and overlay allows complex spatial queries and operations to be performed. For example, with a line layer showing the road network and a point layer containing farm locations, a user may want to calculate which farms lie within 1 km of a major road. This can be done as shown in Figure 4.7. First the user selects only the major roads from the road layer and copies these to a new layer. A buffer is then placed around the new layer so that a polygon layer is created in which the polygons represent areas within 1 km of a major road. If only farms within 1 km of a major road are required then a 'cookie cutter' overlay can be used. In this only farms on the input layer lying within a polygon on the buffer layer will be copied to the final layer. The final layer only contains five of the original farms and will contain all the attributes of both the farms and roads source layers.
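The select, buffer and overlay sequence described above can be sketched in plain Python. This is an illustrative shortcut rather than a true buffer operation: instead of building the buffer polygon, it tests whether each farm lies within the buffer distance of a major road, which gives the same selection as a buffer with rounded ends. The layer structures, the road classes, the names and the coordinates (taken to be in metres) are all assumptions invented for the example.

```python
import math


def dist_point_segment(px, py, ax, ay, bx, by):
    """Shortest distance from point (px, py) to the segment (ax, ay)-(bx, by)."""
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return math.hypot(px - ax, py - ay)
    # Position of the nearest point along the segment, clamped to its ends.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def dist_point_line(px, py, line):
    """Shortest distance from a point to a polyline given as a list of vertices."""
    return min(dist_point_segment(px, py, *a, *b) for a, b in zip(line, line[1:]))


# Hypothetical input layers.
roads = [
    {"name": "Road A", "class": "major", "line": [(0, 0), (5000, 0)]},
    {"name": "Road B", "class": "minor", "line": [(0, 3000), (5000, 3000)]},
]
farms = [
    {"farm": "Farm 1", "x": 1000, "y": 500},
    {"farm": "Farm 2", "x": 2000, "y": 2500},
    {"farm": "Farm 3", "x": 5800, "y": 0},
    {"farm": "Farm 4", "x": 4000, "y": -1200},
]

BUFFER_M = 1000  # buffer distance of 1 km

# Step 1: copy only the major roads to a new layer.
major_roads = [r for r in roads if r["class"] == "major"]

# Steps 2 and 3: keep farms inside the buffer and attach the road's attributes.
selected = []
for farm in farms:
    for road in major_roads:
        if dist_point_line(farm["x"], farm["y"], road["line"]) <= BUFFER_M:
            selected.append({**farm, "road": road["name"]})
            break

print([f["farm"] for f in selected])
```

Farm 2 is close to a minor road but is excluded, because the minor road was dropped in the first step before any distances were measured.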
Figure 4.7: Spatial manipulation to solve problems

Perhaps more than any other GIS operation, overlay tests the accuracy of the input layers. If a point layer is overlaid with a polygon layer then inaccurate polygon boundaries can easily lead to a point lying near a boundary being allocated to the wrong polygon. Accuracy is tested further where two polygon layers intersect. As was discussed in the previous chapter, it is unlikely that any two operators will digitise a curve in exactly the same way even if the same source map is used. Where the same boundary has been digitised slightly differently on the two layers, the overlay operation will create sliver polygons. These are very small polygons formed in the manner shown in Figure 4.8. This may seem like a trivial problem but is in fact the bane of vector overlay operations.

It is possible to attempt to remove sliver polygons automatically. They tend to be long and thin and thus have a small area compared to the length of their perimeter. While this can be used to identify slivers, deleting them can still be problematic as it requires a decision on which of the two boundaries should be deleted.

The problems caused by sliver polygons will depend on the scale and accuracy of the two sources, and the accuracy of the digitising. If the two layers have both been digitised to a high standard of accuracy from high quality source maps of similar scales, then the problems are likely to be minimal and can usually be solved by automated procedures within the software. If any of these three criteria is not met, tidying the resulting output layer is likely to be a significant job.
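The observation that slivers have a small area relative to their perimeter can be turned into a simple screening test. The sketch below, in plain Python, uses one common shape measure, 4πA/P², which is 1 for a circle and approaches 0 for long thin shapes. The choice of this measure, the threshold value and the example polygons are all assumptions made for illustration, and a suitable threshold would have to be chosen by inspecting the data in question.

```python
import math


def area_and_perimeter(ring):
    """Area (shoelace formula) and perimeter of a ring of (x, y) vertices."""
    twice_area = 0.0
    perimeter = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
        perimeter += math.hypot(x2 - x1, y2 - y1)
    return abs(twice_area) / 2.0, perimeter


def compactness(ring):
    """4*pi*area / perimeter**2: 1 for a circle, close to 0 for a long thin shape."""
    area, perimeter = area_and_perimeter(ring)
    return 4 * math.pi * area / perimeter ** 2 if perimeter else 0.0


SLIVER_THRESHOLD = 0.05  # hypothetical cut-off, not a recommended value


def is_sliver(ring, threshold=SLIVER_THRESHOLD):
    return compactness(ring) < threshold


# Hypothetical output polygons from an overlay.
square = [(0, 0), (100, 0), (100, 100), (0, 100)]
sliver = [(0, 0), (50, 0.4), (100, 0), (50, -0.3)]  # long, thin, lens-shaped

for name, ring in [("square", square), ("sliver", sliver)]:
    print(name, round(compactness(ring), 3), is_sliver(ring))
```

A test of this kind only flags candidate slivers. As noted above, deciding which of the two boundaries to keep is a separate judgement that the shape measure cannot make.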
Figure 4.8: The creation of sliver polygons

Overlay can also be performed on raster datasets providing they use the same pixel sizes. This is sometimes referred to as map algebra as two or more input layers are used to create an output layer whose cell values are calculated based on a mathematical operation between the input layers. An example of this is shown in Figure 4.9 where cell values on the two input layers are added to calculate values on the output layer. Other mathematical operations such as subtraction and multiplication can also be used.
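Map algebra of this kind can be sketched in a few lines of plain Python, treating each raster as a list of rows. The function name `map_algebra` and the cell values are assumptions invented for illustration and are not the values shown in Figure 4.9. The sketch assumes the two grids cover the same area with the same pixel size, and so checks only that they have the same number of rows and columns.

```python
import operator


def map_algebra(layer_a, layer_b, op):
    """Combine two rasters cell by cell using the binary operation op."""
    same_shape = len(layer_a) == len(layer_b) and all(
        len(row_a) == len(row_b) for row_a, row_b in zip(layer_a, layer_b)
    )
    if not same_shape:
        raise ValueError("rasters must have the same number of rows and columns")
    return [
        [op(a, b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(layer_a, layer_b)
    ]


# Hypothetical input layers with two rows and three columns each.
layer_a = [[1, 2, 3],
           [4, 5, 6]]
layer_b = [[10, 20, 30],
           [40, 50, 60]]

added = map_algebra(layer_a, layer_b, operator.add)
multiplied = map_algebra(layer_a, layer_b, operator.mul)

print(added)       # [[11, 22, 33], [44, 55, 66]]
print(multiplied)  # [[10, 40, 90], [160, 250, 360]]
```

Passing the operation as an argument means that subtraction (`operator.sub`) or any other two-value function can be used without changing the overlay function itself.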
Figure 4.9: Map algebra with raster data

When two layers are combined using an overlay operation, the resulting layer will be at best as accurate as the less accurate layer. Unfortunately, the result is likely to be less accurate than this, as error from both layers accumulates in the output. This is termed error propagation and means that as layers are combined, errors and uncertainty can multiply surprisingly quickly. When multiple overlays are performed, the limitations of all the source layers must therefore be borne in mind.

An example of the use of overlay in historical research is provided by Lee (1996). The 19th century censuses of Ireland published data using baronies. These were relatively large spatial units and Lee wanted to estimate the internal population distribution of baronies in County Antrim to provide a more realistic representation of the population distribution. She believed that the distribution was likely to be affected by the presence or absence of large water features, altitude, and the proximity and function of nearby settlements. Her GIS consisted of:
The barony, water feature, and altitude layers were overlaid to produce an output layer with 263 polygons. She then overlaid the settlement layer onto the centroids of these polygons so that distances from each settlement to each of the 263 centroids could be calculated. Finally, she used a rather arbitrary model to allocate the barony populations to each of the derived polygons based on the barony population, whether the polygon was covered in water, the polygon's altitude, and the distance from the polygon's centroid to nearby settlements. This shows how integrating a variety of disparate datasets can be used to generate new datasets.
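The general idea of allocating a barony's population among its derived polygons can be sketched as a weighted share-out. The sketch below is emphatically not Lee's model, whose details are not given here: the weighting rule, the altitude cut-off, the attribute names and every number are hypothetical, chosen only to show how attributes gathered through overlay might feed such a calculation.

```python
def allocate_population(barony_population, polygons, weight_fn):
    """Share a barony's population among its polygons in proportion to their weights."""
    weights = [weight_fn(p) for p in polygons]
    total = sum(weights)
    if total == 0:
        raise ValueError("no polygon can receive population")
    return [barony_population * w / total for w in weights]


def example_weight(polygon):
    """Hypothetical weighting rule (an assumption, not Lee's model)."""
    if polygon["water"]:
        return 0.0  # nobody is allocated to water
    # Invented rule: land at or above 150 m counts for half as much.
    altitude_factor = 1.0 if polygon["altitude_m"] < 150 else 0.5
    # Invented rule: weight falls with distance to the nearest settlement.
    distance_factor = 1.0 / (1.0 + polygon["dist_to_settlement_km"])
    return polygon["area_km2"] * altitude_factor * distance_factor


# Hypothetical polygons derived by overlay within one barony.
derived = [
    {"id": 1, "water": False, "altitude_m": 50, "dist_to_settlement_km": 1.0, "area_km2": 10.0},
    {"id": 2, "water": True, "altitude_m": 20, "dist_to_settlement_km": 2.0, "area_km2": 8.0},
    {"id": 3, "water": False, "altitude_m": 200, "dist_to_settlement_km": 3.0, "area_km2": 16.0},
]

allocated = allocate_population(7000, derived, example_weight)
for polygon, population in zip(derived, allocated):
    print(polygon["id"], round(population))
```

Because each polygon receives a share of the total weight, the allocated figures always sum to the published barony population, whatever weighting rule is supplied.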
© Ian Gregory 2002

The right of Ian Gregory to be identified as the Author of this Work has been asserted by him in accordance with the Copyright, Designs and Patents Act 1988. All material supplied via the Arts and Humanities Data Service is protected by copyright, and duplication or sale of all or any part of it is not permitted, except that material may be duplicated by you for your personal research use or educational purposes in electronic or print form. Permission for any other use must be obtained from the Arts and Humanities Data Service. Electronic or print copies may not be offered, whether for sale or otherwise, to any third party.