A place in history: a guide to using GIS in historical research

CHAPTER 2: THE WORLD AS VIEWED THROUGH A GIS
2.2 Attribute data

Attribute data are data in the form that most people understand by the term. Many GIS software packages include their own attribute databases but also allow the user to link to external database management systems (DBMS) or spreadsheets. For example, MapInfo can link to data in Microsoft Excel, ArcView to dBASE, and ArcInfo to Oracle. Attribute data are frequently either statistical or textual, but as the software improves and becomes more flexible they can be in virtually any format that the DBMS used to store them can support. Increasingly this includes image formats, animations, hyperlinks, multimedia, and so on.

Most of the DBMSs used in GIS are relational database management systems. This means that two or more tables can be joined together based on a common field known as a key. With historical data the key is often either a place name or an ID number (note that place names are not considered to be spatial data: to be spatial, the data must have a coordinate-based location), and the join allows data from various sources to be integrated without requiring spatial data. For example, a user has a table of Poor Law data organised by Poor Law Unions (a type of administrative unit used in England and Wales in the 19th and early 20th centuries to administer relief of the poor), a table of voting statistics organised by parliamentary constituency, and some employment statistics based on towns. A relational join will join all three tables together, and all of the data for 'Bristol', for example, will appear on a single row.

There are three main problems with doing this. Firstly, the join has no knowledge that the entity referred to as 'Bristol' may be a different entity in each table. Secondly, problems will occur where names are not unique, such as 'Whitchurch', which appears in both Hampshire and Shropshire. Different software will handle this in different ways, the most common (and theoretically sound) being to duplicate rows of data. One way round this is to use more than one column as the key, for example, place name and county. Thirdly, the spellings of place names must be identical to produce a match: even minor differences in the use of hyphens or apostrophes will cause a non-match. This can be worked around using gazetteers that standardise all possible spellings to a single spelling taken from an authority list, or through the use of ID numbers. Creating these can be time consuming.

Many attribute databases use Structured Query Language (SQL) to allow flexible querying and joining. This is often implemented through a graphical user interface, but the underlying query follows the basic structure of a SELECT clause naming the fields, a FROM clause naming the tables, and a WHERE clause giving the condition.
For example, suppose we have two tables: 'unemp', which contains data on unemployment rates, and 'inf_mort', which contains data on infant mortality. Their fields are set out in Table 2.1.

Table 2.1: Sample tables of attribute data

An SQL query on these tables can select the names, unemployment rates, and total populations from the table unemp for places with an unemployment rate of over 10%, as shown in Table 2.2.
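
As a minimal sketch of a query of this kind, the following uses Python's built-in sqlite3 module with an in-memory database. The field names (name, unemp_rate, tot_pop), the place names, and all of the values are assumptions invented for illustration; they are not taken from Table 2.1 or Table 2.2.

```python
import sqlite3

# In-memory database so the sketch leaves nothing on disk.
conn = sqlite3.connect(":memory:")

# Hypothetical 'unemp' table: field names and values are invented.
conn.executescript("""
    CREATE TABLE unemp (name TEXT, unemp_rate REAL, tot_pop INTEGER);
    INSERT INTO unemp VALUES
        ('Place A', 12.5, 20000),
        ('Place B', 7.0, 15000),
        ('Place C', 10.0, 30000),
        ('Place D', 15.2, 8000);
""")

# SELECT names the fields, FROM names the table,
# WHERE keeps only rows with a rate strictly over 10%.
high_unemp = conn.execute("""
    SELECT name, unemp_rate, tot_pop
    FROM unemp
    WHERE unemp_rate > 10.0
    ORDER BY name
""").fetchall()

for row in high_unemp:
    print(row)
```

With this invented data only 'Place A' and 'Place D' are returned; 'Place C', at exactly 10%, is excluded because the condition is "over 10%".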
Table 2.2: Sample data returned by the query above

Relational joins are also implemented in this way. For example, a query can select the unemployment rates from unemp and the infant mortality rates from inf_mort where the values in the name fields in both tables are identical.
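
The join described here, together with the place-name problems noted earlier in this section, can be sketched using Python's built-in sqlite3 module. Everything in the example is an assumption made for illustration: the field names (name, county, unemp_rate, inf_mort_rate), the small 'gazetteer' lookup table, and all of the values are invented and do not reproduce Table 2.1 or Table 2.3.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical tables; all field names and values are invented.
conn.executescript("""
    CREATE TABLE unemp (name TEXT, county TEXT, unemp_rate REAL);
    CREATE TABLE inf_mort (name TEXT, county TEXT, inf_mort_rate REAL);
    -- Hypothetical gazetteer: maps each spelling variant to one standard name.
    CREATE TABLE gazetteer (variant TEXT, std_name TEXT);

    INSERT INTO unemp VALUES
        ('Lancaster', 'Lancashire', 12.5),
        ('Stoke-on-Trent', 'Staffordshire', 9.0),
        ('Whitchurch', 'Hampshire', 8.0),
        ('Whitchurch', 'Shropshire', 11.0);
    INSERT INTO inf_mort VALUES
        ('Lancaster', 'Lancashire', 140.0),
        ('Stoke on Trent', 'Staffordshire', 160.0),
        ('Whitchurch', 'Hampshire', 120.0),
        ('Whitchurch', 'Shropshire', 130.0);
    INSERT INTO gazetteer VALUES
        ('Lancaster', 'Lancaster'),
        ('Stoke-on-Trent', 'Stoke-on-Trent'),
        ('Stoke on Trent', 'Stoke-on-Trent'),
        ('Whitchurch', 'Whitchurch');
""")

# 1. Join on name alone: the two Whitchurch rows in each table are
#    cross-matched (duplicated rows), and the Stoke spellings never match.
by_name = conn.execute("""
    SELECT unemp.name, unemp.county, unemp.unemp_rate, inf_mort.inf_mort_rate
    FROM unemp, inf_mort
    WHERE unemp.name = inf_mort.name
    ORDER BY unemp.name, unemp.county, inf_mort.county
""").fetchall()

# 2. Two-column key (name and county): the Whitchurch duplicates disappear.
by_name_and_county = conn.execute("""
    SELECT unemp.name, unemp.county, unemp.unemp_rate, inf_mort.inf_mort_rate
    FROM unemp, inf_mort
    WHERE unemp.name = inf_mort.name
      AND unemp.county = inf_mort.county
    ORDER BY unemp.name, unemp.county
""").fetchall()

# 3. Join through the gazetteer: both spellings resolve to one standard
#    name, so the Stoke-on-Trent rows are matched as well.
via_gazetteer = conn.execute("""
    SELECT g1.std_name, unemp.county, unemp.unemp_rate, inf_mort.inf_mort_rate
    FROM unemp, inf_mort, gazetteer AS g1, gazetteer AS g2
    WHERE unemp.name = g1.variant
      AND inf_mort.name = g2.variant
      AND g1.std_name = g2.std_name
      AND unemp.county = inf_mort.county
    ORDER BY g1.std_name, unemp.county
""").fetchall()

print(len(by_name), len(by_name_and_county), len(via_gazetteer))
```

With this invented data the name-only join returns five rows (one for Lancaster and four for Whitchurch), the two-column key returns three, and the gazetteer join returns four because the hyphenated and unhyphenated spellings are reconciled. A real gazetteer would be far larger and would take time to compile, as noted earlier in the section.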
Table 2.3: Sample data returned by the query above

While relational databases and SQL are not fundamental to an understanding of GIS, familiarity with them is a useful skill that can enhance an understanding of GIS data and GIS software. Many guides to the use of SQL and relational databases are available: see the bibliography for further information.
© Ian Gregory 2002

The right of Ian Gregory to be identified as the Author of this Work has been asserted by him in accordance with the Copyright, Designs and Patents Act 1988. All material supplied via the Arts and Humanities Data Service is protected by copyright, and duplication or sale of all or any part of it is not permitted, except that material may be duplicated by you for your personal research use or educational purposes in electronic or print form. Permission for any other use must be obtained from the Arts and Humanities Data Service. Electronic or print copies may not be offered, whether for sale or otherwise, to any third party.