In-Class Exercise 3 [Related Tables]
Manipulating data through queries, overlays, and other methods becomes more complex with relational databases, but understanding how to work with these data is fundamental to organizing a project in a digital structure.
Let's start by cleaning up and loading a new version of the Callalli Peruvian project:
- Create a folder called "callalli_old" in your Documents folder and move all the various files from your desktop and Documents folder into there.
- Download a fresh copy of the Callalli dataset. Download Callalli dataset.
Changes to Callalli dataset:
-
- It has been updated to the File Geodatabase format (.gdb).
- An MXD file is included with some symbology
- A Hillshade was created from the DEM and they're shaded.
- A Bing Hybrid map service link was added.
- Unzip the new Callalli folder so the files are in /Documents/Callalli
- Double click the .MXD file so Arcmap opens.
- You probably need to have an active account with ESRI (due to the Bing Hybrid service).
- Switch over to View > Data view
- Turn on the All_ArchID layer, Note the labeling, then turn off again.
Part I. Linking related tables through primary key with a spatial index
We learned that examining research data often involves working with data in related tables that are linked (Joined or Related) through unique ID numbers. The theory behind design of relational databases is known as database normalization Links to an external site..
When detailed data is held in non-spatial tables (from lab analysis, for example) it can be difficult to connect the data in these attributes tables back into physical space for analysis and mapping. Recall that we used a Primary Key (ArchID) for indexing all spatial proveniences and then a CatID key for individual artifacts within those spatial areas.
The ArchID Centroid approach
One expedient solution for assigning space to non-spatial attributes is to have a single point type feature that contains a single point for every ArchID number regardless of whether it is the location of a locus of ceramics inside of a site (a polygon inside a site polygon), an isolated projectile point (a geographical point), or a single core at a site (point inside a polygon). All ArchID numbers are reconciled in a single point feature class and therefore space can be assigned to every item in some form and all these non-comparable attribute tables can be effectively “zipped together” into space.
What about the artifacts collected from line and polygon features? How do you mix spatial location for points and polygons into a single file? For the purposes of organization, you can turn lines and polygons into centroids. Centroids are defined in a number of ways, but think of them as the center of gravity for a line or polygon. In the Callalli database the All_ArchID shapefile is a collection of point locations and centroids for all of the ArchID numbers from the Callalli area. Keep in mind that a number of items with different CatID (from lab work) can be assigned to a single ArchID (hence, a single point), and that there are a number of limitations to using centroids for analysis.
This method is a solution for dealing with the “Many to Many relationships” situation.
Queries and related tables in Callalli data
You have loaded the Callalli geodatabase Download Callalli geodatabase into Arcmap.
A. Query and display relates
Quick exploratory task: Make a map showing the locations of all obsidian bifaces and projectile points because we want to know if they are close to rivers.
One possible solution to this task is as follows
- Open Attribute Table for non-spatial table “Lithics_Lab2”
Look at the data organization. As this is a non-spatial table it may not appear in the Table of Contents unless you have the “Source” tab selected at the top of the Table of Contents.
The major problems are:
- These are non-spatial, so how do you show them on the map?
- The attributes are spread across two fields (Lit_Mat and Form) so how can you easily select them for display?
- Many of the artifacts were collected Site or Locus polygons so they don't appear in the Lithic_p point shapefile for symbolizing.
One solution to this situation is to connect them all through the centroid point layer "All_ArchID"
Take a look at the Attribute table for the ArchID layer. This is derivative feature containing centroids for lines and polygons, and point locations for all instances of each ArchID field. Look in the table and notice that it’s a Point type geometry file with a single Point record for each ArchID. The “Filetype” field tells you what file type the original ArchID number belongs to so we can relink these centroids to their original polygons if needed down the road. Also notice the PhotoURL field contains links to site photos. We’ll explore that in the future.
We need to link the table “Lithics_Lab2” with All_ArchID points… but there are many Lab results for each ArchID so we can not use the JOIN function because this is a one to many (1:M) situation.
Instead we will use the RELATE command
- Right click All_ArchID and choose Joins and Relates… > Relates…
- Read the description and then use these values in the Relate box:
- ARCHID
- Lithics_Lab2
- ARCHID
- type in “ArchID-LithLab2”
Now we need a selection in the Lithics_Lab2 table. Here are several useful ways to make selections in Arcmap:
Select by Sorting on a field
- Open the table to view Lithics_Lab2.
- Right-click Lit_Mat field and choose Sort Ascending
- Scroll down until you get to the Obsidian artifacts and click and drag down in the left-most column to select these records. Now all Obsidian artifacts should be highlighted. The next problem is that we only want some of the obsidian artifacts.
Select by Attributes command to make or modify a selection:
- click Table Options... (top left menu on Table) and choose “Select by Attributes”. We can use this box to do a modified SQL query of the table. In the Method field choose “Select from current selection”.
- Double-click [Form] and notice how it pops up in the text box below. We’re going to build up an SQL query using graphical tools. Click “=” then Click “Get Unique Values”. From this list double-click “Biface Broken”. Click “OR” then click [Form] and ‘=’ again, then choose “Biface”, then repeat these steps adding “Proj Point” and “Proj Point Broken” to your selection criteria.
- Click the Verify button. Did you pass the test?
Your line should look like this
[Form] = 'Biface Broken' OR [Form] = 'Biface' OR [Form] = 'Proj Point Broken' OR [Form] = 'Proj Point'
- Click “Apply”
You’ve got a subset of the table. Notice the number of artifacts that are selected. Now in order to see these geographically we need these features selected in the All_ArchID point layer.
- From the same table click on the "Table Options"… button and go to “Related Tables…” and choose AllArchID-LithicsLab2.
The All_ArchID attribute table will appear with only those records selected that contain obsidian bifaces and proj points. How many are there? There are much fewer, what happened to the number of records highlighted in the previous selection?
Close this attribute table and you will see on the map that those point locations are highlighted. Are they close to rivers?
In this part we learned that a single point feature (All_ArchID) can be used to give geographical position to non-spatial tabular data provided that they have some reference to spatial data (such as a unique ARCHID number). Through the use of Relates and linked through a common reference table the All_ArchID table links different data sets using space as a common referent. You can quickly create selections in one dataset and move that selection into other datasets. Think of the All_ArchID table as the spinal column that links different elements of the skeleton.
Note that we used two methods of querying: the first was to just Sort Ascending for a particular field and select from that group. The second was to use text-based queries to construct an explicit query. For simple selections the first method is easier, but for complex queries, non-contiguous groups, or to store the query for later reference the latter is better.
Keep in mind that All_ArchID is only centroids so they don’t have the full information of the original points, lines, and polygons. However, this single spine can be used to jump between related selections.
For example, if we Relate All_ArchID to the Sites_A field we could find out the size of the sites that have obsidian bifaces. Are the sites with obsidian predominantly large or small sites?
Lets try that. Without losing your previous selection,
- right-click All_ArchID and choose Joins and Relates…> Relates… and for relates use
- SITEID
- Site_A
- ARCHID
- call this “All_ArchID-Site_A”
![]()
- Now right-click “AllArchID” and go to the Attribute Table.
- Go to the menu Table Options… > Related Tables… note that you now have several Relates to choose from. Choose the one to Site_A.
- Now look at the Site_A attribute table. How many sites are selected? Notice how our selection is getting smaller as data is aggregated into these inclusive groups such as Sites.
- Click the Selected table at the bottom of the screen. You can see that the Shape_Area field shows the m2 (area) of these sites. A frequency distribution on site sizes can be found by right clicking the Shape_Area field and choosing “Statistics…”.
These techniques, combined with Pivot Tables and Joins that you learned in labs during weeks 1 and 2, give you the basic tools to explore and manipulate data in databases organized through relationships. One of the advantages of using the ESRI Geodatabase instead of the older Shapefile format is that you can build permanent Relationships directly into the geodatabase so you don’t have to recreate them for each Arcmap MXD project.
There are still some cumbersome elements to related tables organization, for example symbolizing and labeling through relates is still not easily available in Arcmap. However with clean data organization you can move through joins and relates to produce special joins for specific map projects that require this kind of symbology.
PART II – Buffers
Buffers are extremely versatile in GIS. They are commonly used for expanding or contracting a polygon by a constant value, however they can also be used for other useful functions. For example, you can use buffers to convert points and lines into polygon areas (corridors). You can use buffers to evaluate spatial proximity from one theme, like a road, on another theme, like an archaeological site.
Exploratory question: what is the relationship between the presence of obsidian in sites and the proximity to rivers in the Callalli area?
We will use the “Obsidian by Site” table you created in In-Class Exercise 2.
Recall that this table aggregated lithic locus data by site in order to compare the density of obsidian and non-obsidian by site.
There are numerous ways to calculate distances in Arcmap. We will explore two methods today.
A. Multiple Buffers
- Buffer the rivers by opening to the red Toolbox icon choosing Analysis Tools > Proximity > Multiple Ring Buffers…
- Input Features: choose rivers in “Hydro_lines100k”
- Output Feature Class: hydrobuff (specify your output location by clicking the Folder icon)
- For distances use 25, 50, 250, 500, 5000 (click the Plus between each one).
- Click Ok.
The Obsidian_By_Site table is not a-spatial so you have to Join that table to a spatial table. Let’s use All_ArchID because we know that every value in the Obsidian_By_Site table will have an entry in the All_ArchID point file (can you think of any problems with using points instead of site Polygons for this analysis?)
- Right-click “All_ArchID” and choose Joins and Relates…. > Joins
- SITEID
- Obsidian_By_Site table
- SITEID
- Then click Advanced… and choose “Keep only matching records…” (why are we doing that step?... read the text. This is an exclusive, inner join)
Now we need to know where these features fall with respect to the hydro buffers layer underneath it. Use the toolbox Analysis Tools > Overlay > Identity… to compile all the attributes from the two layers into a point layer.
Use All_ArchID as the Input Feature and HydroBuff as the Identity feature. Click OK.
Look in the attribute table of All_ArchID_Identity” and notice you have a DISTANCE field (distance from river, by buffer class) and Sum_Obsidian. These results could be explored numerically (is there a meaningful relationship?), or they could be symbolized together.
- For example, right click “All_ArchID_Identity” , 'properties' and choose Symbology.
- Show: Quantities, Graduated Colors.
- Value: Sum_Obsidian
- Normalization: Distance
B. Quantitative Proximity Distance
Before proceeding to the this next step, first select 'customize' (at the top of your screen), then, 'extensions" and then make sure that the spatial analyst tool is enabled.
What if you want to know the actual distance in meters between a point and the river, and not the grouped buffer class of these distances? For this we need to use the Raster data model (continuous values).
- In the Toolbox choose
- Spatial Analyst Tools > Distance > Euclidean Distance…
- Input Raster/Feature: Hydro_lines100k
- Rename: EucDist_hydr2
- Cellsize: 90
- Click OK
What happened? Zoom to Layer of the output layer (EucDist_hydr2). (by right clicking the layer and choosing "zoom to layer")
Look carefully at the classes produced. Those are JUST for visualization.
- Under Spatial Analyst toolbar choose “Zonal Statistics with Table…”
- Zone Dataset: All_ArchID_identity
- Zone Field: All_ArchID_ARCHID
- Value Raster: EucDist_hydr2
Have a careful look at the a-spatial output table (by default its called zstat) and see if you understand the results. You should see two ways of determining the distance from each ArchID location the closest river: Categorical and a Ratio variables. Which column is which?
Note that the Value field is ArchID. That means you can use a Join to connect All_ArchID to the zstat table and connect the output table to spatial locations through that join.
Again, these can be symbolized or explored numerically.
C. Title the presentation
Switch over to View > Layout mode and Insert>Title with the text
"Obsidian presence and distance to water"
Move the Title box around so the text is legible and zoom the map to an appropriate scale.
Export > Save to PDF... and title appropriately. This will be part of what is turned in.