Spatial Join Example: A Spatial Self Join

Spatial joins between drawings use spatial relationships between the geometry of objects in the source, joined drawing and the geometry of objects in the target, original drawing to join data into the original drawing.   The Edit - Join dialog provides spatial joins between drawings that are layers in maps.    Spatial joins done by the Join dialog are an easier way to accomplish classic GIS "overlay" operations.    Spatial joins work between drawing layers in the same map.  Layers can be in different data sources.

 

Overlays in ESRI nomenclature are called spatial joins in the data science / IT world, two different terms to describe exactly the same thing.  The Edit - Join dialog is a more modern way to do the same jobs with easier workflow.  Edit - Join, for example, updates the target table or drawing in place.   Even skilled ESRI people often prefer Edit - Join.

 

Add fields using a spatial join:

 

  1. With the focus on a drawing layer in an open map window, choose Edit - Join.
  2. In the upper right box choose the joined drawing in the map from which fields will be joined.

  3. In the second row of boxes, choose the geometry fields to use and the spatial method, such as contains or contained in.

  4. Click the Add button to choose a field in the joined drawing's table that will provide data for a new field in the original table.

  5. Double-click a cell to edit the field name (new fields) or to choose a different source field (joined table) or to choose a different transfer method.

  6. Press Join.

Example

This example provides a drawing to drawing spatial join where the same drawing is used both as the source, joined drawing and the target, original drawing.   A self join is when a drawing or table is joined to itself.    This is one of those nuanced concepts in data science and SQL that the Join dialog makes easy.  We consider a classic example, doing a self join using a spatial relationship.   We then apply that example to find small gaps in line sequences, a typical job when cleaning up sequences of lines that are to be used as road or other networks.

 

Consider a drawing, called Lines, as seen below, that has six line objects in it, each of which consist of a single straight line segment.  

 

 

The two lines on the left do not touch any other lines.   The four line objects on the right have been drawn using snap, so that the beginning of the next line is exactly coincident with the end of the prior line.  Each of the four line objects on the right touches one or two other line objects.   Adjacent for a line means coincident ends: lines which terminate at an inner coordinate within another line, and not at one of the ends of that other line, are not adjacent.  

 

In this example, we use touch instead of adjacent, to allow a direct comparison to other GIS packages which have a touch spatial predicate but not an adjacent spatial predicate.  

 

To use touch, we should first run the Normalize Topology transform to clean up an minor topological inconsistencies in the drawing.

 

We would like to find all lines that touch another line.  Invert that selection, and we find all lines that do not touch another line.   This is a classic task in cleaning up drawings that are supposed to contain road networks, where each line that makes up a road must touch the prior line in a road, with no small gaps between lines.   

 

 

The drawing's table has an mfd_id field that is a unique ID value for each object, and a Geom field that gives the geometry for each line object.  

 

We begin by using the Schema dialog (not illustrated) to add an TouchCount field, data type int32, to the drawing's table.    As added, the TouchCount field is empty, full of NULL values.    We will populate it using the Join dialog.  

 

With the focus on the opened Lines drawing, in the main menu we choose Edit - Join.

 

 

The Join dialog opens by default using the Lines drawing as both the original drawing and also the joined drawing.  That makes the join operation a self join, jointing data from the same drawing back into that drawing.  

 

We choose touches as the join method, and then we double-click into the joined fields cell for the TouchCount field and we choose mfd_id as the joined field.   We double-click into the transfer method cell in that row and, from the long pull down menu of choices, we choose count as the transfer method.

 

What we are commanding is for the system to count up the number of different mfd_id values, which is a count of the number of different objects, where geoms, that is, objects, in the drawing touch each object, and to put that number into the TouchCount field for each object.

 

Press Join.

 

 

Instantly, the table is updated with values in the TouchCount field.  The field reports the number of lines that touch each line.   Since a line object touches itself in a self-join, the TouchCount value will be 1 for each line object that has no other touching lines, and it will be greater than one for each line object that has one or more line objects that touch it.

 

In the simple case of our example, line objects at the beginning or end of a sequence of adjacent lines have only touching neighbor, so they have a  TouchCount value of 2, the line itself and the neighbor.   Those lines within a sequence of adjacent lines have two neighbors that touch it, one at the beginning and one at the end, so they have a TouchCount value of 3, the line and the two neighbors.

 

 

We can select all lines that have one or more adjacent neighbors using the Select pane, choosing the Greater template to find all TouchCount values with a value greater than 1.

 

 

That selects all adjacent lines in the table, and also in the drawing, as seen below.

 

 

Instead of using the Select dialog we could have used a very simple SQL query to select objects, such as

 

SELECT * FROM [Lines]

  WHERE [TouchCount] > 1;

 

We might use a query to create a new table and then a new drawing, of just those lines that are not touching another line, as in:

 

SELECT * INTO [NotTouching Lines]

  FROM [Lines]

  WHERE [TouchCount] = 1;

 

Example:  Find Small Gaps in Line Sequences

We can continue the example above to find small gaps in what should be continuous sequences of adjacent lines, for example, in a road network, where there should be no microscopic gaps between the end of one line object and the next line object in what is supposed to be a continuous road.

 

 

The drawing above illustrates what is supposed to be a continuous sequence of adjacent lines that represents a road.  But it contains a gap, which we have made a large gap, when in real-life road networks the gaps might not be at all visible if we do not know where they are and zoom far in to see them.

 

The illustration shows a sequence of three straight lines that are adjacent, followed by a gap, and then followed by a sequence of eight straight line segments that are adjacent.  

 

 

The drawing's table starts with the same three fields as before, including an integer TouchCount field that initially is full of NULL values.

 

We either relaunch the Join dialog using exactly the same settings as before, or we simply right click onto the update query that we saved before and we choose ! Run to run the saved update query.

 

 

Instantly, the TouchCount field is populated with the number of lines that touch each line object.  Values of 2 indicate lines that are at the end or the beginning of a sequence of adjacent lines, while values of 3 indicate lines that are within a sequence, with neighbors touching them both at their beginning and end.

 

 

Selecting for TouchCount equal to 2 or less we get all lines at the beginning or end of a sequence of adjacent lines.   We select for 2 or less, to also pick up any line segments that are isolated, with no touching neighbors.

 

 

Right away, the beginning and end segments appear in red selection color in the drawing.   These are easy to see and whenever we see some red color appear in "the middle" of a road, as opposed to only the very ends, we know there is a gap at that location and we can zoom in to see it and to repair it.

 

Notes

Terminology - The original table also may be called the target table, and the table providing data also may be called the source table.

 

Quick reference - With the Join dialog open, press F1 to open a quick reference guide in a browser window.

 

Guessing the key field - When we launch the dialog with an original field and when we choose a joined field, the dialog will try to guess what fields we might want to use as key fields and will load those first by default.  If it guesses wrong, we can choose the field we want.   For the original table, the dialog tries to use a field (other than mfd_id) that has a BTREE / BTREENULL index, with a data type preference for numbers, then text, then everything else, and a name preference for field names with id in them, such as ... id or ..._id, then ...id and then everything else. For the joined table, the dialog uses similar logic, but first tries to use a field with the same name as in the original table.

 

Added fields are static - Added fields are static, that is, if data in the joined table changes the fields in the original table that received data in the Join operation from the joined table will not automatically be updated.  We can easily update the table any time we want using a saved update query.  

 

Spatial joins are automatically parallel - Joining data between drawings uses parallel GeomOverlay function variants.  Joins from images to drawings are parallelized through a nested SELECT using THREADS.

 

Videos

See the Join Videos list of videos showing how to use the Join dialog.

See Also

Maps

 

Tables

 

Queries

 

Drawings

 

Images

 

Editing and Combining Data

 

Join

 

Join Videos

 

Join Examples

 

Command Window

 

JOIN Statements

 

Editable Results Tables