Example: Find Percentages of Open Space in ZIP Code Areas

This example shows a typical GIS task, start to finish.  The point of this example is not to show the shortest, fastest workflow, but to teach various facilities within Manifold using a tutorial that explains and illustrates how various dialogs can be used for a real life task.  The example is therefore much longer than a simple "do this" recipe.

 

Given a drawing showing ZIP codes as areas (polygons), and another drawing showing open spaces like parks and nature preserves, add a field to each ZIP code area that gives the percentage of open space in that ZIP code area.  If we thematically format the ZIP code areas by percentage of open space, the resulting display at a glance can show which ZIP code areas have a higher percentage of open space, providing useful data for citizens, policy makers and planners.

 

The workflow we use is step by step workflow as many GIS people would do it.  Someone with good SQL skills could accomplish this example in a single step using a short SQL query, as shown in the All Workflow in a Single Query section at the end of this topic.  Run that single query and the entire example in this topic gets done with a single click.   If data changes, the entire task can be redone with a single click.

 

However, breaking the workflow up into multiple steps and mixing use of point and click dialogs with a simple bit of easy SQL shows how a simple, step by step approach can be used by anyone to accomplish very useful tasks without requiring more advanced skills.    Within this example we also show two different ways to accomplish a key step, to provide additional opportunities for education.

 

See the video version of this topic in the Find Percentages of Open Space in ZIP Code Area  video.   The video version of this topic takes a slightly different approach in the latter part of the example.  Instead of using the Join dialog to join an total_open field into the zcta table and then using the Schema dialog to create a percent_open computed field in the zcta table, the video uses an SQL query to simultaneously do a join, compute the percent open, and create a new table, from which a new drawing can be created.

 

That approach is efficient for combining two big steps into a single query, but it might not be clear for people new to SQL.   This topic uses the Join and Schema dialogs in a step by step approach that may be easier to understand.

 

Our workflow:

 

  1. Re-project drawings into a sensible coordinate system, if need be.
  2. Using the Schema dialog, simplify the name of a field.
  3. Slice open space areas to ZCTA boundaries using the Transform pane.
  4. Sum open space within each ZCTA.
  5. Join open space totals into the ZCTA table.
  6. Add a computed field giving percentages of open space.
  7. Style the ZCTAs by percentage of open space.

 

The workflow is shown step by step, which results in many illustrations.  In real life, many steps requiring multiple illustrations take but a few seconds.

 

 

We start with a project that contains two drawings that we have downloaded from a Connecticut state government website.

 

 

The map has a Bing streets background layer and the two drawings as layers: a ct_zcta drawing that shows US Census Bureau ZIP Code Tabulation Areas (ZCTAs) in the state of Connecticut as areas, seen above, and a ct_open_space drawing that shows parts, nature reserves and other open space in the state of Connecticut as areas.   The ct_open_space layer has been turned off in the illustration above, so we can better see the ct_zcta layer.

About ZCTAs

ZIP codes in the US are postal codes.  As explained in the ZIP Codes are Not Areas  topic, ZIP codes are not geographic areas but instead are routing codes to send mail to specific delivery systems.   Because such routing codes often cluster within given regions, many people think of ZIP codes as geographic regions.  

 

Although that is a misconception, it is such a useful misconception for many demographic purposes, in which the characteristics of populations such as per capita wealth or education are gathered using the ZIP code of residence as an identifying attribute, that the US Census Bureau has created maps of polygonal areas, ZCTAs, that approximately represent ZIP codes as geographic regions.  Using ZCTAs allows us to pretend that ZIP codes are polygonal areas.  The ct_zcta drawing seen above is the US Census Bureau's drawing of ZCTA areas for Connecticut.

Projections

Both the ct_zcta drawing and the  ct_open_space drawing have been projected into the US State Plane NAD83 Connecticut coordinate system.   They were originally imported as shapefiles that used Latitude / Longitude projection and then immediately re-projected into US State Plane NAD83 Connecticut projection.   See the video version of this topic for the quick command to do that.  See also the Example: Reproject a Drawing topic.

 

 Whenever measuring lengths or areas, for example, as in measuring the areas of open spaces, never use Latitude / Longitude projection as a coordinate system for the components involved in measurement.  Instead, take a moment to re-project into a projection that uses linear measures and, ideally, provides accurate measurements in the region of interest.  Standard projections used by governments in the region of interest are a good default choice.

Housekeeping: Simplify the Name of a Field

One of the fields we will use from the ct_zcta Table has a slightly confusing name, so we will take a moment to simplify it.

 

 

In the Project pane, double-click the ct_zcta Table component to open the table.   The zcta5ce10 field gives the ZCTA code for each area in the ct_zcta drawing.  That is an unwieldy name for the field, which we will rename to zcta for clarity.

 

With the focus on the open ct_zcta Table window, we choose Edit - Schema.

 

 

We double-click the zcta5ce10 field.

 

 

In the Field dialog, we enter zcta for the Name and then we press OK.

 

 

Press Save Changes.

 

 

That renames the field to zcta, which will make it easier to keep track of what we are doing in subsequent workflow.  The renamed field appears at the end of the table.  We can use the Layers pane to move it to whatever position we like.

 

 

Back in the map, we double-click the ct_open_space layer tab to turn on that layer.  We have styled the ct_open_space drawing to use bright green for area fill color.

 

 

We zoom into the map, into the Northeast of Connecticut, to take a closer look.   In the illustration above we have used the Layers pane to set 70% opacity for the ct_open_space layer.    That allows us to see how some of the open space areas overlap more than one ZCTA area.

Slice Open Space Areas to ZCTA Boundaries using the Transform Pane

When we derive the percentage of open space within each ZCTA, we want to count only that part of each open space area that falls within the ZCTA.   To make that easy, we will slice each open space area into constituent regions that fall within each ZCTA, using the Overlay Topology, Intersect transform.    Conceptually, it is if we use the borders of each ZCTA as wire frame cutters to slice each open space area into constituent regions.   See the illustrations for Intersect in the Topology Overlays topic.

 

In this particular case, we can either slice the ct_zcta layer using the ct_open_space layer, or vice versa.  Either way we get the pieces we want.

 

With the focus on the open map window, in the Transform pane we choose the ct_zcta layer in the map and the Geom field.   We double-click the Overlay template to launch it.

 

 

 

In the Overlay template we choose the intersect operation.   

 

We choose ct_open_space as the Overlay drawing.

 

For the Result we choose New Table, and then we enter open_zcta for the name of the New drawing.   As we enter the name for the drawing, the pane will automatically fill in an analogous name for the table.   We can change that if we like.  The name could be anything, but we will use a name that reminds us of the operation used to create the drawing.

 

Press Transform.  

 

 

That creates a new drawing, open_zcta, and the drawing's table, open_zcta Table, in the project.

 

 

 

Dragging and dropping the open_zcta drawing into the map, we see it shows the various constituent regions into which the open space areas have been sliced.    We have turned off the ct_open_space layer in the illustration above.

 

The default gray formatting used for both layers makes it difficult to distinguish the gray areas in the open_zcta layer from the gray areas in the ct_zcta layer, so we will style the open_zcta layer to make clearer what is going on.

 

With the focus on the open_zcta layer we choose the Style pane.

 

 

We apply the thematic format seen above, using the CB Paired palette with the aland10 field, which is different for each ZCTA.   We use the equal count method and ten Breaks for our thematic formatting intervals.    Press Update Style.

 

We use the aland10 field and not the zcta field because the zcta field is a text field (the leading zero character is significant in ZIP codes), so it is limited to use of unique values as a transfer method.  Using a numeric field like aland10 is a simpler way to do a casual thematic format.   For the purposes of a useful illustration, all we need is that the various slices of open space areas that fall within different ZCTAs are colored differently.

 

 

The resulting thematic format shows how a big open space area that overlapped four different ZCTA areas has been sliced into four constituent regions, each aligned to the boundary of the ZCTA area that contains it.  Each constituent slice has also been assigned the zcta field, containing the identifier for the ZCTA area that contains it.   

 

In the next step, we will use that field to get a total amount of open space area within each ZCTA.   There are two ways to do that, either using the Join dialog, or by using a query.   

 

Using a query has fewer steps and is simpler.   However, since many people do not feel comfortable with even small queries, we can use an almost completely point and click approach using the Join dialog.  That takes more steps, because we have to prepare the attributes we will use in a multi-step procedure, but since each step except one is totally point and click, that may be easier for some users.

 

Since this is a tutorial example, we will show both ways.  We will begin with the longer, but more point and click, approach.   Users who want to use the quicker approach that uses a query can jump directly to the Sum Open Space within Each ZCTA using a Query section.

Sum Open Space within each ZCTA using Schema and Join

Now that we have sliced the various open space areas into constituent portions that are entirely within each ZCTA area, our next step is to add an attribute field to the open_zcta drawing that gives the area of each constituent portion.  We do that by using the Schema dialog to add a computed field to the open_zcta table.

 

 

We right-click the open_zcta tab in the map and choose Open Table from the context menu.

 

 

That opens the drawing's table.   Most of the fields are now irrelevant, since we do not care about them for our task or because they report values for the entire area and are now inaccurate for the slices.  For example, the aland10 and awater10 fields report the total areas of land and water regions for what was the entire open space area, and must be recalculated or reanalyzed for each constituent slice.   We do not care, since all we want is the total area of each constituent slice.  We will get that by adding a computed field we will call area.

 

With the focus on the open table window, we choose Edit - Schema to launch the Schema dialog.

 

With the focus on the open_zcta Table window, we choose Edit - Schema.

 

 

We click the Add button and choose Field, to start adding a new field to the schema.

 

 

In the Field dialog we enter the name area for the new field.  Since we want to create a computed field, we press the Edit Expression button to launch the expression builder dialog.

 

 

In the expression builder, we enter the expression

 

GeomArea([Geom], 0)

 

in the upper pane.   If we do not remember how to get the area of objects, we simply copy and paste that expression from the example in the Computed Fields topic.

 

Press OK to get back to the new field dialog.

 

 

Press OK to get back to the Schema dialog.

 

 

Press Save Changes to add the new computed field to the table.

 

 

That adds a new computed field, called area, to the table which gives the area of each constituent portion of the open space areas.   

 

We can now use the Join dialog to sum the areas of each constituent portion found within the same ZCTA to get the total open space within each ZCTA.

 

 

Click on the ct_zcta tab to move the focus there.  With the focus on the ct_zcta layer in the map window, we choose Edit- Join to launch the Join dialog.

 

 

In the Join dialog, in the upper right corner we choose the open_zcta drawing as the joined drawing, and we choose contains as the join criterion.

 

We click the Add button to get a pull down menu of available fields from the open_zcta drawing table that we can add to the ct_zcta drawing table.   

 

 

There are very many attribute fields in the table, too many to show all of them in a pull down menu, so we choose the Fields option at the end of the pull down menu to see all fields.

 

 

We scroll down to the area field, which is shown with a computed field icon, and we check the box for that field to add it.   Press OK.

 

 

That adds a row to the Join dialog to add a new field from the open_zcta drawing, the area field.  We click into the center box to change the aggregate method from copy to sum.  That will sum the areas from each of the joined objects.

 

 

Finally, we double-click into the name of the new field to be added to the ct_zcta table, the left most box in the row, and we change the name to total_open, since that field will give the total area of open space in each ZCTA.

 

Press Join.

 

 

A new total_open field appears in the ct_zcta drawing's table.  This field gives the sum of all of the various open_zcta bits and pieces that are contained within each ZCTA area.

 

A much quicker way to do the above is to use a query, which is then directly used by the Join dialog.  The query shows the power of SQL to do in a single query what otherwise takes a series of point and click dialog steps.

Sum Open Space within each ZCTA using a Query

We would like to know the total amount of open space area within each ZCTA.  We can get that using our new open_zcta drawing by adding up the area of each open space area within the ZCTA.  That is easy to do, using a small snippet of SQL.

 

We create a query called open sums query which operates on all objects within the open_zcta Table.    Those are all the constituent parts of the open areas, that we have sliced up by ZCTA.  Creating a query is easy.

 

In the main menu, we choose File - Create - New Query.  We can also right-click into the Project pane and choose Create - New Query.

 

 

In the New Query dialog, we enter open sums query as the name for the new query, and we press Create Query.

 

 

The new query appears in the Project pane.  We double-click the new query to open it.

 

 

The query opens in a blank Command window.   We enter the query text:

 

SELECT [zcta], Sum(GeomArea([Geom], 0)) AS [total_open]

   FROM [open_zcta Table] GROUP BY [zcta];

 

There is no need to save the query, as entering text into the query's command window automatically updates the query.

 

The query uses the GeomArea geometry function to calculate the area of each object.   The query groups objects by zcta value and, using the Sum aggregate function, returns the sum of areas that have the same zcta value, together with that zcta value.   The AS alias provides a memorable name for the column that reports the sum.

 

Running the query by pressing the ! Run button in the main toolbar, we get a results table:

 

 

We do not actually have to run the query for it to be available for our use in the next step, but running it shows what the query does for the purposes of providing a useful illustration in this topic.   The query creates a results table with two fields, a zcta field and a total_open field.

 

For each zcta value, the total_open field in the results table reports the sum of the areas of the various open area polygons within that ZCTA from the open_zcta drawing.    

 

We can now do a simple join to combine the data generated on the fly by the query into our ct_zcta Table.

Join Open Space Totals into the ZCTA Table

We use the Join dialog to join the total_open column that the query generates into the ct_zcta Table, using the zcta field that is in both the query results and in the table as a key field for the join.

 

 

With the focus on the open ct_zcta Table as seen above, we launch Edit - Join.  

 

The ct_zcta Table, is automatically used as the destination, original table.    We choose open sums query in the upper right box as the source, joined component, and we choose the zcta field as the key field for both query and table.    

 

 

We click the Add button to get a pull down menu of available fields from the open sums query that we can add to the ct_zcta Table.   We choose the total_open field.

 

 

A new row appears, indicating we will add a new field to the ct_zcta Table called total_open which will be filled by copying the total_open field values from results generated on the fly by the open sums query.    

 

Note that unlike the alternate approach, the Join dialog need not do any summation of open space bits within each ZCTA because the query already did that to generate a total open value within each ZCTA.  The Join dialog simply copies that result from the query into a permanent field within the ct_zcta Table.

 

Press Join.

 

 

A new field called total_open appears in the ct_zcta Table.   It is filled with the total_open value for each ZCTA taken from the total_open table.

Add a Computed Field giving Percentages of Open Space

We now have all information we need in our ct_zcta Table to get the percentage of open space for each ZCTA.   For each ZCTA we have the geometry of that ZCTA, from which a quick use of GeomArea will provide the area of the ZCTA.  We also have the total area of all open spaces in that ZCTA.   A simple division of total open area by the area of the ZCTA gives the percentage.

 

That is easy to do by adding a computed field to the table, using the Schema dialog.   With the focus on the open ct_zcta Table window, we launch Edit - Schema.

 

 

Click the Add button and choose Field to add a new field.

 

 

We enter the name percent_open for the new field and choose a data type of float64.   We click the Edit Expression button.

 

 

In the Expression dialog we enter the expression:

 

[total_open] / GeomArea([Geom], 0) * 100

 

That is a simple division of the total area of open space in each ZCTA by the area of the ZCTA as calculated using GeomArea.   Multiply by 100 to get a percentage number.

 

Press OK.

 

 

Back in the Field dialog, we confirm the expression for the computed field is what we want, and we press OK.

 

 

Back in the Schema dialog, to commit the change to the table and to add the computed field, press Save Changes.

 

 

A new percent_open field appears in the table, with the contents computed automatically to show the percentage of each ZCTA that is open space.  

 

We can use that new field to thematically format the ZCTAs by how much open space they have.

 

 

With the focus on the ct_zcta layer, we choose the Style dialog.

 

 

We choose the fill color for areas, and the percent_open field as the basis for thematic formatting.  We choose equal intervals as the method, with 10 breaks.    We choose the Yellow-greens Color Brewer palette.

 

Press Update Style.

 

 

ZCTAs with a higher percentage of open space are shown more green, while those with lower percentage of open space are shown more yellow.

 

 

We can turn on the ct_open_space layer to see how having more open space results in a greater percentage of open space in our calculations, as shown by the thematic formatting based on the percentage of open space.

 

 

Zooming in, we can see how having more open space areas in a ZCTA does indeed result in darker green color for the ZCTA.

All Workflow in a Single Query

This topic at the very beginning mentioned how an experienced user might accomplish the entire, multistep workflow shown in this topic with a single SQL query.   Here is one possible version of such a query, written to be directly analogous to the multistep workflow done above in this topic:

 

CREATE TABLE [percent_open_zcta Table] (

  [mfd_id] INT64,

  [zcta] VARCHAR,

  [Geom] GEOM,

  [percent_open] FLOAT64,

  INDEX [mfd_id_x] BTREE ([mfd_id]),

  INDEX [Geom_x] RTREE ([Geom]),

  PROPERTY 'FieldCoordSystem.Geom' ComponentCoordSystem([ct_zcta])

);

CREATE DRAWING [percent_open_zcta] (

  PROPERTY 'Table' '[percent_open_zcta Table]',

  PROPERTY 'FieldGeom' 'Geom'

);

PRAGMA ('progress.percentnext' = '100');

INSERT INTO [percent_open_zcta Table] (

  [zcta], [Geom], [percent_open]

) SELECT [ct_zcta].[zcta], [ct_zcta].[Geom],  

   [open_sums].[total_open] / GeomArea(([ct_zcta].[Geom]), 0) * 100

   FROM [ct_zcta]

   INNER JOIN

      (SELECT [zcta], Sum(GeomArea([Geom], 0)) AS [total_open]

         FROM

           (SELECT [s_zcta] AS [zcta], [s_Geom] AS [Geom]

              FROM CALL GeomOverlayTopologyIntersectPar( [ct_zcta], [ct_open_space],  0,

              ThreadConfig(SystemCpuCount())) )

         GROUP BY [zcta] )

      AS [open_sums]

   ON [ct_zcta].[zcta] = [open_sums].[zcta];

 

A Manifold query can contain more than one statement ending in a semicolon ; character, with each statement being executed in sequence.  The query above contains four statements: two CREATE statements, a PRAGMA, and then finally an INSERT INTO.

 

The query has two main parts:  the first part creates a table and a drawing, and the second part populates the table (which automatically, of course, populates the drawing).    

 

Let us unpack the query to see how it works.   The first part is simple:

 

CREATE TABLE [percent_open_zcta Table] (

  [mfd_id] INT64,

  [zcta] VARCHAR,

  [Geom] GEOM,

  [percent_open] FLOAT64,

  INDEX [mfd_id_x] BTREE ([mfd_id]),

  INDEX [Geom_x] RTREE ([Geom]),

  PROPERTY 'FieldCoordSystem.Geom' ComponentCoordSystem([ct_zcta])

);

CREATE DRAWING [percent_open_zcta] (

  PROPERTY 'Table' '[percent_open_zcta Table]',

  PROPERTY 'FieldGeom' 'Geom'

);

PRAGMA ('progress.percentnext' = '100');

 

The first two CREATE statements create a table and then create a drawing based on that table.   The table is created with four fields: mfd_id, zcta, Geom, and percent_open.   Two indexes are created, one on mfd_id and one on Geom.   The table is created using the same coordinate system as the ct_zcta drawing.

 

Drawings in Manifold contain no data but just consist of a minimum of two properties that say which table they should use for their geometry and which field in that table contains the geometry.   Creating a drawing in Manifold in SQL is thus trivial.

 

The PRAGMA statement creates a progress bar that shows progress as the rest of the query proceeds.

 

The second, main part of the query accomplishes the spatial tasks we did in the example, and then populates the newly-created table:

 

INSERT INTO [percent_open_zcta Table] (

  [zcta], [Geom], [percent_open]

) SELECT [ct_zcta].[zcta], [ct_zcta].[Geom],  

   [open_sums].[total_open] / GeomArea(([ct_zcta].[Geom]), 0) * 100

   FROM [ct_zcta]

   INNER JOIN

      (SELECT [zcta], Sum(GeomArea([Geom], 0)) AS [total_open]

         FROM

           (SELECT [s_zcta] AS [zcta], [s_Geom] AS [Geom]

              FROM CALL GeomOverlayTopologyIntersectPar( [ct_zcta], [ct_open_space],  0,

              ThreadConfig(SystemCpuCount())) )

         GROUP BY [zcta] )

      AS [open_sums]

   ON [ct_zcta].[zcta] = [open_sums].[zcta];

 

The core part of the query is a SELECT statement that accomplishes a JOIN, and within that is a SELECT that does an aggregate Sum, and within that is a SELECT that does an overlay topology intersect operation.   The result of each SELECT is a table that is fed to the next SELECT statement in turn, with the last SELECT feeding the INSERT INTO.   

 

Tech tip:  The classic statement we all learn first in SQL is SELECT, where we pull some fields of interest from a table using the general form SELECT <some fields> FROM <a table>.   For example, we might write:

 

SELECT [Name], [Population] FROM [Cities Table];

 

That creates a results table containing the Name field and the Population field from a table called Cities Table.  But because the result of a SELECT is itself a table, we can use a SELECT within a query where a table is used, as in:

 

SELECT [Name], [Population] FROM (SELECT <some fields> FROM <a table>);

 

In fact, we can use SELECT statements within other SELECT statements as many levels deep as we like, as the main part of our query does.  Remember: no matter how intricate a SELECT statement may be in terms of using clauses like JOIN or GROUP BY, the end result of a SELECT statement is always a table.

 

Back to our query:  the main part of the query accomplishes the four tasks we did step by step in this example, plus the fifth task of updating the newly-created table:

 

 

Here are the different portions, each colored differently and with comments to call out the different portions:

 

-- Update table

INSERT INTO [percent_open_zcta Table] (

  [zcta], [Geom], [percent_open]

) SELECT [ct_zcta].[zcta], [ct_zcta].[Geom],

 

   -- Calculate percentages

   [open_sums].[total_open] / GeomArea(([ct_zcta].[Geom]), 0) * 100

   FROM [ct_zcta]

 

   -- Join open space per ZCTA into each ZCTA

   INNER JOIN

 

      -- Sum open space within each ZCTA

      (SELECT [zcta], Sum(GeomArea([Geom], 0)) AS [total_open]

         FROM

 

           -- GeomOverlayTopology, Intersect

           (SELECT [s_zcta] AS [zcta], [s_Geom] AS [Geom]

              FROM CALL GeomOverlayTopologyIntersectPar( [ct_zcta], [ct_open_space],  0,

              ThreadConfig(SystemCpuCount())) )

 

         GROUP BY [zcta] )

      AS [open_sums]

 

   ON [ct_zcta].[zcta] = [open_sums].[zcta];

 

The result of the SELECT in the GeomOverlayTopology, Intersect portion is a table that has two fields, [zcta] and {Geom].   We can learn how to write that portion of the query by setting up the Transform pane to accomplish the same thing and then pressing the Edit Query button.

 

That resulting table is used in the FROM clause for the next level, the SELECT statement that computes the sum of areas of open spaces within each ZCTA.   The result of that SELECT is also a table (of course: the result of a SELECT is always a table) with two fields in it, [zcta] and [total_open].     That is similar to the small query we wrote in this topic, without the INTO clause.

 

That resulting table is used in the FROM ... INNER JOIN ... clause for the next level, the SELECT which both accomplishes a INNER JOIN and which also uses fields from that INNER JOIN to calculate percentages.   The result of that final SELECT are three fields, [ct_zcta].[zcta], [ct_zcta].[Geom] and the result of the percentage calculation.  Although JOIN implies INNER JOIN by default, and so the "INNER" is often left out, it is a good idea to explicitly write INNER JOIN as we do here.

 

The three fields from that final SELECT are used by the INSERT INTO statement to update the three [zcta], [Geom] and [percent_open]  fields in the new drawing's table.  Note that there is no need to calculate or to update the [mfd_id] field, because that is a special field that automatically updates itself.  

  

Tech tip:  When learning to write SQL and when developing new queries, try running portions of the query to verify they work as expected, and then build up the main query from constituent parts that are known to work correctly.    For example, in the query above, the SELECT statement shown in green can be executed standalone.  If it throws an error when run standalone, the query that uses it will not work either.

 

Open up more than one Command Window, so parts of the query can be copied and pasted into a second session.  For example, copy the SELECT statement in green color, paste that into a second Command Window, and run it.  The results table that appears in the Command window will show if it runs as expected without obvious errors.   Next, Copy the dark blue text, starting with the -- Sum comment, as well as the green text within, and run that, and so on.  

 

The statements create results tables that are virtual (shown only in the Command window and not being written into the project as a table),  and do not modify anything in the project until we get to the INSERT INTO, where the target table is updated, so there is no harm done if they do something incorrectly.   For that matter, we can always delete (in the Project pane) and then create the table again using a third Command window that contains just the CREATE statement.   Using SQL in such an interactive way is a great way to try out ideas and to learn by doing.

 

See a similar sequence of SQL statements in the SQL Example: Extract Airport Runways from an OpenStreetMap PBF  topic.

 

Notes

Terminology - The original table also may be called the target table, and the table providing data also may be called the source table.

 

Quick reference - With the Join dialog open, press F1 to open a quick reference guide in a browser window.

 

Added fields are static - Added fields are static, that is, if data in the joined table changes the fields in the original table that received data in the Join operation from the joined table will not automatically be updated.  We can easily update the table any time we want using a saved update query.  

 

Why setup, and then update? -  Why does the Setup Join and Edit Query button first add the new field desire and then generate an UPDATE query, instead of generating a query that both adds the new field and also does the UPDATE?  Two steps are used to enable repetitive use of the same query to adjust what data is joined into the new field.   If the generated query also created that new field, to run that query multiple times, each time we ran it we would first have to delete the newly created field from the schema, so that part of the query would not cause an error.   By generating only an UPDATE query we can modify it and re-run it however many times we want, to try out different variations of how we UPDATE the table.

 

Why only in the same data source? - The Join dialog works only with tables and queries within the same data source to keep the dialog simple and fast.   An important part of keeping the dialog simple is to show only those tables or queries in the pull down list for the joined table that can be used in a join.   Scanning all tables and queries in the same data source is reasonable, but if there are many data sources or many levels of nesting (data sources within data sources) there could be hundreds of thousands of tables and queries within the hierarchy to scan.  For people working in map projects without such elaborate data sources, most of the time what they do will be within the .map project data source so everything works by default.  To use tables and queries from other data sources as the joined table in the Join dialog we can quickly add a simple aliasing query, as illustrated in the Join Example: Joins between Data Sources topic.  This method also makes it possible for the same Join dialog to work within other data sources, when aliasing queries refer back either to the .map project data source or to other data sources.   Almost all queries can be used as the source, joined table.  The Join dialog also can also use a query as the target, original table, to be used to copy data into the table through the query, as shown in the Join Example: Join into a Query and Join Example:  Join into a Query into a Selection topics, if the result table is writable and the Join does not attempt to create new columns in the table.

 

Buy and read a Fehily book on SQL - Manifold recommends Chris Fehily's excellent books on SQL.  Chris makes learning SQL really easy.  

 

Videos

Join Videos

 

Find Percentages of Open Space in ZIP Code Area - Find the percentage of open space in each ZIP code area given a layer of polygons representing ZIP codes and a layer of polygons showing open spaces like parks and green spaces. This video shows how to do that start to finish in a few simple steps, from initial importing of shape files to final results, in just five minutes, with an additional few minutes of explanation what each step does. Works in Manifold Release 9 or using the free Manifold Viewer.

See Also

Tables

 

Queries

 

Schema

 

Join

 

Join Examples

 

Topology Overlays

 

Command Window

 

Command Window - Query Builder

 

Editable Results Tables

 

CSV Servers

 

ZIP Codes are Not Areas

 

SQL Example: Extract Airport Runways from an OpenStreetMap PBF - We write a simple SQL query using INNER JOIN to extract runway lines from an OpenStreetMap PBF of Cyprus, and to save those lines to a new drawing and table.