Example: Trace Vector Areas from Raster Pixels

This example follows the Trace Vector Areas from Raster Pixels video on the Gallery page.   We use the Trace template in the Transform pane for images to create a drawing with vector areas covering regions of similarly-colored pixels.  Next, we use a simple query to add classification codes from a USGS table of classes to the resulting drawing, using a simple INNER JOIN SQL statement.   

 

This example uses the same data set as is used in the Example: Import CTG Grid Cell File and Style topic.   Download the project used in this example from the Downloads page.

Import a Raster Image

We will first import a raster image.  We then explore the image, opening the image's table and taking a look at how it is styled, to understand what data that image shows.

 

Import a CTG LULC Grid Cell file containing raster data showing land use and land cover as shown in the Example: Import CTG Grid Cell File and Style topic.

 

 

There are four images created, called grid_cell Census, grid_cell Hydro, grid_cell LULC and grid_cell Political,  all created from the same grid_cell Tiles table.

 

 

Opening the table we see it has four different tile columns.   Each of the four images is created from a different tile column.   This is a typical arrangement for data imported from CTG format.

 

 

We are interested in the grid_cell LULC image, so we drag and drop that into a map, with a background layer using a Stamen Terrain data source (available in the JOSM sources project that can be downloaded from the Downloads page) and an upper layer consisting of Google's transparent street maps layer.    We can see that the grid_cell LULC image covers a region to the east and south of San Jose, California.

 

 

To simplify the presentation, we remove the other layers from the map and zoom in to the grid_cell LULC layer.  LULC stands for Land Use and Land Classification.   The image uses a limited palette of colors where each color represents a specific type of land use.   

 

 

Opening the Style pane for the image, we can see it uses a single channel and has been formatted using a palette.    The palette is used for purely visual presentation.   

 

A more technically accurate palette would have used unique values and not equal intervals as the tally method.  The method used does not really matter since how the image is styled for presentation does not in any way change the data in the pixels.   When we create areas, the Trace template will look only at the data content of the pixels and will not care how that data is styled for presentation.

 

 

Most raster images that we work with in GIS will be based on a table where there is only one tile field.   This image is created from a table that has four tile fields, so we will need to know which tile field it uses.   To find that out, we right-click on the grid_cell LULC image and we choose Properties.

 

 

The Properties dialog tells us the image uses the Tile1 field in the table. Press OK or Cancel to close the Properties dialog.

Trace Areas

Actually tracing areas takes very little time.    With the focus on the map, in the Transform pane we choose the grid_cell LULC layer and the Tile 1 field.  We then double-click on the Trace template to launch it.

 

 

In the Trace template, we enter 1 for Similarity, since we want each area to be created only for a single, integer value in pixels.   We check the Split into shapes option so that non-contiguous regions that have the same value for their pixels will result in separate area objects.

 

The only allowed Result destination is New Table, since a new drawing must be created from the raster image.   For the name of the new drawing we enter Trace areas, using an analogous name for the new table.   We could use whatever names we like, but it is wise to specify names that remind us what the new components contain.

 

If we would like to see a preview, we press the Preview button.  We do not have to do a preview before applying a transform, but taking a moment to check our setup with a preview can be a great way to avoid errors.

 

 

A preview appears in blue preview colors, in a virtual layer above all other layers and contents in the map window.   A blue preview caption bar appears at the top of the window giving the name of the template being previewed.   We can right-click the caption bar for a context menu of commands to control the preview display.

 

 

For example, if we would like to see layers below the preview, we can right-click the caption bar and choose 50% opacity to render the preview with partial opacity, so what is below can be seen.  

 

 

We can right-click the caption bar and choose Left or Right to show the preview only on the left or right side of the window, dragging the thin blue vertical separator line left or right to adjust the width of the preview.

 

If we like what we see, to apply the template we press Transform.  

 

A new drawing and table called Trace areas appear in the Project pane.   

 

 

We drag and drop the Trace areas drawing into the map window as a layer.

 

 

We drag and drop the new drawing into the map.  It appears in default gray formatting.

 

 

We open the new drawing's table.  We see that each new area that covers a region of similarly-colored pixels has a Value attribute that provides the number of the color for the pixels in that area.   The template has created approximately 1,800 area objects, with each classed with a value corresponding to the pixel color value in the original raster.  

 

The Split into shapes option, when checked (default), will automatically split all areas for the same class of color into separate area objects, as seen in the table above.   Unchecking this option will result in the creation of a single, multibranched area object for each class.  This makes for a simpler results table and the ease of selecting all like areas with a single Ctrl-click, but when starting with larger raster data it can result in very large areas, consisting of millions of coordinates to cover all of the branches.

 

 

For example, had we unchecked the Split into shapes option, and then run the transform operation,  the template would have created about 30 different area objects, each of which would have had many branches.    The table below is an example:

 

 

What would still appear in the drawing to be 1,800 areas would be fewer than 30 area objects, each of which has many branches.  Each area is be multi-branched, so that all of what would appear to be independent areas with a Value of, say, 11, would all be branches of one very large and complex area.  

Styling the New Drawing

We can create a more appealing display by styling the drawing so it does not use default, gray format.

 

 

With the focus on the grid_cell LULC drawing layer in the map, in the Style pane we click on the right-hand side of the Fill Color button, we choose Value as the field to use, unique values as the method for 100 values (there are fewer than 100 unique values but using 100 ensures we will get all of them), we choose the Color Brewer CB Paired palette, and then we press Update Style.

 

 

Adding pretty colors makes the display more appealing.  We have zoomed in to the view to see how individual areas are colored differently.

 

 

We can zoom out to see how the new drawing shows many different areas, each with a Values code corresponding to the raster color from which it was created, which all together show the various land uses in the region near San Jose.

 

At this point we are done with the tracing task.  We have used the Trace transform template to create a new drawing and to fill it with areas that each cover a region of similarly-colored pixels in the image.

Adding Classification Codes

For extra credit, we can continue on in this example by adding classification codes to the new drawing's table.   

 

 

The drawing's table lists a Value for each area.   We would like to add an additional column to the table that provides a Code field providing the text description of what each Value is intended by USGS to mean.

 

 

To do that, we will use a table of USGS codes.   We have created a table called LULC_classes (available in the project used in this example that may be downloaded from the Downloads page) that lists the classification codes defined by USGS for LULC raster data.   Each record contains a Value and a Code.     

 

Our task is to add a text column to the Trace areas table called Code and to then populate that column for each record with the text descriptions for Code taken from the LULC_classes table that match the Value of the area.   We have two ways to do that in Manifold, and will show both in this example:

 

 

Add a Column using Edit - Join

With a few clicks we can add the desired column using the Edit - Join dialog.

 

 

With the focus on the Trace areas Table window, we choose Edit - Join in the main menu.

 

 

In the Join dialog we choose LULC_classes from the pull down menu for the joined table box.   Manifold guesses we want to use the Value field as the key field for the joined table.   That is a good guess, so we leave it in place.   If Manifold guessed wrong, we could easily pick the Value field from the pull down menu.

 

 Press the Add button to add a new field.  The pull down menu shows fields in the LULC_classes table.  We choose the Code field, which gives the text description for each class.

 

 

Press Join.   Done!   

 

 

The new Code field appears in the Trace areas Table, correctly populated with the right text description for each Value, automatically copied from the LULC_classes table.

 

Joins are really easy using the Edit - Join dialog.

Doing a JOIN with SQL

If we want to learn more about doing JOINs, a very powerful capability of SQL, we can achieve the same task using a query.   This requires a bit more work than using the point and click Join dialog, but mastering SQL for such tasks opens the door to endless automation and flexibility.

 

 

If we open the Schema dialog for the LULC_classes table, we see that the table has a btree index called Value_x on the Value field.  This is an important detail, because the query which we will run below requires that index.

 

Our task is to add a text column to the Trace areas table called Code and to then populate that column for each record with the text descriptions for Code taken from the LULC_classes table that match the Value of the area.   

 

We can do that with a simple query that uses ALTER TABLE to add the column and then UPDATE and INNER JOIN to populate the column.  The example project contains the query text in a query called Update with class codes.   We can run the query by right clicking it in the Project pane and pressing the ! Run button, or we can double-click the query to open it in the Command Window to take a look at the query text.

 

 

The first part of the query adds a Code field of type nvarchar and it adds a btreedup index on the Value field that will work with the index on the Value field in the LULC_classes table:

 

ALTER TABLE [Trace areas Table] (

  ADD INDEX Value_x BTREEDUP (Value),

  ADD Code NVARCHAR

  );

 

The second part of the query uses UPDATE to fill in values of the new Code field for each record, using an INNER JOIN to associate the correct Code string for each corresponding Value in the record.   

 

UPDATE (

  SELECT

    a.mfd_id,

    a.Code AS a_Code,

    c.Code AS c_Code

  FROM [Trace areas Table] AS a

  INNER JOIN [LULC_classes] AS c

  ON a.Value = c.Value

  )

SET a_Code = c_Code;

 

The query text uses AS to alias the longer names of the tables to single characters, aliasing Trace areas Table to the letter a, a mnemonic for "areas" and aliasing LULC_classes to c, a mnemonic for "classes".     That enables the query text to be written in much more compact form.   For example, instead of writing [Trace areas Table].Code we can simply write a.Code.   Instead of writing [LULC_classes].Code or [LULC_classes].Value we can simply write c.Code or c.Value.    Besides saving much keyboarding and avoiding typographic errors, the use of AS to alias long table names to short aliases will also help make queries more legible.  

 

Note that even though the use of AS to alias the longer table names to single characters happens in the FROM and INNER JOIN parts of the query, we can all the same use those alias right at the very beginning just after SELECT, with constructions like a.mfd_id.  The aliasing with AS is all part of the same SELECT so the query engine accepts it throughout.

 

We can also use AS to rename an aliased field in the results table, as in the constructions

 

  a.Code AS a_Code,

  c.Code AS c_Code

 

Without the renaming...

 

SELECT

  a.mfd_id,

  a.Code,

  c.Code

 

... the SELECT (shown in red since it is incorrect) would be a command to create a results table with an mfd_id field and two fields called the same name, Code, since the results table takes fields as they are named, without prefacing them with a [table name] and a dot.  We can't have two fields with the same name in a table so we rename them in the SELECT results to a_Code and c_Code.  We do not have to rename both of them, since we could have, for example, a field in the results table called Code and a second field that was renamed to be called c_Code, but it is good for clarity to be consistent and rename both.

 

 

When we run the query the Trace areas Table acquires a new Code field that is populated for each record with the correct Code string for that record's Value, the string being taken from the corresponding Value in the LULC_classes table.

Using the New Field

Whichever way we have added the new Code field, either using the Edit-Join dialog or by using SQL, we can now use that new Code string value however we like.  

 

 

For example, we can dynamically see what the meaning is of each land use area by Alt-clicking the area in the Trace areas Drawing.

 

 

That picks the area as the subject for the Info pane, automatically popping open the Info pane to show the values for that record.    The area that was Alt-clicked is contiguous, that is, a single area that is not branched.  The template can create very large, complex areas, with tens of millions of coordinates.

 

 

We can see that the area we Alt-clicked is coded with a Value of 42, which represents Evergreen Forest Land.  

 

 

We Alt-click another area.

 

 

This other area has a Value of 43 which is the Code for Mixed Forest Land.  

Notes

Indexes and INNER JOIN - For the INNER JOIN in the query above to work, we need an index on the Value field in one of the tables that will participate in the INNER JOIN.  The LULC_classes table has been created with an index on the Value field.  The example shows creation of a btreedup index in the Trace areas table.  This is not really necessary in such a small data set, but if we were working with a very large result table, the process goes faster with a btreedup index on the Value field of the trace areas results table.  With indexes in both tables on the field that participates in the INNER JOIN, the join can work much faster.  We use a btreedup (allows duplicates) index on the Value field of the in the Trace areas table, because if the Split into shapes option is checked (the default), many areas with the same Value, that is, duplicates, could be created.  

 

Area borders - The area borders in this topic have been styled to use a border thickness of 0.5 points, not the default 1 point.

 

Classification Codes - The value for each pixel in the grid_cell LULC image is a classification code that can be matched to a palette color to provide a more understandable display,.  USGS defines the following codes for LULC data like this data set:

 

1 Urban or Built-Up Land

11 Residential

12 Commercial Services

13 Industrial

14 Transportation, Communications

15 Industrial and Commercial

16 Mixed Urban or Built-Up Land

17 Other Urban or Built-Up Land

 

2 Agricultural Land

21 Cropland and Pasture

22 Orchards, Groves, Vineyards, Nurseries

23 Confined Feeding Operations

24 Other Agricultural Land

 

3 Rangeland

31 Herbaceous Rangeland

32 Shrub and Brush Rangeland

33 Mixed Rangeland

 

4 Forest Land

41 Deciduous Forest Land

42 Evergreen Forest Land

43 Mixed Forest Land

 

5 Water

51 Streams and Canals

52 Lakes

53 Reservoirs

54 Bays and Estuaries

 

6 Wetland

61 Forested Wetlands

62 Non-forested Wetlands

 

7 Barren Land

71 Dry Salt Flats

72 Beaches

73 Sandy Areas Other than Beaches

74 Bare Exposed Rock

75 Strip Mines, Quarries, and Gravel Pits

76 Transitional Areas

77 Mixed Barren Land

 

8 Tundra

81 Shrub and Brush Tundra

82 Herbaceous Tundra

83 Bare Ground

84 Wet Tundra

85 Mixed Tundra

 

9 Perennial Snow and Ice

91 Perennial Snowfields

92 Glaciers

 

Oceans may be coded with a value of 2000000102 or 0.

 

Videos

Trace Vector Areas from Raster Pixels - An early version of the topic using SQL.

 

Join Dialog Part 1 -  Spatial Joins - See the new Join dialog in action, effortlessly doing spatial joins without SQL.  Using a map with two layers, a cities layer showing cities in the US and a states layer showing states in the US we add the name of the state for each city to the cities attributes.   Next, we take data from the cities layer and add it to the states layer, summing up the populations for cities in each state to get a total population for the state, plus maximum and minimum city populations in that state.  Works in Viewer, too.

 

Join Dialog Part 2 - Joins between Tables - A fast-paced video showing three examples of joins between tables, with no SQL required:  We add a Publishers field to each title in a list of books (one to many) and then we show how with a single click we update the table to incorporate changes made and new data added.  Next, we add three new fields to the Publishers table to harvest data from books (many to one) to create fields that contains a list of titles, the average sales price of a book and the last publication date.   Finally, we show how to alter the saved query to update the list of titles into JSON format. Wow!  Works in Viewer, too.

 

Join Dialog Part 3 - Raster to Vector Joins - The third, action-packed video on the new Join dialog shows effortless raster to vector joins using a point and click dialog with no SQL required. Everybody thinks of joins as just between tables, but taking data from a raster to load into attributes in vector objects is just another join.  We use a single-channel raster that shows terrain elevation near Livermore, California, and use a point and click join to transfer the height of the terrain to points in a drawing layer above that terrain.  Next, we switch to a vector layer of parcels as areas, and transfer the average height of the terrain under each parcel, the maximum height and the minimum height, all done effortlessly with a point and click.  Along the way we show how to customize the update query to round values.  Easy!  Works in Viewer, too.

 

Find Percentages of Open Space in ZIP Code Areas  -  Uses an SQL JOIN query.  Given a layer of polygons representing ZIP codes and a layer of polygons showing open spaces like parks and green spaces, find the percentage of open space in each ZIP code area.   This video shows how to do that start to finish in a few simple steps, from initial importing of shape files to final results, in just five minutes, with an additional six minutes of explanation what each step does.  Works in Manifold Release 9 or using the free Manifold Viewer.

 

See Also

Style

 

Info Pane

 

Style Pane

 

Transform Pane

 

Transform Reference

 

Tables

 

Queries

 

Edit - Join

 

Command Window

 

Command Window - Query Builder

 

SQL Operators

 

JOIN Statements

 

Example: Use Edit - Join to Map a Pandemic - Creating a data source for a CSV web server, we acquire current data on cases and deaths worldwide for the Covid-19 pandemic.  We then use the point and click Edit - Join dialog to rapidly join that data, automatically aggregated as we desire, into a world map for visualization.

 

Example: Create a Map Showing OSM Use by Country - A start-to-finish real life example of map creation that combines various Manifold capabilities, including use of Edit - Join.  Copying a table of numbers from a web site, we create a map that is thematically colored to show usage of OpenStreetMap by country in proportion to the population of that country.

 

Example: Import CTG Grid Cell File and Style - A companion topic to the Example: Import GIRAS vector LULC File and Style topic.  We import a CTG LULC Grid Cell file containing raster data showing land use and land cover and then we use Style to provide a more understandable display.

 

Example: Vector to Raster using Kriging - Using the Interpolate, Kriging Transform pane template, we take a vector drawing of contour lines where each line has a Height attribute and we create a raster image that is a terrain elevation surface.   This example, uses a point and click Transform template to accomplish the same task as shown using an SQL query in the SQL Example: Kriging topic.

 

Example: Contour Areas and Contour Lines - In this example we use the Contour Areas transform template in the Transform pane for images to create a drawing with vector areas showing height contours at desired altitude steps.   We color the areas using the attribute fields automatically created by the template.  Next, we apply a similar procedure using the Contour Lines transform template to create a drawing with vector lines showing height contours at the desired intervals.