Street Address Geocoding

Street address geocoding is the process of determining an estimated latitude and longitude position for the location of a street address.  Reverse geocoding produces a list of street addresses near a given latitude and longitude position.   Given a geocoding data source, Manifold can do street address geocoding.  If the data source in use supports reverse geocoding, Manifold can do reverse geocoding as well.

 

A very common task when working with spatial data is to take a table where each row has a street address and to add latitude and longitude fields for the estimated location of that address.

 

For example, suppose we have a table of ten records where each record gives the restaurant number and the street address of an In-N-Out fast food restaurant in California, Utah or Texas.  This is the same table used in the Example: Street Address Geocoding topic.

 

 

The classic GIS way to street address geocode that table would be to add the estimated latitude and longitude of each restaurant.

 

 

In Manifold we normally jump straight to creating a geom field that contains a point at the right location for each record.

 

 

We can then see the location of each restaurant in a drawing.

 

 

The map above shows the geocoded restaurants drawing as a layer above a Bing Street Map layer.

The Geocoding Process in Manifold

Manifold geocodes a table of addresses using geocoding data sources, which are normally web-based geocoding servers like the Google geocoding server.     We use SQL within Manifold to use a Manifold SQL geocoding function such as GeocodeAddress to geocode records in the table, either to update the table or to SELECT ... INTO to create a new table.   

 

The GeocodeAddress function takes as arguments a data source in the project and the name of an address field that contains an address as a string such as a varchar data type.   The function returns the longitude/latitude location of that address as given by the data source, providing it in Latitude / Longitude projection (WGS 84 base) as decimal coordinates in a float64x2 value.

 

At the present writing Manifold has nine web-based geocoding data sources built in, from Bing to Yandex.   Geocoding servers vary wildly in their characteristics with some being free to use, some requiring an API key, some allowing a limited number of free geocodes per day or per second with additional geocodes requiring a payment.   Google, for example, used to provide free access very generously.  These days, you probably will need an API key from Google to use their geocoding server.

 

These examples use Microsoft's Bing geocoder.  We can get a free developer key from Bing by visiting the Bing Maps Dev Center, creating an account (free), logging in, and then under My account drilling down to get a free API key, suitable for testing and development, but not allowing large scale geocoding for production use.

Create a Geocoding Data Source

We must have a geocoding data source in our project to do any geocoding.   We will create a geocoding data server using Microsoft's Bing geocoder.   

 

In the main Manifold menu, choose File - Create - New Data Source.  

 

Choose More... to launch the New Data Source dialog.

 

 

Enter a short, memorable name for the geocoding data source.  We will be keyboarding this into queries that do geocoding, so we will keep it short.  In the example, we use bing.  Choose Web Server: geocodeserver as the Type and then in the long list available in the box below, choose Bing Geocoder.

 

When we visited Microsoft's Bing Maps Dev Center, signed up and obtained a free API key, we copied and pasted that API key to some safe storage location on our computer.  We now Copy that API key, and then we Paste it into the API key box where the phrase "Paste an API key here" has been inserted into the illustration above.

 

Press Create Data Source and a new data source called bing will appear in our project.

 

To try out the data source, launch a Command Window and enter the expression

 

VALUES (GeocodeAddress([bing], 'Chicago'));

 

 

Press the ! button in the main menu to run that query.  

 

 

Bing will be happy to return the result

 

[ -87.63240051269531, 41.88322830200195 ]

 

which gives the longitude and latitude of where Bing considers the center of Chicago to be.  

 

Geocoding servers also vary in their ability to digest what we might think is a reasonable street address.  Because most were designed to handle geocoding for web sites where users tend to specify addresses in a wildly inconsistent manner, most are reasonably good at being able to parse a wide range of addresses.   For example, most geocoders can parse an address that consists of the name of a big city, like our example above.   

 

Why do geocoders like Bing think that the string 'Chicago' fed to them means the big city of Chicago in Illinois and not the locations also named Chicago in South Africa, Zimbabwe, Guatemala or Mexico?  Most likely because web searches for Chicago seek the big city in Illinois over the others by a wide margin.

 

If we provide a more detailed street address most geocoding servers can do even better.  Bing is happy to recognize an address such as the address for the In-N-Out restaurant in Arroyo Grande, California:

 

VALUES (GeocodeAddress([bing], '1170 W. Branch St., Arroyo Grande, CA 93420'));

 

 

In the illustration above we have arranged the query to use two lines.  SQL is not sensitive to white space, so we can arrange what we write on multiple lines, with indented lines as we like to improve legibility.    Press ! and Bing will return the result of

 

[ -120.597913, 35.127259 ]

 

...which we can copy directly from the result table below:

 

 

That means that Bing thinks the In-N-Out restaurant at that address is located at latitude 35.127259 and longitude -120.597913.

 

The above examples show how we can quickly test that a geocoding server we have added as a data source is working correctly.  We do not actually geocode one address at a time, but instead we use Manifold to geocode many addresses at once, as shown in the Example: Street Address Geocoding topic.

Geocoding Functions

The following Manifold SQL functions assume a geocoding server as a data source.   We use the Bing geocoder as an example, but we could have used one of the other geocoders, such as the Google geocoder:

 

GeocodeAddress(<dataSource>, <address>): <pointx2>

Given a geocoding data source and an address, returns the longitude/latitude coordinates as a float64x2 value.

 

Example: using a Bing geocoding data source, returns the longitude,latitude coordinates for the specified street address in California.

 

? GeocodeAddress([bing], '1170 W. Branch St., Arroyo Grande, CA 93420')  

  

Example: using a Bing geocoding data source, returns the longitude,latitude coordinates for 'Chicago'.

 

? GeocodeAddress([bing], 'Chicago')

 

Running the above Command Window example requires creating a Web Server: geocodeserver data source from the Bing Geocoder and naming it bing.  We can then use [bing] in the function.

 

GeocodeAddressMatches(<dataSource>, <address>): <table>

Given a geocoding data source and an address, returns a table of matches. Each match is a string with the format of the string depending on the geocoding server in use.   Most geocoding servers return JSON.

 

? CALL GeocodeAddressMatches([bing], 'W. Branch St., Arroyo Grande, CA 93420')

 

(In the Command Window when trying the above, hover the mouse over a Results table cell to see the JSON for that cell in a tooltip.)

 

Running the above Command Window example requires creating a Web Server: geocodeserver data source from the Bing Geocoder and naming it bing.  We can then use [bing] in the function.

 

GeocodeAddressSupported(<dataSource>): <value>

Given a geocoding data source, returns true if the data source supports GeocodeAddress and GeocodeAddressMatches functions.  

 

? GeocodeAddressSupported([bing])

 

Running the above Command Window example requires creating a Web Server: geocodeserver data source from the Bing Geocoder and naming it bing.  We can then use [bing] in the function.

 

GeocodeLocationMatches(<dataSource>, <pointx2>): <table>

Reverse geocoding.  Given a geocoding data source and a longitude/latitude location as a float64x2 value, returns a table of matches around that location.

 

? CALL GeocodeLocationMatches([bing], VectorMakeX2(-120.597913, 35.127259))

 

(In the Command Window when trying the above, hover the mouse over a Results table cell to see the JSON for that cell in a tooltip.)

 

Running the above Command Window example requires creating a Web Server: geocodeserver data source from the Bing Geocoder and naming it bing.  We can then use [bing] in the function.

 

GeocodeLocationSupported(<dataSource>): <value>

Given a geocoding data source, returns true if the data source supports the GeocodeLocationMatches function.   See the Street Address Geocoding topic.

 

? GeocodeLocationSupported([bing])

 

Running the above Command Window example requires creating a Web Server: geocodeserver data source from the Bing Geocoder and naming it bing.  We can then use [bing] in the function.

 

GeocodeSearchMatches(<dataSource>, <centerx2>, <radius>, <address>, <filter>): <table>

Searches for geocoding matches for an address with a filter (eg, 'hotel') near a lat/lon location with or without radius in meters.

GeocodeSearchRectMatches(<dataSource>, <boundsx4>, <address>, <filter>): <table>

Searches for geocoding matches for an address with a filter in a lat/lon rectangle.

GeocodeSearchSupported(<dataSource>): <value>

Checks whether a geocoding data source supports GeocodeSearchMatches and GeocodeSearchRectMatches functions.  

 

If we create a data source that we call bing for the Web Server: geocodeserver choice called Bing Geocoder, we can try:

 

? GeocodeSearchSupported([bing])

 

 

See the Example: Street Address Geocoding topic for examples using the GeocodeAddress function.

 

Reverse Geocoding

Reverse geocoding is the process of finding street addresses near a given latitude/longitude location.   To do reverse geocoding, we use the SQL query function GeocodeLocationMatches in an SQL query.

 

In Manifold, reverse geocoders are built into whatever geocoding data source we are using, if that geocoding data sourcesupports reverse geocoding.  

 

We can see how that works when we create a new geocoding data source using File - Create - New Data Source.  

 

 

 When we created our bing geocoding data source, we could have pressed the [...] browse button to open the Web Login dialog within the New Data Source dialog.

 

 

The upper part of the Web Login dialog shows the connection strings that will be used by the bing geocoding data source we create to access various Microsoft geocoding services, such as reverse geocoding.  We can edit those strings if Microsoft changes the URLs to be used for the various services, so we can keep using them until the next Manifold build updates them.

 

We therefore can use the same bing geocoder we created earlier to do reverse geocoding.   We simply use a different geocoding function, the GeocodeLocationMatches function.

 

The syntax of the  GeocodeLocationMatches function is:

 

GeocodeLocationMatches(<dataSource>, <valuex2>)

 

To try out the reverse geocoder, we will reverse geocode the longitude and latitude we obtained for the In-N-Out restaurant in Arroyo Grande, California.   in the Command Window we enter the expression:  

 

? CALL GeocodeLocationMatches([bing], VectorMakeX2(-120.597913, 35.127259))

 

We use a CALL (no semicolon at the end) because the function we use returns a table.   We use the VectorMakeX2 function to create a valuex2 type vector value from the two numbers we provide, which are the longitude and latitude values we obtained from our test for geocoding the 1170 W. Branch St., Arroyo Grande, CA 93420, address.

 

 

The illustration above formats the query using two lines, so it will fit into the illustration.   Press ! to evaluate the expression.

 

 

The result is a table with an index, zero meaning the first record, and a very long Value that is what Bing returned in JSON notation.

 

{"__type":"Location:http:\/\/schemas.microsoft.com\/search\/local\/ws\/rest\/v1",

"bbox":[35.123396282429326,-120.60420985363777,35.131121717570679,-120.59161614636224],"name":"1170 W Branch St, Arroyo Grande, CA 93420","point":{"type":"Point","coordinates":[35.127259,-120.597913]},"address":{"addressLine":"1170 W Branch St", "adminDistrict":"CA","adminDistrict2":"San Luis Obispo Co.","countryRegion":"United States","formattedAddress":"1170 W Branch St, Arroyo Grande, CA 93420","intersection":{"baseStreet":"W Branch St","secondaryStreet1":"Browns Rd","intersectionType":"Near","displayName":"W Branch St and Browns Rd"},"locality":"Arroyo Grande","postalCode":"93420"},"confidence":"High","entityType":"Address", "geocodePoints":[{"type":"Point","coordinates":[35.127259,-120.597913],"calculationMethod":"Rooftop", "usageTypes":["Display"]}],"matchCodes":["Good"]}

 

We have to read Bing's documentation to understand what Bing returned, but even without doing that we can guess that Bing found the 1170 W. Branch St., Arroyo Grande, CA 93420, address at that location.

Geocoders Provide Approximate Locations

As anyone knows who has followed a web-based mapping application to a location provided for a given street address, such addresses are usually only approximate and can be wildly inaccurate.   Geocoders usually do not know where a specific address is located; instead, they maintain a database of street segments with a range of addresses for each particular street segment.   If a Main Street segment along a particular city block is stored with a range of addresses from 20 to 40, an address at 30 Main Street will usually be interpolated as halfway down the block without knowing exactly where the address is located.   In regions of sparse inhabitation, such as rural areas, geocoders can be wildly inaccurate and may interpolate address locations that are kilometers away from the actual location.

 

As the largest web providers become better at data mining their users, triangulating travel using the locations of surveilled cell phones and at blending information from multiple data sources, more refined geocoding strategies have become possible.   Geocoders will often no longer do interpolations within well-known urban areas but instead will utilize databases of structures maintained by cities or private providers to correlate addresses to specific locations.   Geocoders are also getting better in rural locations as sweeping efforts to digitize the exact locations of specific addresses (down to an access road or gate in the case of ranch and farm properties) in service of better emergency response have resulted in the accumulation of databases of exact address coordinates in many rural areas.

 

But even in such cases the exact location where to place a dot to denote the location of a particular address is often a matter of choice.   Should a dot be placed at a centroid of the real estate parcel associated with a given street address?  For large parcels, such as the one at 1600 Pennsylvania Avenue in Washington, a dot placed at the centroid would be far from any access road or entry gate.   Should the location dot be placed at the main entry gate?   At the postal box for the parcel?  At the center of the main building?   At the center of the facade of the main building that fronts the road used for the address?  Should the point be on the boundary of the parcel or along the edge of any sidewalk?

 

Such factors may seem like minor details, but they can have a big effect on how locations derived from street addresses are used for emergency service response, parcel delivery, computations involving real estate parcels or even simply finding a restaurant.  In the examples above several of the In-N-Out restaurants are located in shopping centers with the address of the restaurant being the address of the shopping center.   Find a dot at the centroid of a shopping center may be easy enough but then locating a restaurant that could be hundreds of meters away might not be so easy if it is not immediately in sight.

 

See the Example: Street Address Geocoding topic for an example of how two different geocoders may place the same address at two significantly different locations.

 

Getting Full Information from a Geocoding Server

Most geocoding data servers return a JSON string when asked to geocode a given address.   Some servers provide additional information within that JSON string, which we can acquire using the GeocodeAddressMatches function and then extract those parts of the JSON string of interest using the StringJsonValue function.

 

 

Suppose we have created a data source called bing which is a geocodeserver data source using the Bing geocoder web server.    Instead of "Paste an API key here" we would enter the Bing API key we have obtained from Microsoft (use Google to learn how to get a free Bing API key).  

 

The following query geocodes an address and returns a record for each match with the name and confidence returned by Bing for each:

 

 

The query text is:

 

SELECT

  StringJsonValue([value], 'name', true) AS [name],

  StringJsonValue([value], 'confidence', true) AS [confidence]

FROM CALL GeocodeAddressMatches([bing], 'New York, Park Ave, 4');

 

 

Press ! in the main menu to run the query.

 

 

The result shows two matches with the confidence for each, which we have arranged into an understandable table using Manifold's StringJsonValue function.

 

The following query expands the above by adding the coordinates returned for each match, plus the latitude and longitude values, plus the calculation method returned by Bing.

 

 

That is a larger query, so we have resized the Command Window to make it larger.  The query text is:

 

FUNCTION decodeMatches(@matches NVARCHAR) TABLE AS (

  SELECT

    [index],

    CAST (StringJsonValue([value], 'coordinates', false) AS FLOAT64X2)

      AS [coord],

    StringJsonValue([value], 'calculationMethod', true) AS [method]

  FROM CALL StringToJsonArrayValues(@matches)

) END;

SELECT

  StringJsonValue([value], 'name', true) AS [name],

  StringJsonValue([value], 'confidence', true) AS [confidence],

  SPLIT CALL decodeMatches(StringJsonArray([value], 'geocodePoints'))

FROM CALL GeocodeAddressMatches([bing], 'New York, Park Ave, 4');

 

The query defines and uses a function, to compartmentalize the process of decoding the JSON values returned by Bing.   When we press ! to run the query, we get more information:

 

 

See Microsoft's documentation for Bing for a current list of what information Bing returns when geocoding addresses.

Notes

Got keys?  -  Most of the many geocoding servers Manifold lists provide free API keys for developer use.  Some provide some "reasonable" number of free geocodes per day with no key required.  Policies change quickly, and some geocoding server providers may without warning ban a user they feel has exceeded the guidelines for free use.  It is best to get an API key which allows some specific number of free or paid geocodes and to then follow the rules.   If we are in search of a free geocoder and Bing and Google do not work for us, try one of the others.  

 

NULLs - When a geocoding server returns a NULL for an address that usually means one of several  things:

 

 

 

Manifold Geocoding Database - A data source could also be the Manifold dataport for connecting to the legacy Manifold Geocoding Database (GCDB) used with Release 8 and built on the US Census Bureau's TIGER database.  That data source should be considered deprecated (that is, do not use it) given the greater accuracy of more contemporary data sources.   When using the GCDB as a data source, the full GCDB data source must be linked.

 

See the Example: Street Address Geocoding topic for an example using GCDB as a geocoding data source.

 

See Also

Tables

 

Queries

 

Data Types

 

Web Servers and Image Servers

 

Geocoding Servers

 

Command Window

 

Example: Add a Spatial Index to a Table - A typical use of an index is to provide a spatial index on a geom field in a table, so the geom data can be visualized in a drawing. This topic provides the step by step procedure for adding a spatial index.

 

Example: Create a Geocoded Table from a Drawing - A partner example to this topic.  A geocoded table has records with a latitude and longitude for each record.   This example starts with a table for a drawing of points where the geom field in the table contains geometry information for each point.   We extract the Y and X locations for each point  from the geom field to create latitude and longitude fields in the table for each record.

 

Example: Street Address Geocoding -  Geocode a table of street addresses using the Google Geocoder.

 

Example: Create a Drawing from a Geocoded Table - A partner example this topic.  A geocoded table has records with a latitude and longitude for each record.   This example starts with a table containing a list of cities with a latitude and longitude field for the location of each city.   We create a geom from the latitude and longitude fields using a template in the Transform pane and then we create a drawing that shows the cities as points.  This example shows all the infrastructure steps involved.