CSV Servers

Manifold can read Comma Separated Values (CSV)  and similar formats that represent tables within text files, where each line is a record and the values for fields are separated by a character such as a comma (CSV). Each CSV file contains one table.   A similar format uses a | "pipe" character and is often known as Pipe Separated Values, or PSV file format.    This topic shows how to acquire CSV data from web servers providing data in CSV format.  A related topic, CSV, ASC, PSV, TAB, shows how to acquire CSV data from local files.   

 

Please read the CSV, ASC, PSV, TAB topic for important information about CSV format and how Manifold handles different options which occur within CSV format.

 

We can bring data from CSV servers into Manifold using two, related techniques:

 

 

 

 

Connecting to a CSV server:

 

  1. Choose File - Create - New Data Source in the main menu, or right-click in the Project pane and choose New Data Source.

  2. Choose More... in the dropdown menu to launch the New Data Source dialog.

  3. Choose Web Server: csvserver in the Type box.

  4. Enter the URL for the CSV data in the Source box.

  5. Default settings for other options will usually work for most CSV data.   Press Create Data Source.

  6. A new data source appears in the project.  Open it to see the table that is linked in from the CSV file.

 

Creating a local copy of the CSV table:

 

  1. Click on the table linked in from the CSV to highlight it.

  2. Press Ctrl-C or click the Copy button in the Project pane toolbar.

  3. Click anywhere in the main, Manifold .map part of the project outside of the new data source's hierarchy.

  4. Press Ctrl-V or click the Paste button in the Project pane toolbar.

 

Tables that are linked from CSV data are neither selectable nor editable since they do not have any indexes.  We cannot add an index to a linked CSV table since CSV format does not support indexes.   We can, however, add a key field and index to a local table that has been copied and pasted from a linked CSV table.  Such local tables are fully capable, native Manifold tables.   We can easily add a key field and index to them to enable editing and selection, using the procedure given in the Add an Index to a Table topic.

 

Adding an index to a table:

 

  1. Double-click on the table to open it.

  2. Choose Edit - Schema.

  3. In the Schema dialog click the Add Identity button.

  4. Press the Save Changes button.

 

Create a New Data Source

In the main menu choose File - Create - New Data Source.

 

 

Choose More... to launch the New Data Source dialog.

 

 

Name

Name for the new data source, "Data Source" by default.  Specify a more memorable name as desired.  If we forget the origin of a data source we can hover the mouse over the data source and a tool tip will provide connection information.

Type

Choose Web Server: csvserver in the Type box to connect to a CSV format source over the web.  Choosing the CSV server type will use the CSV server dataport regardless of the file extension, either .csv, or .txt, or other extension.

Source

A connection string to the web server CSV data.  The connection string can also be entered using the Web Login dialog launched by the [...] browse button.

 Browse button

Click to launch the Web Login dialog, to allow use of a login and password plus use of a proxy server if desired.   The Web Login dialog is also handy for providing a Test button that can be used to test the connection.

Open as read-only

Open the data source read-only.  Has no effect with CSV servers since they are read-only in any event.

Cache data

Cache data downloaded from the server while the project is open.  Provides better interactive performance and greater flexibility with read-only data sources.

Save cached data between sessions

Save the cached data for the next time this project is opened,  within the .map project itself in a Cache sub-folder in the System Data hierarchy.   Caution: checking this box can result in very large .map files when the results of browsing very large data from web servers are all saved.  However, having such data cached in the .map is handy for offline browsing of the project.

First line contains field names

Set to auto detect (filled square) by default. Click this entry to toggle between auto detect (filled square), checked (check mark), and unchecked (empty square).  The first line in most CSV data gives the names of fields.  Some CSV data sources do not provide the names of fields in the first line.  If some of the field names are strange, like a single number, make sure to switch this from auto detect to checked.

Read all fields as text

The system will try to guess what fields contain text and what fields contain integers or floating point numbers.  In some cases, such as Zip codes in US postal addresses, what appears to be a number should be treated as text.  Checking this box imports all fields as text even if they are not marked as text and even if they appear to be numeric fields.

Allow multiline text values

When checked, ignore linefeed and other characters normally denoting the end of a line which occur within text, that is, within text qualifier characters.  Check this box to deal with "csv" data that include multiple paragraphs within a single text field, as often occurs in CSV data originally harvested from web pages, KML or other web-oriented formats.

List delimiter

The character used within the CSV data to separate fields within a record. The default (auto) setting usually guesses correctly.   Enter any character or choose it from the list of standard characters: comma, colon, semicolon, pipe, space or tab.

Decimal separator

The character used within the CSV data to denote the decimal fractional part of a number, almost always either a period or a comma.    The default (auto) setting usually guesses correctly.   Enter any character or choose it from the list of two standard characters: period or comma.  A period is used in North America and many former UK colonies.  A comma is used in Europe, South America and many African countries.

Text qualifier

The character used within the CSV data to wrap values intended to be text.  The default (auto) setting usually guesses correctly.   Enter any character or choose it from the list of two standard characters:  a double " quote character or a single ' quote character.

Create Data Source

Create the new data source in the project pane and close the dialog.

Edit Query

Launch the Command Window loaded with a query that creates the data source using the given settings.  A great way to learn how to use SQL to create data sources.

Cancel

 Exit the dialog without doing anything.

 

 

 Pressing the browse button next to the Source box launches the Web Login dialog, to specify a server and connection characteristics.

 

 

Server

The connection string for the server.  This may be a simple URL or a very lengthy URL/connection string that embeds parameters such as keys that grant access or other parameters

Use login and password

Check this box for servers that require logging in with a login name and a password, providing the required name and password in the Login and Password boxes.

Use proxy server

Check this box when connecting through a proxy server.   The Proxy, Login, and Password boxes allow specifying the connection string to the proxy server as well as the login name and password required to use the proxy server.

User agent

Identifies what application (Manifold) is asking for a connection.  Some web servers want to know what client software is connecting, for compatibility or for business reasons.  The default string optimizes compatibility (Mozilla is very generic) while also identifying Manifold Release 9 as the client.  Users can adjust the string as necessary to comply with any special server requirements.

API key

Provide a key that authorizes use of an API when connecting to a proprietary data source that requires such a key.  This option is disabled for server dataports that do not use it.

Application key

A secondary application key or authentication code for those servers, such as here.com (also known as wego.here.com) that may require it.  This option is disabled for server dataports that do not use it.

Timeout

Specify a time in milliseconds to wait for connecting to the specified server.  Use 0 for the default timeout or specify whatever is the desired time to wait before giving up on the server.

Test

Press the Test button to try the connection using the specify parameters.   If successful, a Connection established information dialog will pop open.

 

Example: Connect to the World Bank CSV Server

Visiting the World Bank Climate Data API page we see that the World Bank publishes historical data for the average yearly temperatures by country in CSV format, accessible through their web server given the right URL.  Reading the API documentation carefully we see that the correct URL to get the "historical" (it is estimated data in many cases) data for a country in CSV format is:

 

http://climatedataapi.worldbank.org/climateweb/rest/v1/country/cru/tas/year/iso3.csv

 

...where iso3 should be replaced by the three letter ISO code for a country, for example, CAN for Canada.   To get a temperature of average annual temperatures for Canada in CSV format, we would use the connection string:

 

http://climatedataapi.worldbank.org/climateweb/rest/v1/country/cru/tas/year/CAN.csv

 

Launch Manifold and choose File - Create New Data Source.  The dropdown menu provides a list of favorites to choose from as well as a More... option.  

 

 

Choose More... to launch the New Data Source dialog.

 

 

We enter Average Annual Temps Canada as the Name.  We choose Web Server: csvserver as the Type.   In the Source box we enter the URL connection string:

 

http://climatedataapi.worldbank.org/climateweb/rest/v1/country/cru/tas/year/CAN.csv

 

Leaving the other options at their default settings, we press Create Data Source.

 

 

A new data source appears in the project pane.  We expand it to see it contains a table called CAN.  

 

Manifold web server dataports such as the CSV server or ECWP server and similar which produce a single image, drawing or table will compose names for the components they create by taking the URL object name and cutting out extensions, so that CAN.csv becomes simply CAN, and parameters are cut, so that data? becomes simply data.

 

 

We open the table and see it contains read-only data (from the gray background of fields) giving the average annual temperature for Canada by year.   The API page tells us the temperatures are in degrees Centigrade.    The table is read-only because it is generated by the World Bank server.    

 

If there are any changes in the data on the World Bank server, when we refresh the table, for example, by pressing View - Refresh or when we open the project again after closing it, the latest data will appear.  It is not likely the World Bank will change its mind about whatever it estimates the average temperature in Canada was in 1902, but if we had connected to a CSV server that provided more dynamic data, such as the latitude and longitude locations of ships, we would be happy to have dynamically updated data.

Create a Read/Write Table

If we want a read/write table imported into the .map portion of our project, we can easily do that.   Importing data, of course, means that we create a local, static copy, and not dynamically updated data served from a server.  See the Importing and Linking topic for discussion of the difference.

 

We can Copy the table from the data source and Paste it into our project to create a read/write table.

 

 

Components in Manifold must have a name that is unique through the project, but the CAN table that we paste into the project does not pose a naming conflict because the name within the project is simply CAN, while the name of the CAN table within the data source is CAN:Average Annual Temps Canada.   We double click our new, local table to open it.

 

 

Except for the shorter name, it looks just like the table we copied and pasted from the CSV server data source, including the read-only gray background.   To make it read-write, we launch the Schema dialog, which we can do by pressing Edit - Schema or by using the keyboard shortcut Ctrl-E.

 

 

 We press the Add Identity button.  

 

 

Manifold instantly adds a key, identity field and an index on that field, using the traditional mfd_id and mfd_id_x names.   We press Save Changes.

 

 

The table's background for fields changes to white to indicate it is now read/write.  It is now fully editable and selectable.   The new mfd_id field appears as well.  We can hide that if we like using the Layers pane.

Example: Connect to a CSV File through the Web

It is not clear if the World Bank serves CSV data that is generated on the fly, or if the URL we used connects to a file maintained on their server.   A very common use of CSV servers is to access data published as CSV files and placed on web server for each access by anyone.    

 

For example, if we have a website called www.manifold.net and we want to make a CSV file available to Manifold users through the CSV server dataport, all we need do is to place that file somewhere on our site where it can be accessed with a URL.   Suppose we have a file called cathedrals.csv that provides a list of French gothic cathedrals.   We could put it into a folder in our site called /files and the URL to connect to it would be:

 

http://www.manifold.net/files/cathedrals.csv

 

We can try that now (the link above is a live link).   Choose File - Create - New Data Source.

 

 

Choose More... to launch the New Data Source dialog.

 

 

We enter French Gothic Cathedrals as the Name.  We choose Web Server: csvserver as the Type.   In the Source box we enter the URL connection string:

 

http://www.manifold.net/files/cathedrals.csv

 

Leaving the other options at their default settings, we press Create Data Source.

 

 

A new data source appears in the project pane.  We expand it to see it contains a table called cathedrals.  

 

 

We open the table and see it contains read-only data (from the gray background of fields) giving a list of cathedrals, the height of the nave in meters for each, some comments, and the latitude and longitude location of each.    The table is read-only because web servers publish data one-way, and we do not have rights to change data within the manifold.net server.

 

If there are any changes in the CSV file placed on the manifold.net website, for example, if we add additional cathedrals to the list, the cathedrals table will be updated when we refresh it.   

Using CSV Server Data in a Query

To use the linked cathedrals table within our project, we could copy and paste it as a local copy, as in the prior example, or we can use it from a query.  

 

We right-click into the .map portion of the project, that is, outside of any of the data sources, and we choose Create - New Query.  

 

 

We name the new query cathedrals.  It appears in the Project pane as seen above.   We double-click the query to open it and we enter the following SQL:

 

 

TABLE CALL TableCacheIndexGeoms((

  SELECT [Name], [Nave_meters], [Comments], [Latitude], [Longitude],

    GeomMakePoint(VectorMakeX2([Longitude], [Latitude])) AS [Geom]

  FROM [French Gothic Cathedrals]::[cathedrals]

), TRUE);

 

This is an adaptation of the query used in the Example: Create a Drawing Dynamically from a Geocoded Table topic, copied and pasted into the Command Window we use to edit the query (not illustrated), with the names of the table and fields adjusted to the CSV server table we are using.

 

 

We can now right-click onto the cathedrals query in the Project pane and choose Create - New Drawing.

 

 

We choose cathedrals Drawing as the name for our new drawing.  

 

 We click the coordinate picker button to set the coordinate system to Latitude / Longitude, and then we press Create Drawing.

 

 

A new drawing appears in our project.

 

 

We can create a new map, add a few image server sources for background, a Bing satellite layer and a Yandex "skeletal" layer (contributed by a Manifold user in a post on the georeference.org forum) showing labels and major roads with transparency in between, allowing the satellite layer to show through.  

 

We drag and drop our cathedrals Drawing into the map, using Style to show each cathedral location as a star glyph, rotated slightly for better visual energy, with a drop shadow in black.  We color the fill color for the points using a thematic format to color them by nave height, using the Color Brewer Spectral palette, with blue being the lowest nave and red the highest nave.

 

In the above display, if the CSV file on the Manifold web site changes, when the data source refreshes, the CSV data pulled by the CSV server dataport will change and the results generated by the query will change, to add, delete, or move points in the drawing based on how the data in the CSV file on the Manifold server changes.

 

That much is very useful, but for fuller access to the data, such as clicking on one of the points and seeing the attributes, such as the comments field, we need to do more than such a simple query.   The usual approach with CSV data to which we want fuller access is to use a query which copies the data from a CSV server table into a local table in the .map project, with the query adding a key field and index.   That makes the local table fully read/write and selectable, so all the usual features like selection, zoom to selection, alt-clicking a point to launch it in the Info pane so we can see attributes, and so on, all will work.

Example: Link to New Zealand Statistics Data

Many organizations publish data as CSV files.   

 

In search of thrilling statistics, we visit the ever-popular stats.gov.nz site to obtain statistics in CSV file format, at

 

https://www.stats.govt.nz/large-datasets/csv-files-for-download/

 

 

The site lists numerous data sets that are available for free download as CSV files.   

 

We are using the Opera browser, so we can take advantage of automatic VPN anonymity (just say No! to geo-tracking) and automatic ad-blocking within the browser itself.  

 

 

We skip the boring economic statistics and scroll down to a more interesting subject, statistics on alcohol available for consumption.  We right-click on the link and choose Copy link address.  

 

In Manifold, we choose File - Create - New Data Source.  

 

 

Choose More... to launch the New Data Source dialog.

 

 

We provide a sensible Name and then we Paste the URL we copied into the Source box.  We press Create Data Source.

 

 

A new data source appears in our project.  We expand it, and, all the way from New Zealand, we see a table appear.

 

 

We double-click the table to open it, and we see it contains statistics on alcohol available for consumption.   A world of data at our fingertips!

Notes

Dates - Different languages have different ways for expression dates.  When recognizing date values as dates, Manifold is guided by whatever languages are specified in the Help - About dialog in the Lang entry, the presumption being that the date formats in use are those employed by whatever language is in use.    Reading a CSV file allows date fields to use multiple different languages.  For example, one field can use US English while another field can use German.

 

Problems connecting - Check the Log Window to see what is going on behind the scenes if an attempted connection does not work.     The problem is usually a wrong connection string or URL, failure to provide required credentials such as an key string that authorizes access, wrong choice of protocol (the server uses WMS and the user picks something else), the server is not a FeatureServer or ImageServer or Mapserver, an incredibly slow server, a server that is offline or a server that is wrongly configured and which is not correctly using the protocol it claims to use.  

 

Visit the Manifold community forum and talk out difficulties with other users.  Make sure to post full information on what you are doing, the connection URL you used, all details of how you tried to connect (including all settings in the data source dialog), what happened, and what the Log Window reported.    If other users cannot help you, spending a tech support incident will produce an authoritative analysis of the issue.

 

Try the URL in a browser - Checking the URL by launching it in a browser can reveal many problems with the URL or with the web server.  If a URL does not work in a Manifold web server dataport, try exactly the same URL in a browser.  If a browser cannot connect to the URL, the Manifold web server dataport will not be able to connect to it either.  If a browser cannot connect to that URL, that indicates the problem is the URL or the web server.   For example, the web server might be offline.  Or, for example, If the browser connects to a page other than the actual web service endpoint, such as, to a web page that lists various options for web servers, that shows the URL is not a URL for a web server but a URL to some other sort of web page.   

 

Connection problems are often caused by incorrect URLs.  There might be a typographical error in the URL or the URL might not be an endpoint to a functioning server but instead a URL to some other web page.  The server responding to the URL may have geographic restrictions (surprisingly common) that does not respond to connections from IP addresses that are thought to be in a canceled country.  Trying the URL in a browser will fail in such cases.  Web servers may also have other restrictions, such as only allowing connections from white listed IP addresses, from paying clients, or from those clients that use a special security scheme.  

See Also

Tables

 

Data Types

 

Schema

 

File - Create - New Data Source

 

CSV, ASC, PSV, TAB

 

GeoJSON Servers

 

JSON Servers

 

Example: Use Edit - Join to Map a Pandemic - Creating a data source for a CSV web server, we acquire current data on cases and deaths worldwide for the Covid-19 pandemic.  We then use the point and click Edit - Join dialog to rapidly join that data, automatically aggregated as we desire, into a world map for visualization.

 

Example: Import CSV and Create a Drawing - Import a CSV file by creating a New Data Source, copying and pasting the table conveyed by the CSV, prepping the table for use as a geocoded table and then creating a drawing.  

 

Example: Create a Drawing Dynamically from a Geocoded Table - A more general, cooler way to create a drawing from a geocoded table, without adding any fields to the table.   We create a small query that generates geometry on the fly, and then we create a drawing from the query.   This is how more advanced users often do it, using a technique that is perfect for creating drawings from tables in remote DBMS packages or read-only files.