Tables

Everything in Manifold is stored in a table.  Every bit of data, regardless of its nature - vector, raster, text, whatever, - is stored in a table.   Most GIS users naturally understand what a table is when they see data, such as attributes, in a classic row and column table presentation, but it may be a new notion to think of tables as also storing data that is used to create visual displays, such as geometry data for drawings or tile data for images.  

 

 

 

To create a new table:

 

  1. Choose File - Create - New Table, or use the context menu in the Project pane.

  2. Specify a Name for the new table.

  3. Press the Edit Schema button and use the Schema dialog to add fields and indexes and computed fields as desired to the table.

  4. Press Create Table.

 

 

We can easily create tables as in the above, but most often the tables we work with are created as a result of importing or linking data, either tables or drawings or images, or they are created by SQL we write.

Index Required

To be editable and selectable, tables must have an index and, of course, to be editable they cannot be read-only.  If a table has a light gray background in all of the columns it is not editable.  

 

Having an index is a good idea in any event because it makes work with tables faster and it enables facilities like interactive selection.   We can add a key field and index with one click in the Schema dialog.  See the Add an Index to a Table topic for a quick example.

All Data is in Tables

When we open visual windows like drawings or images, in reality those are just a different way of showing data in a table somewhere, for example, geometry data for features in drawings or tile data for pixels in raster images.   Drawings and images are just a specialized way of displaying data in tables.  A drawing itself contains no data: it has only a few bytes of information saying what table to use, what field in that table contains the desired geometry data, and how geometry data from the subject table should be displayed.  All of the data shown in the drawing is in the table.

Drawings Show Data from Tables

The shape and locations of objects in drawings are stored in geometry fields in a table, using a Manifold data type called a geom.   Each object in a drawing is one record in that drawing's table.   To create the drawing, for each record in a table Manifold reads the geometry data in that record's geom field and draws whatever object is stored in the geom.  

 

 

In the illustration above, each record in the table has a geom field that describes the shape and location of a province in Mexico.    We have selected the row for Guanajuato to show it in red selection color, which causes the corresponding object in the drawing to appear in red selection color as well.  The geometry data from the Geom field in that row in the table is used to draw the corresponding object in the drawing.

 

By tradition, the geom field in a Manifold drawing's table is named "Geom," so when we look at a list of fields in a table we can tell right away from the name of the field which one is the geom field.   There is nothing special about that name.  The geom field could be called "Harry" if we like, although such an odd name would not at a glance indicate it contained geometry data.  

Images Show Data from Tables

The shape and locations of objects in images are stored in tile fields in a table, using a Manifold data type called a tile.   The default name for a tile field in a table is named "Tile," so it is obvious what data that field contains.   As with the default name of "Geom" in drawings, there is nothing special about the default name "Tile".  It just makes more sense as a default name instead of, say, "Josephine."

 

 

There are many different types of tiles to allow storing the different types of raster data found in images.   Manifold uses the word "images" to refer to all forms of raster data, including raster data used for photographic images, multichannel raster data from satellite sensors, and other raster data such as terrain elevation data.  Any raster image in Manifold is composed of a checkerboard of tiles, where each tile is a record in that image's table.  

 

In the illustration above, the aerial photograph showing a recreation area in Boston is composed of many tiles, each one of which is 128 pixels by 128 pixels in size, with four channels of data for each pixel.  The selected row in the table contains the tile that is displayed in red selection color in the image.

Labels Show Data from Tables

The locations of labels and the text they display also are data in a table, the table which defines the parent drawing of the labels.   The geom field in that table specifies the location of the label and a text field in the table provides the text the label displays.   Each label corresponds to a single object in the parent drawing and thus a single record in that drawing's table.  

 

 

Because all data in Manifold is stored in a table somewhere and is fully exposed for us to use however we prefer, that means that every bit of data can be accessed via SQL, and everything that can be done with other user interfaces can be done with SQL or with other table interfaces like the Transform pane.  That makes it very easy to automate what we do and greatly magnifies the power and convenience of Manifold as a GIS and as an incredibly powerful tool for spatial data manipulation.

 

For example, to format one drawing we can use the Style pane interactively with a point and a click.  But to format a thousand drawings we do not want to point and click on each drawing.  Suppose our organization has decided on a standard background color for all drawings and we want to apply that color to a thousand drawings in a project? That is just a simple update of records in a meta table as in this example.   A few mouse clicks in Manifold can do that for thousands of drawings at once by using the Select and Transform panes, easily automating what could take hours or days in a system that did not expose all data of interest in tables.

Table Windows

Tables are displayed in table windows as a row and column grid.  Each row is a record and each column is a field.  

 

 

 

 

 

 

Double-clicking open a table in Manifold will show it in a table window, as seen above.   Tables show data in a classic grid presentation, with rows and columns.  Each row is one record in the table and the various columns show the values for fields in each table.    

 

Manifold uses the words row and record as synonyms and the words column and field as synonyms as well.   If we say column handle we mean the same thing as field handle, for example, and if we say row handle we mean the same thing as record handle.    In addition, GIS people will often refer to fields as attributes.  

 

Default actions, such as clicking the Enter key to begin editing a cell, will apply to the current cell, which is marked by a dotted line box called the cursor.  

Controls

<gray background>

Values shown with gray background are read-only and cannot be edited.

Enter

Ctrl-Enter

Enter to open the current cell for editing, or double-click into a cell.  Enter closes the cell for editing, leaving it in blue preview color until a Ctrl-Enter commits the change.

Esc

When edits are in process, pressing Esc closes the editing session without applying changes.  

Ctrl-backspace

Abandon all edits for all cells shown in blue preview color.   To abandon edits for only one cell shown in blue preview color, right-click onto that cell and choose Undo Changes.

Right-click column head

Right-click on a column head for a context menu of useful commands.

Ctrl-click column head

Ctrl-click on a column head to sort the table by that column in ascending order.  Ctrl-click again to resort the table in descending order.

Shift-Ctrl-click column head

When a table is sorted by a column, Shift-Ctrl-click additional column heads to sub-sort by those fields.

Ctrl-click row

Ctrl-click anywhere in a row to toggle the selection status of that record.

Ctrl-click-and-drag

Ctrl-click and drag a selection box in the table  to add to the selection all rows that are touched by or are within the box.

Alt-Ctrl-click-and-drag

Alt-Ctrl-click and drag an unselection box in the table  to remove from the selection all rows that are touched by or are within the box.

Arrow Keys

Move current cell cursor up / down / left / right.

Home

Move current cell cursor all the way to the left-most column.

End

Move current cell cursor all the way to the right-most column.

Ctrl-Home

Move current cell cursor to the top of the table.

Ctrl-End

Move current cell cursor to the bottom of the table.

Page Up

Page Down

Move current cell cursor up or down one display page's worth.

Scroll bar

A vertical scroll bar appears when there are more records than can fit into the window.   Scrolling the display does not move the current cell cursor.

Scroll bar clicks

Clicks in the scroll bar region:

 

  • Click and drag onto scroll bar handle - scroll the display.

  • Click above or below scroll bar handle - Same as Page Up or Page Down.

  • Shift-Click above or below scroll bar handle - Jump to that position.

Column Head Context Menu

Right-clicking onto the name of a column calls up a context menu.

 

 

Order Ascending

Sort the table using values in this column so that lower numbers or letters nearer to A are at the top.  For sub-sorts, use Shift-Ctrl-Click on additional columns to sub-sort the table by those as well.

Order Descending

Sort the table using values in this column so that higher numbers or letters nearer to Z are at the top.  For sub-sorts, use Shift-Ctrl-Click on additional columns to sub-sort the table by those as well.

 Align Left

Align column contents to the left margin of the column.  The default.

 Align Center

Align column contents to center the contents between the left and right margins of the column.

 Align Right

Align column contents to the right margin of the column.

Best Fit

Arrange the width of this column so that the values in this column will fit better.

Best Fit Title

Arrange the width of this column so that the name of the column will fit better.

Style

Choose a display style for the contents of the column, including language-country format.   Options depend on the data type of the field.  For example, numeric fields will allow choice of a currency display style, while datetime fields will have a choice of different calendar and time styles.  Style can be applied to fields that are binary, boolean, datetime, geom, numeric, tile, uuid, or vector data types.

Show in Layers

Pop open the Layers pane if it is closed, and scroll this field into view as the context field.

 

 

Right-click a column header and choose Align Right to align the contents of the column to the right margin of the column.

 

 

Experienced users will Ctrl-click a column head to sort the table by that column.   Order Ascending and Order Descending are in the menu to help new users.  For sub-sorts, use Shift-Ctrl-Click on additional columns to sub-sort the table by those as well.

 

The Best Fit command examines a representative sample of values for the clicked field to determine a reasonable width for the column.  When used on a binary data type field, the command will not retrieve values, but instead will use a width that will fit reasonable values.

 

When looking at a field in a table that has many fields, the Show in Layers command is a quick way of bringing that field into view in a Layers pane that might contain a very long list of fields.   Choosing a layer with Show in Layers will pop open the Layers pane and move the list cursor onto that layer, scrolling the list and opening any folders as necessary to bring the layer into view.

 

Conversely, if we see a field in the Layers pane in a very long list of fields, we can Alt-click that field in the Layers pane to pick it, which will scroll it into view in the Table window.  

Styling Fields

We can change the format, that is, the style, for how fields are displayed in tables by right-clicking the column head for the field and choosing Style.  That allows us to choose different ways of displaying text as URLs, different ways of displaying dates and times, or different styles for latitude and longitude numbers, such as decimal degrees or degrees, minutes, and decimal seconds, all with automatic adaptation to different language and country styles for displaying numeric values and dates.

 

 

For example, we can change a Unit Price field shown using default numeric style to showing currency in Euros, using European language style for decimal points:

 

 

See the Styling Table Fields topic for complete details and examples.

Cell Context Menu

Right-clicking onto a table cell calls up  a context menu:

 

Add Filter

One-click choice of filters to filter the table based on that cell's value.    Typical filter choices:

 

  • =  - Filter for all records with values equal to the value in this cell.

  • <=   - Filter for all records with values less than or equal to the value in this cell.

  • >=   - Filter for all records with values greater than or equal to the value in this cell.

  • not NULL  - Filter for all records that do not have NULL in this cell.

  • <>   - Filter for all records with values not equal to the value in this cell.

  • NULL  - Filter for all records that do have NULL in this cell.

 

 Undo Changes

Appears if a cell in a picked row has been edited and is still in the preview state with blue color background.  Abandons the edit.

Copy

Copy the contents of the cell into the Clipboard.

Paste

Paste the contents of the Clipboard into the cell.

Delete

Delete the value in the cell, resulting in <NULL>.

Edit

Open the contents of the cell for editing in a larger text pane.

 

To apply a filter based on the contents of a cell, right-click on the cell, choose Add Filter, and then choose the filter desired.

 

 

To clear a filter that has been applied, choose View - Filter - Clear Filter.   See the Filters topic.

Scroll Bar Context Menu

Right-clicking onto the scroll bar zone calls up a context menu:

 

Scroll Here

Drag the scroll bar handle to the spot right-clicked and scroll the display accordingly.

Top

Scroll the display to the top.

Bottom

Scroll the display to the bottom.

Page Up

Scroll the display up one page.

Page Down

Scroll the display down one page.

Scroll Up

Scroll the display up one row.

Scroll Down

Scroll the display down one row.

 

For example, to scroll down one page, right-click on the scroll bar zone and choose Page Down.

 

Schemas

A table has one or more fields with each field being a particular data type.  A table can also have zero or more indexes and zero or more constraints. All of these aspects of a table's structure taken together are referred to as a table schema.   We can edit schemas to add, remove, or rename fields, constraints and indexes using the Edit - Schema dialog.

 

Everything in a schema has a name and each name must be unique in the schema.    For example, we may not have a field named x and an index named x in the same table, but we may have a field named x in one table and an index named x in a different table.   We can also have indexes named x within several different tables in the same database.   The names of indexes do not have to be unique within a database in Manifold.

 

Important:  A table must have a btree index in it to support selection as well as commands like Select All, Select Inverse and Select None and Copy, Cut and Delete on records.  Many tables created by Manifold, such as the table created for a drawing when drawings are imported or created, will automatically have a btree index created in them but tables that are imported or opened in external data sources might not.    

 

The mfd_id Field

Tables that contain data for drawings must have a key field with unique values and an index built on that key fields.  That is such a common requirement that Manifold has a special, built-in facility to make that easy and automatic: the mfd_id field.   When a table contains a numeric field named mfd_id, Manifold will automatically populate that field with a unique, non-NULL value for every record.  If a table has an mfd_id field and we add new records to that table manually, we can, if we like, provide whatever value we like for that mfd_id field, so long it is not duplicate the value in the mfd_id field for some existing record, and Manifold will be happy to use the value we provide.   But if we add a new record and we do not provide a value for the mfd_id field, Manifold will automatically put a unique value in that field for us.  

 

The mfd_id field is a special case provided to make it easier to create drawings from tables for GIS use.   No other fields in a Manifold table are like the mfd_id field.  The name (as with any name that begins with mfd_) of the field is reserved.   Although the mfd_id field can be any numeric type, the numeric data type should be large enough to handle the total number of records we expect to have in that table.   When creating tables for drawings, Manifold automatically creates the mfd_id field as an INT64 data type.  By tradition, when adding an mfd_id field to a table, Manifold users will also create it as an INT64 data type.

 

Tables must have an index to be fully operational, for example, to enable manual selection and to be editable.    An index requires a key field, so if we manually add an index to a table many Manifold users will add an mfd_id field to the table to use as a key field.    See the discussion in the Indexes topic and in the Add an Index to a Table topic.

Key Dialogs

Tables can have just a few records or they can have up to two billion records per table.  Each individual record can be extremely large, allowing terabytes of data per table, but at the present time there can be no more than two billion records per table.   That limit is an arbitrary setting on temporary caches and other internal structures.  It could be increased in future updates should user demand require.  

 

Manifold provides a variety of controls to provide information and to adjust the display of table windows as we like.

 

 

 

 

 

 

 

Scrolling the View

Manifold tables use standard Windows scroll bars with standard Windows moves and shortcuts for navigation.

 

 

Right-clicking anywhere in a scroll bar outside the scroll handle calls up a context menu.

 

 

Choose Scroll Here to jump to the location that was right-clicked, or choose one of the other scroll commands.

 

 

A short cut to jumping directly to a scroll location is to Shift-click where desired.

 

 

The view will immediately jump to that location.

Managing Columns in Tables

With the focus on the opened table, choose the Layers pane  to manage which fields appear in a table and the column widths used for each field.

 

 

 

 

To fit into this documentation, illustrations show an artificially small Manifold desktop, with only a few panes, docked to the right side.   In real life we use a much larger Manifold desktop, and all panes would be turned on, with some panes docked to the left and others docked to the right.

 

 

 

 

The display above shows a table with 13 fields that is open on the left side of the desktop with the Layers pane on the right side of the desktop.   

 

 

 

 

We can adjust the width of a column by either double-clicking into the numeric width of the field, for example, changing the 72 point default width of the mfd_id column to, say, 38 points, or by dragging the column's border in the table.

 

 

 

 

If we drag the border of the mfd_id column to make it a narrower column the point size readout in the Layers pane will automatically be updated.

 

 

 

 

To hide a field, double-click the small box at the right to toggle it off/on.  In the illustration above we have hidden the Country field.

 

To change the order in which fields are displayed, Ctrl-click the row for that field in the Layers pane to select it, and then use the up and down arrows to move the field up or down in the list.  In the illustration above we have moved the mfd_id field down one row, so that it displays between the Town column and the Year column in the table window.

 

A change will apply to all selected rows.  If we select four rows and we double-click one of them off to hide it, all four of the rows will be hidden.   If we select three rows and then press the Move Down arrow button, all three rows will move down.  

 

 

 

 

For example, suppose we select all rows where the column width is 96 points.   We double-click into the first such selected row to change the width of the column to 72 points.  

 

 

 

The moment we press Enter to accept the edit, all of the selected rows will have their column widths also changed to 72 points and the width of the columns in the table window will be adjusted.

 

The usual selection moves and keyboard shortcuts work.  Ctrl-A to select all, Shift-Ctrl-A to unselect all, Ctri-I to invert the selection.   Ctrl-click on a row to toggle it selected or unselected.   Ctrl-click on a row to select it and then Shift-ctrl-click on another row and all the rows in between will also be selected.

 

Saving a project will save current Layers pane settings for a table.  

Query Results Tables

Query results tables are virtual tables in that they are constructed on the fly to show the results of a query.  To make them permanent we would use SELECT ... INTO to create a real table.  However, for more effective browsing of results tables we can use many display features such as the Layers pane and Filters with query results tables.

 

 

 

For example, in the illustration above we have altered settings in the Layers pane to hide the mfd_id field in the results table and we have adjusted widths of columns.  Such settings go away with the next run of the query but they are so quick to specify they can be very convenient when browsing a complex results table.  

 

If we are only interested in two fields out of twenty in a results table, we Ctrl-click on those two fields in the Layers pane to select them, we Ctrl-I to invert the selection, and we double-click any of those selected fields OFF to hide all eighteen of the selected fields.  That is a very quick way to hide all but two fields in a results table, taking but three seconds for an experienced Manifold user.

 

We can export the results shown in a results table to a file by choosing Edit - Export Results in the main menu, to export to popular formats such as CSV, GDB, GPKG, and others.  

 

If the results table contains a field with an index, like the mfd_id field, that would make a table selectable and editable, then the results table is also selectable and we will be able to edit field values shown in the results table and have those edits propagate into the tables from which the records and fields shown in the results table were drawn.   If any records in a results table are selected, the Export Selected command in the Edit main menu will be enable, allowing us to export the selected records to a file.

Sorting Tables

Ctrl-click on a column head to sort the table by that column.   Shift-Ctrl-click additional column heads to sub-sort by those fields.   See the Sorting Columns topic for details.

Tooltips

A tooltip is a small information box that appears when the mouse cursor hovers over an item.   Many fields will have tooltips that show their contents in more expanded form.   For example,  the expression used for a computed field appears in the tooltip for that field's column handle.

Tooltips can be extensive and can display Unicode characters, as in the screenshot above of a tooltip that shows the contents of an nvarchar data type text field.

 

 

Tooltips on column headers will show the name of the field as well as the data type:

Selection

Selection in an interactive, Manifold sense is a different concept but is related to the use of SELECT in SQL.   Selection in Manifold is an interactive user interface way of marking some records in tables or some objects in drawings as being of interest.

 

Ctrl-click anywhere in a row to select or to deselect that record.  Selected records will be shown in red selection color.

Ctrl-click another row to add that record to the selection.  

 

 

Ctrl-click and drag will draw a selection box that will select all records any part of which falls within the box.   Alt-Ctrl-click and drag will draw a deselection box that will deselect all records any part of which falls within the box.    We can also use the Select pane to select records in the table.   See the Selection topic for more info and examples.

 

Table windows, including the results tables from queries in a command window, can export selected records to a choice of formats, such as CSV, GDB, GPKG, and others.  When any records in a table window or results table are selected, the Export Selected command in the Edit main menu will be enabled.  Choose Edit - Export Selected to export the selected records to a file.

 

To export all records in a table to an external file, right-click on the table in the project pane and choose Export.

 

All selection methods, such as the Select pane or the selection commands in the Edit menu such as Edit - Select Inverse, always apply to the entire table, and not just those records fetched as part of a table fill strategy for big tables.

Show only Selected Records

The View - Filter command allows us to show only selected records in the table window.

 

With two records selected as in the above examples we choose View - Filter - Selected.

Immediately the table window re-displays to show only selected records.   If we wanted the table window to show only those records that were not selected we could have chosen View - Filter - Unselected.    To show all records again in the table window we choose View - Filter - Clear Filter.

Filters

Filters are a great way to show only desired records in a table window.  They are incredibly fast and handy because they require no keyboarding.   Right-click on a cell that has a desired value and then choose a filtering expression based on that value.  Only records which have that expression value for that field will be displayed.

 

 

To display only those records that the fclass field says are supermarkets we right-click on an fclass field cell that already has supermarket in it.

 

 

In the Add Filter menu we choose fclass = 'supermarket' as the filter we want.

 

 

The table window immediately re-displays to show only the supermarket records.    We add more filters using other fields if we like, and we can combine filters with selection.   When working with big tables, the View - Filter command allows us to check or uncheck the Filter Fetched Records Only setting as desired to apply filters to only those records fetched by the table fill strategy (the default), or to the entire table.

 

See the Filters topic for details.

Big Tables

Working with smaller data may have accustomed us to going back and forth between views of the data as a drawing or as a table window.   When a drawing has only a few tens of thousands of features it is still realistic to switch to a table view of the objects that allows us to interactively scroll through a table.  Table windows in Manifold also allow easy interactive work with such data as well.   

 

But the situation changes for extremely large drawings and tables, which can involve billions of records, as discussed in the Table Windows and Big Data essay.  In such cases, queries are the way to rapidly and productively examine tables for data of interest.

 

That is a situation we might not have experienced if we have been working with older GIS software that cannot handle really big data.  Life is different when tables have billions of records.  Even just a few million records are too many to manipulate interactively with a table display that shows only a few dozen records at a time.  Seeing a tiny sample of the data in a table window may be useful, but actually manipulating the entire table through a keyhole view of a table window?  That's like trying to paint a line on the ground across the entire United States by working through a window that allows painting only one screen at a time.    See the Note at the end of this topic on Really Big Tables for why a comparison to the size of the United States is actually an understatement.

 

Even when working with very large tables where table windows are not a productive interface, Manifold nonetheless allows using interactive tools like filters and non-SQL selection methods:

 

 

 

Despite the availability of the above, using table windows to interactively browse tables that have very many records is not a good idea.  Use SQL instead, perhaps using the ability of the Select pane to write example queries as a way to get started.

Table Fill Strategies

Manifold employs several strategies to handle tables that may involve billions or trillions of records while still providing the interactive convenience of table windows in a form that GIS people expect.  The core idea behind these strategies is to provide the expected user experience in a way that protects inexpert users from launching a process that could take years with a really big data source.  Manifold has the ability to connect to such truly immense data sources that such safety measures are a good idea.  

 

A fill strategy is how Manifold populates records into ready status for display in a table window.   Strategies are necessary because when a large data source, such as the Google table for imagery of the entire Earth, contains trillions of records it is unrealistic to try to fetch them all when at most only a few thousand could ever be seen by the user in the table window and only a few tens of billions would fit on the machine in any case.

 

Fill strategies include:

 

 

 

 

 

The combined effect of the above strategies results in very fast table population even with immensely large data sources.   

 

Important:  To see the total number of records in a table, open the table and click on the Info pane.

 

 

To jump to the end of a table, with the focus on the table window we press Ctrl-End

 

 

In the above illustration we have pressed Ctrl-End to jump to the end of a table window that shows the first 50000 records from a large table.   The icon in the next to last row indicates there are more records in the table.

 

 

The Info pane reports how many records the table contains.

Placeholder Records

 

Table windows display two special records at the very bottom of the table:

 

The fill record placeholder is a row at the bottom of the table that shows where records being filled into the table window will appear.   Small table windows fill so rapidly we will never see it.  When a table is fetching enough records from a data source that is slow enough (such as from a slow web server source or other slow data sources), the fill record placeholder will show blue preview color indicating records are still being fetched.   If the table window finishes fetching records before the display limit is reached the fill record disappears.   If the table window hits the display limit of 50000 records yet the table contains still more records the fill record remains with a background color of gray and a fill record icon in the record handle.  The table window above displays records from a table with many records, so only the first 50000 are displayed and the fill record remains to alert us that the table contains more than 50000 records.

 

The new record placeholder is a row at the bottom of the table, marked with an asterisk * icon in the record handle, that allows us to manually add new records to the table in the usual way.  See the Editing Tables topic.

How Records are Ordered for Display

How are records ordered for display?  As with most database systems, in Manifold they are not at all ordered by default.  

 

Tables in modern databases should not be presumed to display records in a fixed order the way a list in a word processing document might.  Big data sets tend to be complex with many different possible ways of sorting the data for ordered display.  Since sorting big data is computationally expensive, modern databases focus on performance and flexibility by storing data within efficient structures that can most easily be resorted as desired to show records in some particular order.  

 

Manifold does the same, storing data efficiently in unordered form.   To see records in any desired sort order we utilize a query employing constructs such as ORDER BY.   That can come as a surprise if we are not used to working with databases and expect records to always be displayed in some fixed order, perhaps in the order in which they were manually added to a table.  But part of learning to work with databases is remembering that what we see in a table window is unsorted.  Even as we scroll the order of records can change due to insertions or deletions by us or by some other process also connected to the same data source.

 

As noted in the Essay on browsing table windows, browsing a table by scrolling through records in a table window has limited utility, as in the case of big tables we see what is usually a small fraction of records among very many, and we see them in no particular order.  The order in which we see records in a table window is a function of performance enhancing mechanisms that allow us to reach into what could be very, very large tables, possibly in remote data sources to which connections have limited bandwidth.  That sort order can change.

 

For example, to improve performance Manifold table windows cache those records that have been displayed.  If we scroll a table window after inserting new records the window will seamlessly reset the display so that as the new records scroll out of cache they will reappear in the order set by the primary key.  We can see that effect with a simple experiment.

 

 

We create a new table with nvarchar type fields called Name and Comments and we add three records as seen above.  Manifold will automatically populate the mfd_id primary key ID field by auto-incrementing it starting from 1 as we add records to a new table, or we can manually populate that field with unique values.   In the screenshot above we have manually specified 8 for the value of the mfd_id primary key for the third record we have added.

 

 

We next add a fourth record, also specifying the mfd_id key manually to a value of 6.  At this point it would be a mistake to expect the records will always stay in this order when the table is displayed.  Remember, tables are not ordered: only the results of queries where we specify some order will be ordered.

 

 

 

If we close the table and then re-open it again Manifold will show the records in order set by the primary key as seen above.    Records added to a large table will also be seamlessly reordered by primary key if we just scroll about the table.  That may seem confusing if we added a record to a large table and then after scrolling about the table or jumping about in it from beginning to end we no longer see the record at the end of the table, where we think we added it.   If we added the record to the table it is still there, just ordered for display by the primary key.  

 

That records are displayed in table windows by primary key when the scrolling cache is reset may tempt us into the kludge of using the primary key mfd_id as a means of sorting tables by default.  But that is an unreliable kludge only good for small tables that are not much used or edited.  As  records are added and deleted there is no guarantee either manually-specified or auto-incremented primary keys will stay in the order desired.   So, rather than make the mistake of assuming something that should not be assumed,  in smaller tables we should sort on column heads as described above, and for bigger tables we should use the SQL tools provided to guarantee the sort order desired.   If the problem is lack of SQL skills the solution is not to thrash about with kludges but to learn to use basic SQL, which is easy to learn with a bit of focus.

Data Types

As with most database systems, fields in Manifold have standard data types, with very many different types available for use.   Compared to Manifold products prior to those using the Manifold engine there are new types for x2/x3/x4 vector types for each base numeric type plus new types such as uuid and tile.

 

There are two types within Manifold that get a lot of use in GIS applications and spatial data: tiles, which are used for raster data, and geoms, which store vector data.

 

Tiles store rectangular regions of pixels of the same type, plus a boolean mask for each pixel. Tiles are used for pixel-based, raster data as used in images and terrain elevation surfaces.    In a table that provides data for an image each record provides a tile for some rectangular portion of the image.   An image component knows from the properties of that image how to assemble the tiles from a particular table's records into an image by using the X and Y field values that give the intended position for each tile.   See the Example: How Images use Tiles from Tables topic for illustrations.

 

Tiles are used within Manifold primarily for performance reasons, as computers and database systems can work  faster with pixels when using batch processing to operate on many pixels instead of each operation working on a single pixel.  For example, it is quicker to store or retrieve a single record that stores a 128 x 128 tile of pixels (containing over 16000 pixels) than it is to store or to retrieve 16000 individual records each of which contains a single pixel

 

Geoms are a Manifold data type for storing vector information within spatial context.  They are used within records that store information, such as defining shapes and locations, for objects that appear in drawings and maps.   There, too, performance is a key factor in the use of geoms because it can be far more efficient to store an entire object's worth of coordinates within a single geom in a single record, potentially many thousands of coordinates, than to use a more decomposed storage where each coordinate is in a different record.

 

Geoms contain one of three basic geometry values that represent the three basic object types used in vector geometry in Manifold: points, lines or areas.  Areas are defined by their boundary lines, of which there can be multiple sets or branches within a single area to represent boundaries within boundaries, that is, islands or holes, in the same area object.    Lines and areas are composed of segments that can either be linear segments where each segment is a straight line, or curvilinear segments where each segment is a type of curve.  Curvilinear segments can be circle arcs, ellipse arcs or splines.  A line or an area can be made up of both linear segments and curved segments.   Operations that combine or analyze geometry will automatically linearize curves as necessary for geometric computations.

 

See the Data Types topic for a complete list of available types.

Vector Numeric Values

Vector numeric values , also known as composite numeric values, are pairs or triples or quads of numeric values of types XXXx2, XXXx3, XXXx4 respectively, where XXX can be either of the supported numeric types. for example, float32.   We refer to vector numeric values as x2, x3, x4, or, taken together, xN vector values. Vector values are very handy for representing points in 2D, 3D and 4D spaces, coordinates of rectangles, values in various color spaces like BGR, BGRA when adding an alpha channel to RGB color space, and so on.

 

Example:  Taking the numeric type float64 the xN types available are float64, float64x2, float64x3, and float64x4.   float64x3 is a data type consisting of a triplet of numbers each of which is a float64.

Indexes

Tables will often contain indexes of various types.   Tables without indexes are inconvenient: we cannot edit them and we cannot make selections in them.   Manifold therefore automatically creates an mfd_id unique identifier field along with an mfd_id_x index within tables that are brought into Manifold by dataports for most formats and most data sources.  On the rare occasion when a table is brought into Manifold or created without an index, it is easy to create an index in the table.   See the Adding an Index to a Table topic as well as the topics it in turn recommends.

 

BTREE indexes are used to store unique, ordered values. BTREEDUP indexes are similar to BTREE indexes but can also store duplicates. RTREE indexes are used to store geometry values in a form suitable for fast searches within rectangular regions.  Such indexes are known as spatial indexes and are easy to add: see the Example: Add a Spatial Index to a Table topic.

 

Manifold uses several different variations of RTREE indexes with various specific data sources.  All are similar but have different internal implementations.

"Deleting" mfd_id and mfd_id_x

In an ideal world all tables that are brought into Manifold would be imported with no modifications whatsoever, with no extra fields added.   In the real world we prioritize faster performance and greater reliability and ease of use.  When balancing logical rigor with ease-of-use Manifold makes the small compromise of automatically adding an identifier field that is guaranteed unique and non-NULL, mfd_id, to most tables with most dataports also creating a btree index on that field called mfd_id_x.  Adding those fields greatly simplifies life for users in many ways, for example, by greatly enhancing the efficiency with which drawings may be displayed.   

 

But doing so also changes the structure of an imported table by adding a field and possibly a field and an index to the schema of that table.   That can cause surprises and unexpected errors in things like copying and pasting between different databases if the mfd_id field is sometimes handled separately and not intuitively or sometimes not being handled separately.

 

To deal with such issues Manifold allows the mfd_id field to be writable so long as the new value is unique.  That allows, for example, specification of the mfd_id field during INSERT operations or changing the mfd_id field of an existing record during UPDATE queries.

 

Manifold also allows the mfd_id field and mfd_id_x index used by a table within the .map project to be "deleted," the word used in quotes because the deletion is a virtual deletion that just pushes the operation of such infrastructure down further into Manifold internal workings where it is hidden.   Once "deleted," that is, hidden, the mfd_id field and mfd_id_x index can be restored.

 

See the SQL Example: Using the mfd_id Field and mfd_id_x Index topic.

Computed Fields and Constraints

Manifold tables can contain fields that are computed on the fly.  These fields are called computed fields.  Computed fields compute values based on an expression using SQL syntax, for example, [population] / 1000  or  TileAbs([Tile]).    Computed fields can also be styled by right-clicking the column header for the computed field and choosing the Style command.

 

For example, a table may have a field that stores data as tiles and the table may also have a computed field which takes tile data from the first field and converts color values in the tiles into gray scale values. Whenever we change data in the first field, the system automatically will recompute the second field.  See the Example: An Image using Computed Fields in a Table topic for step-by-step example of an image created from a computed field.  See the Example: Add a Computed Field to a Table topic for a step-by-step example of how to add computed fields.    

 

Computed fields allow specifying one or more statements to prepare the execution context for an expression which computes values. This allows expressions for computed fields to use functions and scripts.   Each computed field uses its own execution context: functions and values defined in the context of a computed field are inaccessible from other fields.

 

Constraints are related to computed fields in that they also are expressions computed on the fly.  Primarily used for data integrity checks, constraints are  expressions which must evaluate to a boolean value of true for all table records.   Every time we try to insert a record into a table the system evaluates all constraints for the data about to be inserted and allows the insertion only if all constraints are satisfied. The system also checks constraints whenever we try to update a record.   If the attempted update fails one or more constraints it is rejected in full and the record is not modified.    See the Example: Create a Table with a Constraint topic for a step by step look at constraints in a table.

 

Like computed fields, constraints also allow specifying one or more statements to prepare the execution context for an expression which accepts or rejects records.  This allows expressions for constraints to use functions and scripts.   Each constraint uses its own execution context: functions and values defined in the context of a constraint for a given field are inaccessible from expressions for constraints for other fields.

 

When loading a .map file for a project that contains tables with computed fields and constraints, the system will resolve expressions for those computed fields and constraints to allow successfully loading expressions from cross-referencing tables.  When loading a .map file containing expressions in computed fields or constraints that fail to load, for example, because they refer to tables that do not exist, such as tables within linked data sources that are no longer accessible, the system will nonetheless start up and will preserve all other data.   All existing data, including data in computed fields with expressions that failed to load, will be fully and safely accessible. Computed fields and constraints with expressions that failed to load then can be safely removed.

 

See the Computed Fields and Constraints topic.

Nulls

Values stored in table records can be NULL.   To change the value of a cell to NULL, right click onto the cell in a table window and choose Set to NULL.  When adding records to tables, for a new record all editable fields are NULL by default.

 

Note that in Manifold empty strings are not NULL.   If we double-click into a text field for a new record to start editing but then we press Esc to cancel the edit box or we press Enter without making any changes, the system will leave unchanged what was there before we double-clicked into the cell to start editing: a NULL.

 

If we start editing again and keyboard some text and then press Enter that text will go into the field.  If we now edit that text by double-clicking  into the cell and then deleting all of the text and then pressing Enter the result in the field will be an empty string, not a NULL. If we want to reset the field to NULL we can do so by right-clicking onto the cell and choosing Set to NULL

 

The behavior of NULL values can be counter-intuitive.   Although different databases tend to handle NULLs differently, Manifold tries to stick to the notion of a NULL value denoting an unknown value which one day might be specified or which might remain unknown.  Doing that means NULLs just get skipped as if they weren't there to begin with. For example, most aggregates skip NULLs and NULLs are normally only handled for special cases such as First, Last or Count(*).

 

One interesting consequence is that NULL Xor TRUE returns NULL, but JoinXor() on two records with NULL and TRUE returns TRUE.  That is similar in its apparent counter-intuitiveness to how NULL + 1 returns NULL but Sum() on two records with NULL and 1 returns 1.

Some guidelines on NULL values:

 

Reserved Names

All names starting with mfd_ (case not significant) are reserved for use by the system.  That is literally all names, including names for fields, indexes, constraints, tables, components other than tables, properties... everything:  Do not name anything beginning with mfd_ or MFD_ or in any upper or lower case combination of those characters.    mfd_id is a reserved name for a field in tables, to be used only for the identity field that Manifold maintains as an automatically unique value field.

Maximum Table Sizes

Tables in a Manifold project can be stored either within the .map project file or, if they are linked into the project, stored outside of the .map file within some external data source.  See the Importing and Linking topic.   The general rule is that Manifold can work with all such tables the same, whether they are stored within the .map or within an external data source.   But there are some differences, like possibly different maximum table sizes.

 

Different data sources will allow different maximum table sizes.  Any single table stored within a .map file is limited to 2 billion records (with effectively unlimited size of each record), but that is a limitation of .map project file storage and not a limit of Manifold.   When Manifold stores tables in external data sources such as Oracle,  a table can have many more than two billion records, up to whatever is the limit imposed by that data source.

 

Some data sources, for example, .mdb file databases using Microsoft Access MDB format files, do not have a maximum number of records but instead a maximum data size for tables, such as 2 gigabytes.    Storing large records means the maximum data size for the table will be encountered with fewer records than if the records stored are small.  Such limits are not Manifold limits but are limits of the external format.

Limitations

256 GB max size for changes to data - .map files can work with data far larger than 256 GB when that data is stored in external data sources, such as an enterprise class DBMS.  Although there is no theoretical limit to the size of data Manifold can work with on the desktop, a hard limit on the amount of data that can be changed in a single operation is 256 GB.   For example, if we have a single image that is 1 terabyte in size we can display that and manipulate it, but if we want to change all of the values in all of the pixels in the image we will have to do that in four operations, each up to 256 GB in size.   The 256 GB limit is the current, somewhat arbitrary, setting for temporary caches utilized internally by Manifold.   This limitation will be increased in future builds.

 

2 billion records in .map tables - Individual tables stored within the .map file cannot have more than two billion records in the table.  Each individual record can be extremely large, allowing terabytes of data per table, but at the present time there can be no more than two billion records per table.   This limitation will be increased in future builds.   When Manifold stores tables in external data sources such as Oracle,  a table can have many more than two billion records, up to whatever is the limit imposed by that data source.

 

2 GB limit on text strings - an individual text string of any kind cannot be larger than 2 GB.  Some datasets that use JSON, a text format, to store large objects might run into this limitation.   This limitation will be increased in future builds.

 

See the Limitations topic for a current list of limitations.

 

One More Thing...

Tech tip: Have we mentioned that tables are editable available only if there is btree or similar index in a table?   See the Adding an Index to a Table topic.

 

Notes

All Data is in Tables - We are accustomed from DBMS applications to click open tables and to see them in some sort of row and column table browser, and we are accustomed from applications like graphics editors to click open images or CAD drawings and to see them as something other than tables, as photographs or other images, for example.  Classic GIS applications will also often encapsulate geometry data for the objects shown in drawings within the drawing, and will use tables only for "attribute" data for each object.  The geometry data for the object is somehow hidden within the drawing and, other than being displayed in the drawing, might be accessible only through programmatic means.

 

So it could be that when we first click open an image in Manifold and see it as a photograph we might think, "Hey... that's not a table, that's a photo."    But the data for that image really indeed is stored in a table somewhere in Manifold.   The image seen in a window is simply that table's data visualized by  the display handler of the image window, which knows how to interpret the table's data to present it for us in visual form.

 

If we doubt the data is coming from a table we can right click on the image component and choose Properties to see the name of the table that stores the data for the image.  If we open that table we can see the records which contain the data for the image, most likely in the form of tile data within each record.  It is the same with objects in drawings, where the drawing window is just a means of visualizing geometry data in the table.

 

That Manifold explicitly exposes such data as a field in a table is a bit different than how systems like an Adobe graphics editor or a classic GIS might operate, but providing explicit and easy access to that data greatly expands our ability to utilize that data for fun and profit.  For example, given a table with geometry data in the form of a Manifold geom data type we might want to change that data into some different form, such as geometry data in GeoJSON form or Well Known Binary (WKB) geometry form.  When we can see the data as a field in a table such transformations are trivially easy with a few clicks in the Transform pane.

 

When geometry data is in plain sight at our fingertips in a table we can also use SQL to manipulate it, to analyze it, to slice and dice it, to transform it and so on.   Such processes become far easier and more flexible than when data is stored in some secret form within a drawing where only those facilities thought of in advance by the product's designers can get at it.

 

Really Big Tables - Tables in Manifold can be very big data, so big that it does not make much sense to use a table window to manipulate such truly enormous table data.  A famous thread in the georeference forum discussed a LiDAR point cloud data set that contained 1.72 billion records in the table.   How big is a table that shows 1.72 billion records?  If we displayed the table in a series of screens where each page full of records was the height of a typical computer monitor screen the total length of the table would be over 8600 km, or about 1.35 times the radius of the Earth.   That is such a large table that no amount of interactive viewing of the table would show anything more than the tiniest fraction of the table.   Such large tables can be handled with SQL or programmatically, or they can display their contents in drawing or image layers where most of the data will not be displayed as zoomed out views are averaged down and simplified, but they cannot be productively browsed with interactive table windows.

 

Limitations - Manifold at present is limited to two billion records in a single table.  Each individual record can be extremely large, allowing terabytes of data per table, but at the present time there can be no more than two billion records per table.   That limit is an arbitrary setting on temporary caches and other internal structures.  It could be increased in future updates should user demand require.  

 

Styling fields in Tables - We can change the format, that is, the style, for how fields are displayed in tables by right-clicking the column head for the field and choosing Style.  That allows us to choose different ways of displaying dates and times, or different styles for latitude and longitude numbers, such as decimal degrees or degrees, minutes, and decimal seconds.  See the Styling Table Fields  topic.

 

Sorting and Collation -  When sorting a table window by clicking on a column handle there is no option to specify collation.  If we would like to specify collation, we should use an SQL query using SELECT ... ORDER BY.

 

Sorting and Slow Data Sources - Table windows showing tables in fast data sources will fill before we can do anything.   However, when filling table windows from slow data sources the table window might not be filled before we manage to click a column handle to command a sort.  Clicking a column handle to sort a table window that is in the process of filling will stop filling the window.

 

Autogeneration of Queries - For an "inside look" into how components can be created from tables by using SQL we can take advantage of Manifold's query window.  Choose View - New Command Window - SQL to open a command window for queries.    The topmost pane is where we would write query text, but that topmost pane has intelligence of its own.  We can Copy a component (for example, an image or a drawing) in the Project pane and then Paste into the query text pane and Manifold will write in the query pane the SQL statements which generate the pasted component.   In fact, we can Copy  more than one component from the Project pane and Paste them into the query text pane and Manifold will generate multiple CREATE statements for us.  See the Examples in the Queries topic.

 

Connection Prerequisites - Connections to data sources may depend upon having necessary .dll files on our system or other non-Manifold considerations.   For example, like other Manifold products Manifold System utilizes Microsoft routines to connect to Microsoft formats such as .mdb, which can lead to some 32-bit / 64-bit integration issues depending on the vintage of our Windows system, what other Microsoft software we've installed and so on.  The rule of thumb when running 64-bit systems is to launch Manifold in 32-bit operation when connecting to Microsoft "Jet" formats such as .mdb, .xls and so on, to ensure that the predominantly 32-bit orientation of Microsoft routines to connect to Access-style file databases won't cause the connection to fail to work on a 64-bit system.  

 

Other database systems may also have their integration quirks to enable connections.  For example, to connect to .sqlite file databases we must have the required sqlite3.dll files either in the execution PATH for our system or in the same folder from which the manifold.exe file which we use to launch Manifold is located.   Put the 64-bit sqlite3.dll file in the ~\Bin64\ folder along with the 64-bit manifold.exe executable and the 32-bit sqlite3.dll file in the ~\Bin\ folder along with the 32-bit manifold.exe executable.

 

Compiled commands - Compiled commands, as used by the Manifold query engine in EXECUTE statements, are used not only in OLE DB and ODBC data sources but also for many others, for example, ADO.NET, MySQL, Oracle, PostgreSQL and SQLite.

 

MTA - OLE DB data sources use MTA (Multi-Threaded Apartment threading model) whenever possible for better performance.  Also greatly helping performance is that fetching large amounts of data from a remote database need not be done in pages.

 

Testing for NULLs -  Manifold tests for a null using the IS NULL operator. Many DBMS applications also have operators such as IS TRUE, IS FALSE and sometimes IS UNKNOWN, which is more or less synonymous to IS NULL.    Manifold does not use IS TRUE and IS FALSE since they are way too redundant given that Manifold's equality operator works with boolean values.    For example, instead of <something> IS TRUE we can use <something>  = TRUE or just plain <something>.   Instead of  <something> IS FALSE we can use <something> = FALSE or NOT <something>.  

 

Widths in printer's points - Why are the widths of columns in tables specified in printer's points as a unit of measure?  Tables display values using fonts that are specified in printer's points, with displays and printouts normally scaling to show those fonts in reasonably accurate real-world sizes.   Setting the width of columns using the same units of measure allows table column sizes to scale the same way as the fonts they contain.

 

Read-only data - The Layers pane recognizes when the data it displays is read-only, and disables controls and commands that cannot be used with read-only data. Temporary layouts and temporary maps are always writable. Tables and queries always appear writable with changes to tables on read-only data sources being kept in the window and being discarded after the window is closed.

 

Automatic generation of intermediate levels when pasting into tables - Pasting an image table with an RTREE index on the x-y-tile field automatically generates data for intermediate levels.

 

Centre - The discussion of nested data sources refers to the Centre region of France.  More accurately the name of the region is  Centre - Val de Loire, but it is often referred to as simply Centre for short.

See Also

Getting Started

 

User Interface Basics

 

Sorting Columns

 

Filters

 

Editing Tables

 

Data Types

 

Styling Table Fields

 

Indexes

 

Queries

 

Drawings

 

Copy and Paste between Drawings or Tables

 

Images

 

Labels

 

Info Pane

 

Computed Fields and Constraints

 

Table Windows and Big Data

 

Schema

 

File - Create - New Drawing

 

Edit - Join

 

Edit - Export Results

 

Export Selected

 

View - Filter

 

View - Order

 

Examples - Do not miss!  Browse through the many examples for step by step tutorials.

 

Example: Closing without Saving - An example that shows how File - Close without saving the project can affect local tables and components differently from those saved already into a data source, such as an .mdb file database.

 

Example: Select a Range of Records in a Table - We can select or deselect a range of records between any two records in a table window using fast and easy mouse commands.

 

Example: Drawings use Geom Fields in Tables  - An essential discussion on how drawings are created from geom fields in tables, including how the drawing knows which coordinate system to use.

 

Example: Multiple Drawings from the Same Table - Illustrates how easy it is to create multiple drawings that use the same table and same geometry by copying and pasting an existing drawing.  Each new drawing takes no additional storage space in the project, but can be formatted differently.   

 

Example: Add a Computed Field to a Table - How to add a field to a table.  We first set the values for the field in a static way and then we show how to set the values for a field dynamically with a computed field.

 

Example: Create a Table with a Constraint  - Create a simple table that includes a simple constraint upon one of its fields.  

 

SQL Example: Create a Table with a Constraint via a Query -  A simple example using SQL to create a table with a constraint.

 

SQL Example: Force an Anomaly in Constraints - Constraints are only evaluated when we insert or update records.   If a constraint refers to external data, such as the values in a different table, we can force an anomalous condition where the table with the constraint may contain data that no longer meets the requirements of the constraint.

 

SQL Example: Using the mfd_id Field and mfd_id_x Index - A sequence of SQL examples of working with the mfd_id field and mfd_id_x index.