Indexes

Like most database systems, Manifold uses indexes to improve performance, both within Manifold native tables and also taking advantage of indexes that may be available within any external data sources that might be linked into a Manifold project.    An index is a data structure that provides greater efficiency and speed for many database operations.  An index on a field helps the database system find specific values in that field faster.  

 

The performance gain from using an index is so great that Manifold requires tables to have an index (normally a btree or btreenull index) to be editable and to support Manifold's interactive selection capabilities (a different thing than using SELECT in SQL, which does not require an index).   We can create indexes on fields in tables using the Schema dialog.

 

In general, if a field plays an important role in a query it should have an index on it.  For example, key fields for JOINs or in the Join dialog should have an index on them.   If we write a query where for a billion records we want to check if a particular field has a value within a given range, that field should have an index on it.  If we write a query involving larger data, and the query runs unusually slowly, processing only a few records per second, despite appearing to be a simple query, that often indicates a lack of an index on one or more fields that play important roles in the query.

 

Must read:  Please make sure to read the Add an Index to a Table topic as well.  That topic covers the mfd_id field used within Manifold for indexes as well as other important info. See also the Collations topic to learn how mechanisms for managing order in different languages, called collations, are managed by Manifold.

Index Types

The default index created by Manifold is a btree index, a classic balanced tree, B-tree, data structure which utilizes a single field in which there can be no NULLs and no duplicate values.  Indexes can also be created using more than one field.  We can even create an index on a computed field, so long as the table for the computed field is stored in the .map project (and not in an external database).

 

There are four forms of the basic btree index, referred to collectively as BTREExxx indexes, which cover the four possibilities for whether NULLs are allowed or not allowed and whether duplicates are allowed or not allowed.

 

Manifold uses rtree spatial indexes for spatial work with geometry fields in drawings and tile fields in rasters.    Rtree indexes are balanced tree structures utilizing bounding rectangles, the R in rtree.  In general, if a drawing's table does not have an rtree index on the geometry field the drawing will not be able to display the objects.

 

btree

A classic B-tree, that is a balanced tree, index.  A btree index allows no duplicates and no NULLs.

btreedup

A B-tree index that allows duplicates but does not allow NULLs.

btreedupnull

A B-tree index that allows duplicates and also allows NULLs.

btreenull

A B-tree index that allows NULLs but does not allow duplicates; that is, it requires all non-NULLs to be unique.

rtree

A spatial index that is a balanced tree structure utilizing bounding rectangles or boxes.

 

A BTREExxx index sorts values from lowest to highest, is generally applied to numbers, strings, and dates, and helps optimize searches for operators such as = as well as operators such as  IN, or  BETWEEN AND, ome other operators.

 

An RTREE index sorts values spatially and can only be applied to geometry values.  An RTREE index helps optimize searches for functions specific to the database, for example, functions such as GeomAdjacent, GeomContains,  GeomIntersects,  GeomTouches,  GeomWithin, and other functions.

Indexes are Exposed in Results Tables

Whenever possible, indexes from input tables are exposed, that is, available for use in subsequent steps, in the results tables of various SQL statements.   This makes writeback possible, that is, the editing of records by editing results tables and it improves performance in many ways:

 

 

Editable Results Tables

In most other database systems we normally think of results tables produced by an SQL query as read only, but as can be seen from the list above, in Manifold results tables are often editable.  If the results table includes an indexed field, as is often the case with a SELECT or a JOIN statement, we can edit that results table and the edits made to field values automatically will be written back into the source tables.   See the Editable Results Tables  topic for details.

Tips for Using Indexes

When is it useful to add an index?   Consider a query such as:

 

SELECT t.value1, u.value2

   FROM t, u

   WHERE t.id=u.id:

 

We assume all fields are numbers so we consider only BTREExxx indexes. We can then say:

 

 

 

 

 

The answer to the general question, "Which field needs which index?"  depends on the queries we are running and the nature of our data.  That is a big topic on which books can be written (and have been written), but a quick and imprecise answer is that if we have a field which we primarily use to identify records and to refer to records from other tables, for example, an ID field or a RECORDNUM field or something similar, and if we are using that field in constructs such as WHERE t.ID = <another value possibly from another table>, or a field involved in an ON condition for a JOIN,  we will probably want to add a BTREExxx index on that field.

 

If the values in the field cannot contain NULLs, we can use that to guide the index type we choose, for example, using a BTREE or a BTREEDUP index.  If the field can contain NULLs we instead would use a BTREENULL or BTREEDUPNULL index. If values in the field cannot be duplicates, we would choose either BTREE or  BTREENULL, and if they can contain duplicates we would use either BTREEDUP or BTREEDUPNULL.

 

If we are doing spatial searches on a particular field, then probably we would build an RTREE index on that field.

 

Notes

More than one field in an index - BTREExxx indexes can be built on more than one field.  All BTREExxx indexes built on more than one field support searches by partial key.

 

Partial keys, indexes and WHERE - WHERE optimizes for conditions which engage BTREExxx indexes via a partial key.   For example, SELECT ... WHERE a = ... runs faster if the source table has a BTREExxx index with a composite key built on a and other fields, with a coming first in the composite key.

 

RTREE indexes and spatial searches - RTREE indexes can use a secondary box as a hint for spatial searches, when the query engine exposes an RTREE index on a join that already filters by a spatial criteria.

 

Spatial indexes and tiles - Spatial indexes for tiles support a number of operations which help the user interface, such as computing bounds limited to visible pixels, computing approximate statistics, and so forth.   Spatial indexes for tiles are supported in ,map files and in file dataports that store data in MAPCACHE files.

 

Selection or SELECT?  Manifold allows interactive selection of records in a table using point and click commands or using the Select pane.   That is a different thing than the use of the SELECT statement in SQL.  See the Selection topic.     SELECT works with any table whether or not there are indexes in the table.   Selection requires the table to have an index.

 

Indexes or indices?   Both are correct in English language as the plural form of index, with indexes more popular in the US.  Computer people tend to use indexes, presumably due to the influence of Americanisms within computing.   Given this documentation was written by many hands, some under the influence of American English and others influenced by UK English, both forms may appear.

See Also

Tables

 

Data Types

 

Queries

 

Collations

 

Schema

 

Command Window

 

Command Window - Query Builder

 

SQL Statements

 

SQL Functions

 

SQL Operators

 

COLLATE

 

Example: Create a Table with a Constraint - Create a simple table that includes a simple constraint upon one of its fields.  

 

Adding an Index to a Table - A basic topic on adding an index using the built-in capabilities of the mfd_id field.

 

Example: Add a Spatial Index to a Table - A typical use of an index is to provide a spatial index on a geom field in a table, so the geom data can be visualized in a drawing. This topic provides the step by step procedure for adding a spatial index.

 

Example: Add a UUID-based Index to a Table - Create a new computed field that is filled with UUID values on creation and then create an index on that field.   This technique creates an indexed field that has guaranteed unique values for all records and thus the indexed field and record values may be used in other projects.