Temporary Databases

This topic provides a tutorial introduction to the use of temporary databases in a project.   The Manifold query engine supports temporary databases that live for the duration of the query. This is very useful for complex transforms and for many other things, such as internet map serving.  SQL statements such as CREATE ROOT <root>, USE ROOT <root> , CREATE DATASOURCE <name> AS ROOT, and DROP ROOT <root> work with temporary databases.  

 

Running a query that creates a temporary database, puts some data into it, and then returns a table composed from data in the temporary database is completely valid.  Even though the query is finished, the temporary database and all its tables will live until the table returned by the query is no longer needed, for example, if whatever executed the query stops reading the result table.

 

Following are code examples that, beginning with a new, empty project, should be run one after another in the same session in a Command Window:

 

We execute all of the statements in order by pressing the ! key.

 

CREATE DATASOURCE d (PROPERTY 'Type' 'manifold');

CREATE TABLE d::t ([mfd_id] INT64, INDEX [mfd_id_x] BTREE ([mfd_id]), a INT32, b INT32);

INSERT INTO d::t (a, b) VALUES (1, 1), (2, 5);

SELECT * FROM d::t;

 

The first statement creates a new data source component named d.  If we like, we can open d in the Project pane to see the table t. The type manifold means that we are creating a Manifold format .map file. We don't have to supply a connection string to data sources of type manifold, if we try to open such a data source with an empty connection string, Manifold will create a new temporary file.

 

     

 

Subsequent statements in the above SQL create a new table in the newly-created data source, fill it with some sample values, and then SELECT those values.

 

 

We continue onward in the same command window session:

 

A convenient way of continuing in the command window is to highlight the statements to be run and to then press Alt-Enter

 

USE CHILD d;

UPDATE t SET b=b+10;

SELECT * FROM t;

 

The USE CHILD statement descends into d and makes it the current data source for the query session running in the command window.

 

The next statement updates values in the table t using an example calculation.

 

The third statement SELECTs the contents of t.

 

 

Note in the above that after we made d the current data source, we refer to the table as just t and not d::t.  If we open the table in the Project pane the title bar refers to it as a part of d since in that case we are opening t from outside the command window session in which we descended into d as the CHILD.

 

 

Next, we create a temporary database, again in the same command window session:

 

 

CREATE ROOT x;

USE ROOT x;

SELECT * FROM [mfd_root];

 

The first statement creates a new temporary database and gives it a name, x, so we can refer to it.

 

The second statement switches to that temporary database. We are now inside the temporary database.

 

In the third statement we list all components by querying [mfd_root].   In the temporary database called x there are only the [mfd_root] and [mfd_meta] tables since the database is empty.

 

 

 

At the beginning of these code examples we created a datasource d that is a .map file.  From the temporary database, how do we get to the data in the .map file that we see in the Project pane as datasource d?   To do that we have to establish a link from the temporary database into the .map file. This is done using data source components:

 

 

CREATE DATASOURCE [mapfile] AS ROOT;

 

In the above we do not specify any values for the properties: we simply use AS to signal that the query engine has to somehow establish a link on its own, and then we use ROOT to specify where we want to link.   

 

We can now use the data from the .map file, put it into the temporary database, alter it there however we want, and then perhaps put the result back into the .map file.   

 

For example:

 

 

 

SELECT * INTO t FROM [mapfile]::d::t;

INSERT INTO t (a, b) VALUES (5, 5), (6, 6);

SELECT * INTO [mapfile]::[t_rebuilt] FROM t;

 

We can open t_rebuilt in the Project pane and see that it contains updated data consisting of two new records.

 

Example

Download the temp-databases.mxb project from the Downloads page.    The project has a table of States with names and a table of Roads where each road refers to a state by its name.  In both tables, the only indexes are on an ID field and on a Geom field.

 

We can write a join as follows:

 

SELECT * INTO temp_1 FROM [States]

  INNER JOIN [Roads] ON [States].[Name] = [Roads].[StateName];

 

Since neither field in the ON has an index built on it, the join would be slow if the tables were large tables.  

 

For faster operation with larger tables,  we can add indexes to the source tables.   One index in one of the source tables on a field used in the ON would be enough in this particular case.    Suppose, however, we cannot do that because the tables are read-only, or suppose we do not want to add an index or otherwise modify the tables because the tables are not ours.   In such circumstances we might not even have the option to make a copy of one of the source tables and add an index to that copy.  

 

In that case, for speed we can use a temporary database as follows, with comments added to explain various steps:

 

CREATE ROOT x; -- create temporary database

USE ROOT x; -- switch to temporary database

CREATE DATASOURCE d AS ROOT; -- link MAP file as data source 'd'

SELECT * INTO [States] FROM d::[States];

  -- copy table from MAP file to temporary database

ALTER TABLE [States] (ADD INDEX [Name_x] BTREE ([Name]));

  -- add index on name in copied table

SELECT * INTO d::temp_2 FROM [States]

  INNER JOIN d::[Roads] AS r ON [States].[Name] = r.[StateName];

  -- perform join between copy of first table in temporary database

  --   and second table that remains in MAP file

  -- write result table into MAP file

 

Notes

'Database' vs 'data source' - In Manifold documentation and build notes, these two terms are frequently more or less the same thing. Data source is a collective name for files,  databases, or web servers which provide data.   When we start Manifold and create a new .map file, we set that .map file to be an unnamed root data source, 'root' because it is a root of a potential data source tree. When we link a SHP file, we do so by creating a new data source component, that is, a new data source.   That data source behaves like a database in that it exposes data through tables, and it has system tables with reserved names which describe its contents. The object representing an opened data source for a script is usually called Database, and in other ways data sources are treated as databases within Manifold.

See Also

Tables

 

Functions

 

Queries

 

Regular Expressions

 

Command Window

 

Command Window - Query Builder

 

SQL Statements

 

SQL Functions

 

SQL Operators

 

Temporary Databases