Functions

The Manifold query engine supports declaring and calling functions, including script functions, functions in external files and compiled functions as .NET assemblies.  This topic discusses declaring and calling functions.   For notes on functions built into the Manifold query engine, see the SQL Functions topic.  

 

The Manifold query engine allows declaring and calling functions within queries given the following basic characteristics:

 

 

In its simplest form a function is just a named expression. There can be many reasons for making an expression into a function but one of the most common reasons is to provide a descriptive name for some action.  Another common reason is to make sure we use the exact same formula or logic in different places in a query, to avoid keyboarding errors.  

 

If we need to use an expression several times in a query, giving it a simple, short name by declaring it as a function allows us to then utilize that expression without risk of errors introduced by slight variations in the expression we write.  Declaring a long expression as a function with a short name also can save a lot of keyboarding if we use that expression more than once in a query.

 

Query Functions

A query function is a function returning a table that can be either a table name, a table parameter, a CALL, or within parentheses a SELECT, EXECUTE or VALUES clause.   Some examples:

 

Example - a table name:

 

FUNCTION f() TABLE AS mfd_root END;

SELECT * FROM CALL f();

 

Example - a table parameter:

 

FUNCTION f(p TABLE) TABLE AS p END;

SELECT * FROM CALL f(mfd_meta);

 

Example - a CALL:

 

FUNCTION f() TABLE AS CALL ValueSequence(0, 10, 1) END;

SELECT * FROM CALL f();

 

Example - an EXECUTE clause within parentheses:

 

FUNCTION f() TABLE AS (EXECUTE CALL ValueSequence(0, 10, 2)) END;

SELECT * FROM CALL f();

 

Redefining Functions

Using the FUNCTION statement with the name of a previously defined function will redefine that function. The redefined function can have a different prototype from the previously defined one,   For example, we can redefine a function that takes no arguments and returns INT32 into a function that takes two NVARCHAR arguments and returns NVARCHAR.

 

Example: Declaration and use of a basic math function:

 

FUNCTION Square(x FLOAT64) FLOAT64 AS x * x END;

 

SELECT [result] AS [value], Square([result]) AS [square]

FROM CALL ValueSequence(0, 10, 1);

 

Example:  Using SQL Server Replicate:

 

This function repeats a character using SQL Server's Replicate function.   The parentheses around the EXECUTE statement convert a table returned by EXECUTE into a single value in that table):

 

FUNCTION Rep(x NVARCHAR, y INT32) NVARCHAR

AS (EXECUTE [[ SELECT Replicate(@p@, @q@) ]] ON [sql]

WITH (p=x, q=y)) END;

 

SELECT [result] AS [value], Rep('z', [result])

AS [rep] FROM CALL ValueSequence(0, 10, 1);

 

Example: A Function that Returns a Table:

 

The following displays a line from the mfd_root table where the value of the field mfd_id is 2.

 

FUNCTION f(p INT32) TABLE AS (SELECT * FROM mfd_root

WHERE mfd_id=p) END

 

EXECUTE CALL f(2);

 

To display the value of the field name from that line we can use  the statement:

 

SELECT name FROM CALL f(2);

 

Example: A Function that Takes a Table as a Parameter:

 

To display values from the field mfd_id in the mfd_root table:

 

FUNCTION f(p TABLE) TABLE AS

(SELECT mfd_id FROM p) END

EXECUTE CALL f(mfd_root);

 

To display values from the field mfd_id in the mfd_meta table we could use:

 

EXECUTE CALL f(mfd_meta);

 

Example:  Passing a table to a function:

 

FUNCTION f(t TABLE) NVARCHAR AS

(SELECT value FROM t WHERE property = 'Folder') END

SELECT name, f((COLLECT property, value))

FROM mfd_meta GROUP BY name;

 

...another example...

 

FUNCTION f(t TABLE) TABLE AS t END;

SELECT SPLIT CALL f((COLLECT [Name]))

FROM [mfd_meta]

GROUP BY TRUE;

 

Example: Returned Tables with Different Schemas:

 

Using a function to display part of a table that has three fields, mfd_root:

 

FUNCTION f(p TABLE, x INT32) TABLE AS

(SELECT * FROM p WHERE mfd_id>=x) END

EXECUTE CALL f(mfd_root, 2);

 

Using the same function to display part of a table that has four fields, mfd_meta:

 

EXECUTE CALL f(mfd_meta, 2);

 

Scripted Functions

Functions can be much more than simple expressions:  they can be written using a script, in which case they are called script functions. This really opens the door to doing all types of custom analysis.   Script functions used in SQL can take table arguments, which allows building custom aggregates, and

 

Script functions used in SQL also can return table results, which allows building custom splitters.  When script functions return table results, the Manifold query engine has to know in advance the structure of the returned table, that is, the schema. To accomplish this, the query engine uses a secondary script function which must return an empty table with the structure that will be used by the function that returns the actual data.  The name of that secondary script function is specified via the ENTRYSCHEMA keyword, by default set to <entry>Schema. If the original script function takes other tables as arguments, the secondary script function is passed model tables so it can decide what the output table is going to be based on what the argument tables are.

 

VBScript Examples

 

The following VBScript script provides an example of a function for queries.   We could omit Main, but it is useful to still have it and either do nothing as in this example or do a test call:

 

'VBScript

Function Square(p)

  Square = p*p

End Function

Sub Main

End Sub

 

If the script above is called vbs, the following query calls the function Square:

 

FUNCTION Square(x FLOAT64) FLOAT64 AS SCRIPT [vbs];

 

SELECT [result] AS [value],

Square([result]) AS [f]

FROM CALL ValueSequence(0, 10, 1);

 

For each parameter value, there is an implicit CAST to the type declared by the FUNCTION statement.  There is also an implicit CAST to the type that is the result of the function.   If we run the following query example we can see how calling Square() as declared will discard the fractional part of the argument:

 

FUNCTION Square(x INT32) FLOAT64 AS SCRIPT [vbs];

 

SELECT [result]+0.5 AS [value],

Square([result]+0.5) AS [f]

FROM CALL ValueSequence(0, 10, 1);

 

Functions in .NET Scripts

 

In .NET scripts function names are typically composite since every function is a member of some .NET class. To get to these functions, specify the complete name of a function using the ENTRY clause, for example:

 

// C#

class Script

{

static int Square(int x) { return x*x; }

static void Main() { }

}

 

FUNCTION Square(x INT32) INT32

AS SCRIPT [cs] ENTRY 'Script.Square';

 

SELECT [result] AS [value], Square([result])

AS [f] FROM CALL ValueSequence(0, 10, 1);

 

Same Function, Different Names

 

Manifold allows binding the same script function with different names.

 

For COM script engines that convert types on the fly, these bindings can use different types for parameters and / or return value, for example:

 

FUNCTION SquareDD(x FLOAT64) FLOAT64

AS SCRIPT [vbs] ENTRY 'Square';

 

FUNCTION SquareID(x INT32) FLOAT64

AS SCRIPT [vbs] ENTRY 'Square';

 

SELECT [result]+0.5 AS [value],

SquareDD([result]+0.5) AS [dd],

SquareID([result]+0.5) AS [id]

FROM CALL ValueSequence(0, 10, 1);

 

 

.NET script engines differ from COM script engines (like the COM script engine for VBScript)  in that .NET is strongly-typed and can have multiple functions with the same name but with different parameter types.  Keeping that in mind, Manifold determines what exact .NET function to use based on parameter types declared in the FUNCTION statement as well.

 

An example with function overloads:

 

// C#

class Script

{

static string Format() { return "no parameters"; }

static string Format(int x) { return x.ToString(); }

static string Format(double x, double y) { return string.Format("{0}:{1}", x, y); }

static void Main() { }

}

 

FUNCTION Format()

NVARCHAR AS SCRIPT [cs] ENTRY 'Script.Format';

 

FUNCTION FormatI(x INT32)

NVARCHAR AS SCRIPT [cs] ENTRY 'Script.Format';

 

FUNCTION FormatDD(x FLOAT64, y FLOAT64)

NVARCHAR AS SCRIPT [cs] ENTRY 'Script.Format';

 

SELECT [result] AS [value], Format() AS [f],

FormatI([result]) AS [fi], FormatDD([result],

[result]+100) AS [fdd] FROM CALL ValueSequence(0, 10, 1);

 

Note:   It's true that some .NET languages might give the appearance that .NET is not strongly-typed at all, but such languages are implemented using additional levels of logic on top of the same, strongly-typed .NET foundation.

 

Script Functions in External Files

 

Script functions can be stored in external files. They can also be stored in compiled form as .NET assemblies.

 

To use a script function stored in an external file, use the FILE 'file' construct instead of a reference to the script component in the FUNCTION statement. The external file containing the script function must be located within the Manifold installation folders, using the same logic for location of files as applies to where add-ins must be placed, such as ~\Bin for 32-bit, ~\Bin64 for 64-bit, ~\Shared and so on.

 

Example of calling a function stored in a compiled assembly:

 

FUNCTION F(x FLOAT64) FLOAT64

AS SCRIPT FILE 'math2.dll' ENTRY 'math.Var.F';

 

SELECT [result] AS [value], F([result]) AS [f]

FROM CALL ValueSequence(0, 10, 1);

 

Examples

Example: Redefining Functions

In the command pane of a Command Window we will execute a series of FUNCTION commands that define and then redefine functions.   We also will execute SELECT statements that use those functions.   The purpose of this sequence of commands is to explore how functions can be defined, used, redefined and reused.

 

FUNCTION f(name NVARCHAR) NVARCHAR

AS StringToUpperCase(name) END;

 

SELECT name, f(name) FROM mfd_root;

 

We have defined a function named f which converts a string to upper case, and used that in the  SELECT statement.

 

FUNCTION f(name NVARCHAR) NVARCHAR

AS '[' + name + ']' END;

 

SELECT name, f(name) FROM mfd_root;

 

New we have redefined f so that instead of converting to upper case the function adds square brackets at left and right.   We then used the new definition for f in another SELECT statement.

 

We continue by defining another function.

 

FUNCTION g(name NVARCHAR) NVARCHAR

AS StringToUpperCase(f(name)) END;

 

The above command defines a function named g, which calls the function f and converts the result from f to upper case.  When run as a separate statement this definition of g returns a table with a single cell.

 

SELECT name, g(name) FROM mfd_root;

 

The SELECT statement above uses the function g.  Note that each call to g in turn generates a call to function f, a function which adds square brackets.

 

Let's continue by redefining f:

 

FUNCTION f(name NVARCHAR) NVARCHAR

AS StringReverse(name) END;

 

The above redefines f so that instead of adding square brackets it reverses the string.

 

SELECT name, f(name) FROM mfd_root;

 

We can verify that f reverses strings by running a SELECT statement using f.

 

SELECT name, g(name) FROM mfd_root;

 

If we now run a SELECT statement using function g we see that g continues to use the old definition of f in which the function f adds square brackets. That old definition of f is no longer associated with any name  but g can still use that old  definition, because g has already resolved that old definition for f.

 

Note that g must use the old definition of f until g itself is redefined, because g  was defined using the prototype for the old definition of f.  Given that the new definition for f could have a different prototype (the new definition could have taken zero arguments, for example), in order for g to use the new definition of f we must redefine g  after we have redefined f.

 

One way to think of this is by analogy to how cache works: a function that is defined in terms of some other function can be imagined to have cached what it understands about that other function.  If that other function is redefined the cached understanding of it will linger within any other functions that use it until they too are redefined to use a new understanding, that is, the new definition, of that other function.

 

FUNCTION f(name NVARCHAR) NVARCHAR

AS StringReverse(g(name)) END;

 

Now we have redefined f to call g (which still uses the old definition of f) and to then  reverse the result.

 

SELECT name, f(name) FROM mfd_root;

 

Surprisingly, this works fine when f is used in a SELECT statement.

 

Note that the last version of f as redefined above calls g which in turn calls the definition of f as it remembers it, g's memory being the old version of f. The call stack, therefore, with function versions in brackets, is now:

 

f(4, reverse) ->

g(1, convert to upper case) ->

f(2, add brackets)

 

Let us do one more redefinition of f:

 

FUNCTION f(name NVARCHAR) NVARCHAR AS StringTrim(StringReverse(f(name)), '[]') END;

 

SELECT name, f(name) FROM mfd_root;

 

Finally, we now have redefined f to call its previous definition. The body of the new f can see the previous definition of f, so this works as well in the SELECT statement that calls f. The call stack now is:

 

f(5, reverse and trim) ->

f(4, reverse) ->

g(1, convert to upper case) ->

f(2, add brackets)

 

While it is cool that we can redefine functions and certainly exotic that a function can call an old, remembered version of itself it should also be obvious that deliberately coding the above into working code is a formula for confusion.   This example is more of an illustration of what not to do, in particular, not redefining functions that are called by other functions

 

Notes

tech_lars_sm.png

Functions and Data Sources

 

Consider the following code, run in sequence in a Command Window in a newly-created, empty .map project.

 

--SQL

CREATE TABLE t (a INT32);

INSERT INTO t (a) VALUES (1), (2), (3);

CREATE DATASOURCE d (PROPERTY 'Type' 'manifold');

USE CHILD d;

CREATE TABLE t (a INT32, b NVARCHAR);

INSERT INTO t (a, b) VALUES (2, 'abc'), (3, 'xyz');

 

This creates a table named 't' with some values, then creates a data source named 'd' and then creates a different table named 't' on that data source.

 

Now, continuing on in the same command window session, we create a function that selects data from 't':

 

--SQL

FUNCTION f(v INT32) TABLE AS (SELECT * FROM t WHERE a>=v) END;

 

If we run this function while the current data source is 'd':

 

--SQL

EXECUTE CALL f(2);

 

...we get values from the table 't' in that data source 'd'.

 

But if we run the same function while the current data source (see the Temporary Databases topic) is the original .map file:

 

--SQL

USE ROOT;

EXECUTE CALL f(2);

 

...we get values from the table 't' in the original .map  file.

 

That is, functions do not 'remember' data sources which were in use when they were defined.  When we call a function, it runs in the context of the current data source.

 

The only quasi-exception are functions defined in scripts.  When we define such a function using FUNCTION ... AS SCRIPT <script>, the query engine immediately retrieves the code from the script component using whatever is the current data source as a base.  It does not remember the name of the script component from which code should be retrieved so it can try to locate that script component at the moment of the call.  

 

That is not done because  the intent is to have the code of the function adjust to the current database while remaining the same code.  The intent is not to change the code itself.

 

See Also

Tables

 

Add an Index to a Table

 

Command Window

 

Queries

 

Regular Expressions

 

SQL Statements

 

SQL Operators

 

SQL Functions

 

Temporary Databases