The Manifold query engine supports declaring and calling functions, including script functions, functions in external files and compiled functions as .NET assemblies. This topic discusses declaring and calling functions. For notes on functions built into the Manifold query engine, see the SQL Functions topic.
The Manifold query engine allows declaring and calling functions within queries given the following basic characteristics:
Names - Function names must be unique and cannot be reserved words.
Single returned value - A function can take zero or more parameters and must return a single resulting value.
No recursion - A function can call another function but a function can not call itself.
Unlimited declarations - There is no limit on the number of functions declared in the same query.
Must be declared - A function must be declared before it can be used. There is no way to undeclare a function.
Functions can take and return tables - A function in a query can take tables as parameters and can return tables. Returned tables can have different schemas.
In its simplest form a function is just a named expression. There can be many reasons for making an expression into a function but one of the most common reasons is to provide a descriptive name for some action. Another common reason is to make sure we use the exact same formula or logic in different places in a query, to avoid keyboarding errors.
If we need to use an expression several times in a query, giving it a simple, short name by declaring it as a function allows us to then utilize that expression without risk of errors introduced by slight variations in the expression we write. Declaring a long expression as a function with a short name also can save a lot of keyboarding if we use that expression more than once in a query.
A query function is a function returning a table that can be either a table name, a table parameter, a CALL, or within parentheses a SELECT, EXECUTE or VALUES clause. Some examples:
Example - a table name:
FUNCTION f() TABLE AS mfd_root END;
SELECT * FROM CALL f();
Example - a table parameter:
FUNCTION f(@p TABLE) TABLE AS @p END;
SELECT * FROM CALL f(mfd_meta);
Example - a CALL:
FUNCTION f() TABLE AS CALL ValueSequence(0, 10, 1) END;
SELECT * FROM CALL f();
Example - a TABLE statement within parentheses:
FUNCTION f() TABLE AS (TABLE CALL ValueSequence(0, 10, 2)) END;
SELECT * FROM CALL f();
A function can use parameters. Parameters use the @ character as a prefix. The names of function parameters can be reserved words, because the @ prefix removes the ambiguity. All the same, to avoid confusion it is wise to avoid reserved words as parameter names.
Example - Function parameter:
FUNCTION f(@p INT32) INT32 AS @p+1 END;
VALUES (f(10)), (f(11));
Copy and paste the above into a Command Window and press ! Run to see what it does.
All references to parameters inside a function body are resolved to local parameters. If a function wants to use a query parameter passed via EXECUTE, that parameter has to be transferred explicitly, under the same or different name.
Field values passed to computed fields and constraints do not use @ as a prefix.
Parameter names can contain spaces and special characters: @[parameter name] or @[john@sample.com] are OK.
The query parser allows whitespace between the @ prefix and a parameter name: @ abc or @abc are both OK.
Example - Function parameters:
FUNCTION combine(@p NVARCHAR, @q NVARCHAR) NVARCHAR AS
@p & ': ' & @q END;
SELECT combine(name, type) FROM mfd_root;
Copy and paste the above into a Command Window and press ! Run to see what it does.
Using the FUNCTION statement with the name of a previously defined function will redefine that function. The redefined function can have a different prototype from the previously defined one, For example, we can redefine a function that takes no arguments and returns INT32 into a function that takes two NVARCHAR arguments and returns NVARCHAR.
Example: Declaration and use of a basic math function:
FUNCTION Square(@x FLOAT64) FLOAT64 AS @x * @x END;
SELECT [value] AS [result], Square([value]) AS [square]
FROM CALL ValueSequence(0, 10, 1);
In the above example, as well as in following examples in this topic that use ValueSequence, the name of the column returned by ValueSequence is Value, so in the SELECT statement we use [value] (case is not significant in SQL column names) to refer to what ValueSequence returns.
Example: A Function that Returns a Table:
The following displays a line from the mfd_root table where the value of the field mfd_id is 2.
FUNCTION f(@p INT32) TABLE AS (SELECT * FROM mfd_root
WHERE mfd_id=@p) END;
TABLE CALL f(2);
To display the value of the field name from that line we can write:
FUNCTION f(@p INT32) TABLE AS (SELECT * FROM mfd_root
WHERE mfd_id=@p) END;
SELECT name FROM CALL f(2);
Example: A Function that Takes a Table as a Parameter:
To display values from the field mfd_id in the mfd_root table:
FUNCTION f(@p TABLE) TABLE AS
(SELECT mfd_id FROM @p) END;
TABLE CALL f(mfd_root);
To display values from the field mfd_id in the mfd_meta table we could use:
FUNCTION f(@p TABLE) TABLE AS
(SELECT mfd_id FROM @p) END;
TABLE CALL f(mfd_meta);
Example: Passing a table to a function:
FUNCTION f(@t TABLE) NVARCHAR AS
(SELECT value FROM @t WHERE property = 'Folder') END;
SELECT name, f((COLLECT property, value))
FROM mfd_meta GROUP BY name;
...another example...
FUNCTION f(@t TABLE) TABLE AS @t END;
SELECT SPLIT CALL f((COLLECT [Name]))
FROM [mfd_meta]
GROUP BY TRUE;
Example: Returned Tables with Different Schemas:
Using a function to display part of a table that has three fields, mfd_root:
FUNCTION f(@p TABLE, @x INT32) TABLE AS
(SELECT * FROM @p WHERE mfd_id>=@x) END;
TABLE CALL f(mfd_root, 2);
Using the same function to display part of a table that has four fields, mfd_meta:
FUNCTION f(@p TABLE, @x INT32) TABLE AS
(SELECT * FROM @p WHERE mfd_id>=@x) END;
TABLE CALL f(mfd_meta, 2);
Functions can be much more than simple expressions: they can be written using a script, in which case they are called script functions. This opens the door to doing all types of custom analysis. Script functions used in SQL can take table arguments, which allows building custom aggregates. Script functions used in SQL also can return table results, which allows building custom splitters.
When script functions return table results, the Manifold query engine has to know in advance the structure of the returned table, that is, the schema. To accomplish this, the query engine uses a secondary script function which must return an empty table with the structure that will be used by the function that returns the actual data. The name of that secondary script function is specified via the ENTRYSCHEMA keyword, by default set to <entry>Schema. If the original script function takes other tables as arguments, the secondary script function is passed model tables so it can decide what the output table is going to be based on what the argument tables are.
VBScript Examples
The following VBScript script provides an example of a function for queries. We could omit Main, but it is useful to still have it and either do nothing as in this example or do a test call:
'VBScript
Function Square(p)
Square = p*p
End Function
Sub Main
End Sub
If the script above is called vbs, the following query calls the function Square:
FUNCTION Square(@x FLOAT64) FLOAT64 AS SCRIPT [vbs];
SELECT [value] AS [result],
Square([value]) AS [f]
FROM CALL ValueSequence(0, 10, 1);
For each parameter value, there is an implicit CAST to the type declared by the FUNCTION statement. There is also an implicit CAST to the type that is the result of the function. If we run the following query example we can see how calling Square() as declared will discard the fractional part of the argument:
FUNCTION Square(@x INT32) FLOAT64 AS SCRIPT [vbs];
SELECT [value]+0.5 AS [result],
Square([value]+0.5) AS [f]
FROM CALL ValueSequence(0, 10, 1);
Functions in .NET Scripts
In .NET scripts function names are typically composite since every function is a member of some .NET class. To get to these functions, specify the complete name of a function using the ENTRY clause.
For example, suppose we have a C# script:
// C#
class Script
{
static int Square(int x) { return x*x; }
static void Main() { }
}
If the script above is called cs, the following query in a Command Window calls the function Square:
FUNCTION Square(@x INT32) INT32
AS SCRIPT [cs] ENTRY 'Script.Square';
SELECT [value] AS [result], Square([value])
AS [f] FROM CALL ValueSequence(0, 10, 1);
Same Function, Different Names
Manifold SQL allows binding the same script function with different names.
For COM script engines that convert types on the fly, these bindings can use different types for parameters and / or return value, for example:
FUNCTION SquareDD(@x FLOAT64) FLOAT64
AS SCRIPT [vbs] ENTRY 'Square';
FUNCTION SquareID(@x INT32) FLOAT64
AS SCRIPT [vbs] ENTRY 'Square';
SELECT [value]+0.5 AS [result],
SquareDD([value]+0.5) AS [dd],
SquareID([value]+0.5) AS [id]
FROM CALL ValueSequence(0, 10, 1);
.NET script engines differ from COM script engines (like the COM script engine for VBScript) in that .NET is strongly-typed and can have multiple functions with the same name but with different parameter types. Keeping that in mind, Manifold SQL determines what exact .NET function to use based on parameter types declared in the FUNCTION statement as well.
An example with function overloads:
Suppose we have a C# script:
// C#
class Script
{
static string Format() { return "no parameters"; }
static string Format(int x) { return x.ToString(); }
static string Format(double x, double y) { return string.Format("{0}:{1}", x, y); }
static void Main() { }
}
If the script above is called cs2, the following query in a Command Window calls the script function Format in three different functions:
FUNCTION Format()
NVARCHAR AS SCRIPT [cs2] ENTRY 'Script.Format';
FUNCTION FormatI(@x INT32)
NVARCHAR AS SCRIPT [cs2] ENTRY 'Script.Format';
FUNCTION FormatDD(@x FLOAT64, @y FLOAT64)
NVARCHAR AS SCRIPT [cs2] ENTRY 'Script.Format';
SELECT [value] AS [result], Format() AS [f],
FormatI([value]) AS [fi], FormatDD([value],
[value]+100) AS [fdd] FROM CALL ValueSequence(0, 10, 1);
Note: It's true that some .NET languages might give the appearance that .NET is not strongly-typed at all, but such languages are implemented using additional levels of logic on top of the same, strongly-typed .NET foundation.
Script Functions in External Files
Script functions can be stored in external files. They can also be stored in compiled form as .NET assemblies.
To use a script function stored in an external file, use the FILE 'file' construct instead of a reference to the script component in the FUNCTION statement. The external file containing the script function must be located within the installation folders, using the same logic for location of files as applies to where add-ins must be placed, such as in the same folder as the manifold.exe executable, ~\extras and so on.
Example of calling a function stored in a compiled assembly:
If we have a compiled assembly called math2.dll, the following query in a Command Window calls a function in that assembly:
FUNCTION F(@x FLOAT64) FLOAT64
AS SCRIPT FILE 'math2.dll' ENTRY 'math.Var.F';
SELECT [value] AS [result], F([value]) AS [f]
FROM CALL ValueSequence(0, 10, 1);
In the command pane of a Command Window we will execute a series of FUNCTION commands that define and then redefine functions. We also will execute SELECT statements that use those functions. The purpose of this sequence of commands is to explore how functions can be defined, used, redefined and reused.
FUNCTION f(@name NVARCHAR) NVARCHAR
AS StringToUpperCase(name) END;
SELECT name, f(@name) FROM mfd_root;
We have defined a function named f which converts a string to upper case, and used that in the SELECT statement.
FUNCTION f(@name NVARCHAR) NVARCHAR
AS '[' + @name + ']' END;
SELECT name, f(name) FROM mfd_root;
New we have redefined f so that instead of converting to upper case the function adds square brackets at left and right. We then used the new definition for f in another SELECT statement.
We continue by defining another function.
FUNCTION g(@name NVARCHAR) NVARCHAR
AS StringToUpperCase(f(@name)) END;
The above command defines a function named g, which calls the function f and converts the result from f to upper case. When run as a separate statement this definition of g returns a table with a single cell.
SELECT name, g(name) FROM mfd_root;
The SELECT statement above uses the function g. Note that each call to g in turn generates a call to function f, a function which adds square brackets.
Let's continue by redefining f:
FUNCTION f(@name NVARCHAR) NVARCHAR
AS StringReverse(@name) END;
The above redefines f so that instead of adding square brackets it reverses the string.
SELECT name, f(name) FROM mfd_root;
We can verify that f reverses strings by running a SELECT statement using f.
SELECT name, g(name) FROM mfd_root;
If we now run a SELECT statement using function g we see that g continues to use the old definition of f in which the function f adds square brackets. That old definition of f is no longer associated with any name but g can still use that old definition, because g has already resolved that old definition for f.
Note that g must use the old definition of f until g itself is redefined, because g was defined using the prototype for the old definition of f. Given that the new definition for f could have a different prototype (the new definition could have taken zero arguments, for example), in order for g to use the new definition of f we must redefine g after we have redefined f.
One way to think of this is by analogy to how cache works: a function that is defined in terms of some other function can be imagined to have cached what it understands about that other function. If that other function is redefined the cached understanding of it will linger within any other functions that use it until they too are redefined to use a new understanding, that is, the new definition, of that other function.
FUNCTION f(@name NVARCHAR) NVARCHAR
AS StringReverse(g(@name)) END;
Now we have redefined f to call g (which still uses the old definition of f) and to then reverse the result.
SELECT name, f(name) FROM mfd_root;
Surprisingly, this works fine when f is used in a SELECT statement.
Note that the last version of f as redefined above calls g which in turn calls the definition of f as it remembers it, g's memory being the old version of f. The call stack, therefore, with function versions in brackets, is now:
f(4, reverse) ->
g(1, convert to upper case) ->
f(2, add brackets)
Let us do one more redefinition of f:
FUNCTION f(@name NVARCHAR) NVARCHAR
AS StringTrim(StringReverse(f(@name)), '[]') END;
SELECT name, f(name) FROM mfd_root;
Finally, we now have redefined f to call its previous definition. The body of the new f can see the previous definition of f, so this works as well in the SELECT statement that calls f. The call stack now is:
f(5, reverse and trim) ->
f(4, reverse) ->
g(1, convert to upper case) ->
f(2, add brackets)
While it is wonderful that we can redefine functions, and it is certainly exotic that a function can call an old, remembered version of itself, it should also be obvious that deliberately writing the above into working code is a formula for confusion. This example is more of an illustration of what not to do, in particular, not redefining functions that are called by other functions
Functions and Data Sources
Consider the following code, run in sequence in a Command Window in a newly-created, empty .map project.
CREATE TABLE t (a INT32);
INSERT INTO t (a) VALUES (1), (2), (3);
CREATE DATASOURCE d (PROPERTY 'Type' 'manifold');
USE CHILD d;
CREATE TABLE t (a INT32, b NVARCHAR);
INSERT INTO t (a, b) VALUES (2, 'abc'), (3, 'xyz');
This creates a table named 't' with some values, then creates a data source named 'd' and then creates a different table named 't' on that data source.
Now, continuing on in the same command window session (enter the SQL below into the Command Window, highlight it, and then run just that portion by pressing Alt-Enter), we create a function that selects data from 't':
FUNCTION f(@v INT32) TABLE
AS (SELECT * FROM t WHERE a>=@v) END;
If we run this function while the current data source is 'd':
TABLE CALL f(2);
...we get values from the table 't' in that data source 'd'.
But if we run the same function while the current data source (see the Temporary Databases topic) is the original .map file:
USE ROOT;
TABLE CALL f(2);
...we get values from the table 't' in the original .map file.
That is, functions do not 'remember' data sources which were in use when they were defined. When we call a function, it runs in the context of the current data source.
The only quasi-exception are functions defined in scripts. When we define such a function using FUNCTION ... AS SCRIPT <script>, the query engine immediately retrieves the code from the script component using whatever is the current data source as a base. It does not remember the name of the script component from which code should be retrieved so it can try to locate that script component at the moment of the call.
That is not done because the intent is to have the code of the function adjust to the current database while remaining the same code. The intent is not to change the code itself.
Functions are their own world - Functions are their own world when it comes to parameters. All references to parameters inside a function body are resolved to local parameters, that is, to parameters explicitly passed as arguments to that function.
VALUE @deg2rad FLOAT64 = Pi / 180;
FUNCTION rad_1(@deg FLOAT64) FLOAT64 AS @deg * @deg2rad END;
In the above query fragment, the function rad_1 would cause an error (and shown in red text to emphasize it is incorrect) because it uses @deg2rad, a parameter that has not been declared as an argument.
VALUE @deg2rad FLOAT64 = Pi / 180;
FUNCTION rad_2(@deg FLOAT64, @convert FLOAT64) FLOAT64 AS @deg * @convert END;
SELECT * FROM [TABLE] WHERE [angle] > rad_2(45, @deg2rad);
The function rad_2 will work in the query above, because all parameters used within the function are declared as arguments.