ORDER BY

The ORDER BY clause can be used within the EXECUTE, SELECT and VALUES statements to list records within a table within some desired order, or to order records for use within an expression that uses FETCH or other clauses.   

 

ORDER BY is an important clause because records in tables should never be assumed to be stored in some specific order.  If we want to see or use a table in some specific order we must always explicitly specify that ordering using the ORDER BY clause.

 

ORDER BY <field> <options>, <field> <options>, ....

 

<field> -  When used with SELECT each <field> can be an expression.   When used with EXECUTE and VALUES each <field> must be a field name.

 

<options> - A list of options used by b-tree indexes, which can be any number of options or empty.   See the discussion in the SQL Statements topic.  Available options include:

 

ASC or DESC - sort in ascending (default) or descending order.

CASE or NOCASE - case-sensitive or case-insensitive sort, only applies to text fields,

ACCENT or NOACCENT - consider or ignore accents on letters or not. Nvarchar (Unicode) text fields.

SYMBOLS or NOSYMBOLS - consider or ignore  symbols or not. Nvarchar (Unicode) text fields.

KANATYPE or NOKANATYPE - consider or ignore kana type for Asian languages. The default is KANATYPE.  Nvarchar (Unicode) text fields.

WIDTH or NOWIDTH - consider or ignore character widths for Asian languages. The default is WIDTH. Nvarchar (Unicode) text fields.

WORDSORT or NOWORDSORT - ignore or consider space between words. The default is NOWORDSORT (space between words matters).  Nvarchar (Unicode) text fields.

 

Inapplicable options are ignored, for example, specifying CASE for a numeric field.

 

SELECT Job, Name

  FROM Invoices

  ORDER BY Job DESC, Name NOCASE;

 

The query above sorts a table using the Job field and the Name field, first sorting in descending order by Job, so that a job value of Programmer comes before Accountant, and then sorting the Name field in default, ascending order without considering upper or lower case characters used in the spelling of the name.

 

Note that the use of the word CASE in the above options is different than the use of the word CASE in a CASE expression.   Manifold's query engine knows which is intended based on how the word CASE is used.  

 

Tables are not ordered.  Therefore, it is a mistake to use ORDER BY when writing records into a table using a SELECT ... INTO statement.   That will throw an error.

Examples

Manifold's ORDER BY clause is similar to other full-featured SQL implementations, such as those discussed in Chris Fehily's SQL: Visual Quickstart Guide.  The following comments refer to specific examples in that book, with adjustments to Fehily's examples as required for Manifold SQL:

Multiple Column Sorting

We can sort and sub-sort using multiple columns:

 

SELECT au_fname, au_lname, city, state

FROM authors

ORDER BY state, city DESC;

Sorting by Relative Column Positions

The SQL standard and some SQL implementations allow referring  to columns used for sorting by their relative position in the SELECT list.   This is an unpleasant habit which Manifold does not allow.   The SQL below works in many SQL implementations but in Manifold it is wrong.

 

SELECT au_fname, au_lname, city, state

FROM authors

ORDER BY 4, 2 DESC;

 

The above example lis listing 4.13 in the SQL: Visual Quickstart Guide book.  The intent, which works in some SQL implementations, is that the 4 in the ORDER BY is a shorthand way of referring to the state column, and the 2 is a shorthand way of referring to the au_lname column.    We can run that query in Manifold and it will not throw an error, because the 4 and the 2 will be interpreted as literal numbers, but the results table will not be sorted by those columns.  

 

Manifold deliberately disallows using integer numbers like the above, to refer by implication to columns in the SELECT list for two reasons:  First, that leads to queries that can be confusing and difficult to read.   Second, it seems daft to  parse a number, like the integer 4, as a numeric literal, except in some "secret handshake" cases like the ORDER BY clause where the user must remember that this is an exception to the rule and in this case the number 4 means the fourth column of the SELECT list.  

Sorting and NULLs

Nulls come last, that is are considered the highest values, in Manifold when sorting ascending:

 

SELECT pub_id, state, country

FROM publishers

ORDER BY state ASC;

 

Any records with a NULL in the state column will appear last in the results table for the above query.   This matches default behavior in Oracle, DB2, and PostgreSQL.  Microsoft Access, Microsoft SQL Server, and MySQL are different, with NULL values appearing first, that is, are considered the lowest value when sorting ascending.

Sorting by Unselected Columns is OK

We can sort using columns that do not appear in the SELECT list.  

 

SELECT city, state

FROM authors

ORDER BY zip;

 

The above sorts the results table by zip, even though that column does not appear in the SELECT list.  That can be confusing, since the table records may appear to a user to be in unsorted order.

Sorting using Conditional Logic with CASE

We can use CASE to change the column used for sorting depending on the value of yet a different column for that record:

 

SELECT title_id, type, price, sales

FROM titles

ORDER BY CASE WHEN type = 'history'

  THEN price ELSE sales END;

 

When the value of the type column is history, the price field is used for sorting and otherwise the sales field is used for sorting.

Sorting using a Computed Expression

We can sort using a computed expression, either as built into a computed field or as expressed on the fly within the query, such as the example below:

 

SELECT title_id, price, sales, price * sales AS revenue

FROM titles

ORDER BY price * sales DESC;

 

We can avoid duplicating the computed expression by using a nested query:

 

SELECT *

FROM (SELECT title_id, price, sales, price * sales AS revenue

        FROM titles)

ORDER BY revenue DESC;

 

The use of the nested query also allows using the alias in the ORDER BY clause, as discussed below.

Indexes

Fields often used for sorting should have indexes, so sorting can work faster.   Manifold's ORDER BY can and does use indexes whenever possible, both for data within .map files and also for data within other data sources.  If a data source specifies an index that matches the criteria in ORDER BY the Manifold query engine will use that index and will avoid fetching all data into intermediate storage and then sorting that fetched data locally.

 

Matching is significantly wider than just one-to-one. For example, if a sorted table provides an index on fields a and b, and ORDER BY requests sorting on a, then sorting will use the index despite the index using two fields instead of one. In  the same table, if ORDER BY requests sorting on a in descending order and then sorting on b also in descending order then sorting also will use the index as well.  CASE and NOCASE options must be matched exactly.

 

However, if ORDER BY requests sorting on a in ascending order and then sorting on b in descending order in that case sorting will not be able to use the index provided by the table and will have to create a temporary index locally. If ORDER BY requests sorting on b sorting will not be able to use the index provided by the table. If ORDER BY requests sorting on a+b sorting will not be able to use the index provided by the table.

Avoid Aliases in ORDER BY

Manifold takes a much more granular approach than non-parallel systems to running parts of queries, with stricter boundaries between parts given that different parts could possibly be run in parallel or on different data sources.    As a result, ORDER BY and SELECT are in different nodes so ORDER BY cannot sneak any peeks into an alias list for SELECT. That means Manifold in general does not allow an ORDER BY clause to refer to aliases assigned in the SELECT list, for example, as in forms such as

 

SELECT [X (I)] AS X, [Y (I)] AS Y

FROM [Drawing] ORDER BY X, Y;

 

Instead, aliases in general must be replaced with the names of the original fields, as in

 

SELECT [X (I)] AS X, [Y (I)] AS Y

FROM [Drawing] ORDER BY [X (I)], [Y (I)];  

 

If an alias is used to name an expression, in general it must replaced with a copy of an expression.   There are ways to avoid duplicating an expression, of course, with the most obvious way to use nested queries, such as the example given above for computed expressions:

 

SELECT *

FROM (SELECT title_id, price, sales, price * sales AS revenue

        FROM titles)

ORDER BY revenue DESC;

 

Non-parallel SQL implementations will normally allow using aliases in the ORDER BY clause.   For example, query 4.16 in the SQL: Visual Quickstart Guide book. illustrates the use of AS to assign aliases to column names in the SELECT list, and then the use of those aliases to refer to the columns in ORDER BY:

 

SELECT au_fname AS [First Name], au_lname AS [Last Name], state

FROM authors

ORDER BY state, [Last Name], [First Name];

 

The query above has been adjusted to use Manifold SQL syntax, using square brackets [ ] to surround names that contain a space character.   To adjust that to use a nested query, we could write:

 

SELECT *

FROM (SELECT au_fname AS [First Name], au_lname AS [Last Name], state

        FROM authors)

ORDER BY state, [Last Name], [First Name];

 

The nested query works because the inner SELECT is evaluated first, making the alias list it generates available for the ORDER BY in an unambiguous way.

 

The discussion above uses the qualifier in general because Manifold will correctly utilize aliases in an ORDER BY clause in cases where the use of those alias is unambiguous.   We still should avoid using aliases in the ORDER BY clause because it can be difficult, especially for beginners, to tell when such use will work.   

Examples

Consider the Invoices table seen at right.   Each record has a Name, a Job classification, and a Payment.   Names and job classifications include some repetition, for example, with the name Alex occurring in three records, twice with Electrical jobs and once with a Carpetry job.

 

A simple SELECT query:

 

SELECT Job, Name

  FROM Invoices

  ORDER BY Job;

 

The results table lists all records in the order specified by ORDER BY, that is, alphabetically by Job.  Within each Job category records are unordered.   For example, the first four records are all Carpentry records but the names are not in any order: Tina, Alex, Tom, and Tina.

 

We can order the table by two fields:

 

SELECT Job, Name

  FROM Invoices

  ORDER BY Job, Name;

 

 

The results table lists all records ordered alphabetically first by Job and then within each Job alphabetically by Name.    For example, the first four records are all Carpentry records but the names are now in alphabetical order as well: Alex, Tina, Tina, and Tom.

 

Options such as DESC accompany the ordering field:

 

SELECT Job, Name

  FROM Invoices

  ORDER BY Job DESC, Name DESC;

 

 

The table now is ordered alphabetically in descending (Z first) order by Job and then within each Job in descending order by Name.

 

We can use FETCH to  limit the number of records returned, to some desired value, such as 5:

 

SELECT Job, Name

  FROM Invoices

  ORDER BY Job DESC, Name DESC

  FETCH 5;

 

Note that FETCH comes last in the query.

 

 

The results table thus generated has the same sort ordering, but is limited to only five records.   We might use FETCH with ORDER BY if we would like to get, say, the top few hundred records in a very large table that has many millions of records.  

 

Notes

Fehily recommended - Manifold strongly recommends Chris Fehily's books on SQL, such as SQL: Visual QuickStart Guide and others. Some examples from Fehily will require slight adaptation for use in Manifold or other GIS-oriented SQL implementations, but learning SQL from a Fehily book is a great way to get started.

 

No brackets - In this topic to reduce visual clutter we do not use square [ ] brackets around simple field and table names.    Square brackets are optional if the name is unambiguous.

 

Before ORDER BY - THREADS and BATCH should appear in a query before ORDER BY: both of  these options continue working with ORDER BY. If the expressions in the SELECT list take a long time to compute then using more than one thread will very likely significantly shorten that time whether or not ORDER BY is used.

 

ORDER BY comes last, usually -  ORDER BY is normally the last clause in a SELECT statement, unless we are using FETCH.   FETCH comes after an ORDER BY.   

 

Not a part of SELECT ... INTO - If we are just beginning work with enterprise class databases it is easy to forget that, unlike consumer class personal file managers, records within tables in high end databases should never be assumed to be stored in some particular order.   If we want to see records in some particular order we always must do so using an ORDER BY clause.   

 

A consequence of records being stored in tables in no particular order is that there is no role for ORDER BY in a  SELECT ... INTO clause.  Suppose, for example, we have a table of Customers with a Name field.   For convenience, we would that table to always open by default with a particular order, so we attempt the following incorrect query:

 

SELECT * INTO [Customers by Name] FROM [Customers] ORDER BY [NAME];

 

That will not work because ORDER BY has no role in how records are stored in tables.   If we want to see the Customers table ordered by Name, we could create a query called [Customers by Name] that contained the SQL:

 

SELECT * FROM [Customers] ORDER BY [NAME];

 

We can then use the query wherever we want to see a sorted table.

 

Discussion:  Why are records stored in tables without a specific sort order?  Primarily for performance and flexibility when working with larger or more complex data sets.  Storing records in sort order means constantly updating a database whenever records are edited, inserted or deleted, which very rapidly becomes expensive and inconvenient.   Changing just the first letter of a record means moving it in sort order if the table stores records in sort order, which can involve a lot of work in a very large table and, while the table is being reorganized, introduce complications and delays for other processes and people that may want to simultaneously access that table.  

 

There are always technical means that, at a cost, can resolve such complexities.  For simple, small databases used by only one person in a non-parallel way such means are not particularly costly or complex.  But for databases that are designed for more sophisticated applications it turns out that a more effective approach is to not store records in sort order but instead to store records in whatever storage order is most effective for the internal machinery and then to always generate a desired, sorted view of the data on the fly using ORDER BY.  

 

Ambiguous forms not supported - Some traditional, but ambiguous forms of ORDER BY used in various databases are not allowed by Manifold.    For example,

 

SELECT * FROM t ORDER BY 1;

 

...is an acceptable construction in some systems meaning to take all data from table t and to sort it by the first field.  However, that syntax is not supported by Manifold since it is ambiguous given the evaluation of expressions in the form of:

 

SELECT * FROM t ORDER BY 1+1;

 

The ambiguity arises because the "order by 1" case sorts by a field while the "order by 1+1" case would sort by a constant expression, that is, wouldn't sort at all.

 

See Also

Queries

 

Command Window

 

Query Builder

 

SQL Statements

 

COLLECT

 

EXECUTE

 

FETCH

 

GROUP BY

 

VALUES