SELECT

Purpose

Use the SELECT statement to retrieve data from tables or views.

Prerequisites

  • You need to have system privilege SELECT ANY TABLE or appropriate SELECT privileges on tables or views which are referenced in the SELECT list. Either the tables or views belong to the actual user, or belongs to one of its roles or the actual user owns the object privilege SELECT on the table/view or its schema.
  • When accessing views, it is necessary that the owner of the view has appropriate SELECT privileges on the referenced objects of the view.
  • If you use a subimport, you need the appropriate rights similar to the IMPORT statement.

Syntax

subquery::=

Subquery

with_clause::=

With Clause

select_list::=

Select List

from_item::=

From Item

connect_by_clause::=

Connect By Clause

preferring_clause::=

Preferring Clause

preference_term::=

Preferring Term

group_by_clause::=

Group By Clause

cube_rollup_clause::=

Cube Roll UP Clause

grouping_sets_clause::=

Grouping Sets Clause

grouping_expression_list::=

Grouping Expression List

order_by_clause::=

Order By Clause

limit_clause::=

Limit Clause

join_clause::=

Join Clause

inner_outer_clause::=

Inner Outer Clause

subimport::=

values_table::=

Value Table

Usage Notes

  • You can calculate scalar expressions by omitting the FROM clause, for example, SELECT 'abc'.
  • Using the WITH clause, you can define common table expressions (CTEs). These are temporary views which are valid only during the execution of a subquery.
  • In case of DISTINCT, identical rows will be eliminated. If you use the keyword ALL (default), all rows will be present in the result table.
  • Source tables and views are defined in the FROM clause. Through the values_table, you can easily define static tables, for example by (VALUES (1, TRUE), (2, FALSE), (3, NULL)) AS t(i, b) a table with two columns and three rows is specified.
  • The SELECT list defines the columns of the result table. If * is used, then all columns will be listed. t.* can be used to list all columns of the table or view t or the object with the table alias t.
  • In complex expressions within the SELECT list, the usage of column aliases can be very useful. Directly, such aliases can only be referenced inside the ORDER BY clause and the QUALIFY filter condition. But you can also reference those aliases indirectly through the keyword LOCAL within the other clauses (WHERE, GROUP BY, HAVING) and even in the SELECT list. An example for the indirect referencing: SELECT ABS(x) AS x FROM t WHERE local.x>10. Moreover, column aliases define the column names of the result table.
  • The WHERE clause can be used to restrict the result by certain filter conditions.
  • Equality join conditions among tables can be specified within the WHERE clause by using the = operator. If you want to define an outer condition, add (+) after the outer-expression.
  • The CONNECT BY clause can be used to define hierarchical conditions. It is evaluated before the WHERE conditions - except the join conditions on tables which are referenced in the CONNECT BY clause.
    The following elements are relevant for the CONNECT BY clause:
  • START WITH By this condition you can specify the set of root nodes in the graph.
    condition(s) You can define several conditions. A hierarchical connection between father and son rows can be defined by the keyword PRIOR (for example, PRIOR employee_id = manager_id). If you don't specify such a PRIOR condition, the cross product will be computed. Thus, the statement SELECT LEVEL FROM dual CONNECT BY LEVEL<=100 results in a table with 100 rows, because 100 cross products were calculated for the table dual.
    NOCYCLE If you specify this option, the query also returns results if there exists a cycle. In this case the expansion will be terminated when a cycle is detected.

    The following functions and operators can be used in the SELECT list and the WHERE conditions to qualify the results of a hierarchical query.

    SYS_CONNECT_BY_PATH (expr, char) Returns a string containing the full path from the root node to the current node, containing the values for expr and separated by char.
    LEVEL Returns the hierarchy level of a row, that is, 1 for the root node, 2 for it's direct sons, and so on.
    PRIOR

    It references the row's parent row. By that, you can define the father-son condition. But furthermore, you can use this operator to access the values of the parent row. Hence, the following two CONNECT BY conditions are equivalent:

    • PRIOR employee_id = manager_id AND PRIOR employee_ id=10.
    • PRIOR employee_id = manager_id AND manager_ id=10.
    CONNECT_BY_ROOT Instead of a row's value, the corresponding value of the root node is used (for example, CONNECT_BY_ROOT last_name would be evaluated by the name of the highest manager of an employee if the condition PRIOR employee_id = manager_id was defined in the CONNECT BY clause).
    CONNECT_BY_ISLEAF This expression returns 1 if a row is a leaf within the tree (that is, it has no sons), otherwise 0.
    CONNECT_BY_ISCYCLE

    Returns whether the current row causes a cycle. In the path, such a row will occur exactly twice. This expression can only be used in combination with the NOCYCLE option.

    This behavior is different to Oracle's implementation where the parent node is returned instead of the connection point of the cycle.

    The function LEVEL and the operator PRIOR can also be used within the CONNECT BY clause.

  • The PREFERRING clause defines a Skyline preference term. For more information. refer to the Skyline section.
  • The GROUP BY clause defines groups of rows which will be aggregated in the result table. Inside the SELECT list, you can use aggregate functions. Using a numerical value x (position) results in aggregating the result table by the x-th column. For example, GROUP BY 1 is equivalent to GROUP BY the first column in the SELECT list. If GROUP BY is used, all SELECT list elements have to be aggregated except those which define the grouping keys.
  • CUBE, ROLLUP, and GROUPING SETS are extensions of the GROUP BY clause for calculating superaggregates. Those are hierarchical aggregation levels like, for example, partial sums for days, months and years and can be computed within one single GROUP BY statement instead of using a UNION of several temporary results.
  • You can distinguish between regular result rows (normal aggregation on the deepest level) and superaggregate rows. The total of all arguments results in the normal aggregation, the subsets result in the corresponding superaggregates. You can discern the result row types by using the function GROUPING[_ID].
  • CUBE

    Calculates the aggregation levels for all possible combinations of the arguments (2n combinations).

    Example: By CUBE(countries, products) you can sum up all subtotal revenues of all country/product pairs (regular result rows), but additionally the subtotals of each country, the subtotals of each product and the total sum. This expression is equivalent to GROUPING SETS ((countries, products), (countries), (products), ()).

    ROLLUP

    Calculates the aggregation levels for the first n, n-1, n-2, ... 0 arguments (overall n+1 combinations). The last level corresponds to the grand total.

    Example: By ROLLUP(year,month,day) you can sum up all revenues of all single date (regular result rows), but additionally for each month of year, for each year and the grand total (superaggregate rows).

    This expression is equivalent to GROUPING SETS ((year,month,day), (year,month), (year), ()).

    GROUPING SETS Calculates the aggregation levels for the specified combinations. CUBE and ROLLUP are special forms of GROUPING SETS and simplify the notation.
    () Is similar to GROUPING SETS () and aggregates the whole table as one single group.

    If multiple hierarchical groupings are specified, separated by a comma, then the result is the set of all combinations of partial groupings (cross product). For example, the expression ROLLUP(a,b),ROLLUP(x,y) results in overall 9 combinations. Starting with the subsets (a,b), (a), () and (x,y), (x), () you get the following combinations: (a,b,x,y), (a,b,x), (a,b), (a,x,y), (a,x), (a), (x,y), (x),().

  • By using the HAVING clause you can restrict the number of groups.
  • QUALIFY can be used to restrict the result by certain filter conditions in a way similar to WHERE or HAVING clauses, but its filter condition may contain or reference Analytic Functions.
  • If the condition does not contain an analytic function, it is preferable to use WHERE or HAVING clauses instead of QUALIFY, because the filter for QUALIFY is evaluated later.
  • QUALIFY is available starting with Exasol version 7.0.3.

  • The result table can be sorted by specifying the ORDER BY clause. Using a numerical value x (position) results in sorting the result table by the x-th column, for example, ORDER BY 1.
  • Options of the ORBER BY clause: 

    • ASC (Default) means ascending, DESC means descending sorting.
    • The NULLS LAST (Default) and NULLS FIRST option can be used to determine whether NULL values are sorted at the end or the beginning.
    • String data is sorted by its binary representation.

  • The number of result rows can be restricted by defining the LIMIT clause. The optional offset can only be used in combination with ORDER BY. LIMIT is not allowed within correlated subqueries. LIMIT with OFFSET is not allowed in aggregated SELECT statements without GROUP BY.
  • By using the subimport clause, you can integrate the import of external data sources directly in your query.
    • For information about the usage of external data sources and their options, refer to the description of the IMPORT statement in Section Manipulation of Database (DML) section. It is highly recommended to explicitly specify the target column types (see example below). Otherwise, the column names and data types are chosen in a generic way. For importing files, these types are mandatory.
    • Local files cannot be imported directly within queries.
    • By creating a view, external data sources can be integrated with Exasol as a sort of external tables.
    • Local filter conditions on such imports are not propagated to the source databases. However, you can achieve that by using the STATEMENT option.
  • SELECT statements can be combined using the Table Operators - UNION [ALL], INTERSECT, MINUS.
  • The SELECT queries can be directly returned from the query cache in case the syntactically equivalent query was already executed before. For more information, refer to the command ALTER SYSTEM.

Examples

The following examples relate to these tables:

SELECT * FROM customers;

C_ID        NAME
----------  ----------
         1  smith
         2  jackson

SELECT * FROM sales;

S_ID        C_ID        PRICE        STORE
----------  ----------  -----------  ----------
         1           1       199.99  MUNICH  
         2           1         9.99  TOKYO
         3           2        50.00  MUNICH
         4           1       643.59  TOKYO
         5           2         2.99  MUNICH
         6           2      1516.78  NEW YORK

SELECT store, SUM(price) AS volume FROM sales
GROUP BY store ORDER BY store DESC;

STORE       VOLUME
----------  --------
TOKYO       653.58
NEW YORK    1516.78
MUNICH      252.98

SELECT name, SUM(price) AS volume FROM
customers JOIN sales USING (c_id)
GROUP BY name ORDER BY name;

NAME        VOLUME
----------  -------
jackson     1569.77
smith       853.57

WITH tmp_view AS (SELECT name, price, store FROM customers, sales
                  WHERE customers.c_id=sales.c_id)
SELECT sum(price) AS volume, name, store FROM tmp_view
GROUP BY GROUPING SETS (name,store,());

VOLUME                  NAME        STORE
----------------------  ----------  ----------
               1569.77  jackson
                853.57  smith
                653.58              TOKYO
                252.98              MUNICH
               1516.78              NEW YORK
               2423.34

SELECT * FROM (IMPORT INTO (v VARCHAR(1))
               FROM EXA AT my_connection
               TABLE sys.dual);

SELECT last_name, employee_id id, manager_id mgr_id,
       CONNECT_BY_ISLEAF leaf, LEVEL,
       LPAD(' ', 2*LEVEL-1)||SYS_CONNECT_BY_PATH(last_name, '/') "PATH"   
    FROM employees
    CONNECT BY PRIOR employee_id = manager_id
    START WITH last_name = 'Clark'
    ORDER BY employee_id;

LAST_NAME   ID  MGR_ID  LEAF  LEVEL  PATH
----------  --  ------  ----  -----  ----------------
Clark       10  9       0     1      /Clark
Sandler     11  10      1     2      /Clark/Sandler
Smith       12  10      0     2      /Clark/Smith
Jackson     13  10      0     2      /Clark/Jackson
Taylor      14  10      1     2      /Clark/Taylor
Brown       15  12      1     3      /Clark/Smith/Brown
Jones       16  12      1     3      /Clark/Smith/Jones
Popp        17  12      1     3      /Clark/Smith/Popp
Williams    18  13      1     3      /Clark/Jackson/Williams
Johnson     19  13      1     3      /Clark/Jackson/Johnson