Basic Language Elements
Comments in SQL
Comments are primarily used to facilitate the clarity and maintainability of SQL scripts. They can be added at the discretion of the user anywhere in the SQL script.
Comments in Exasol can be of two types:
- Line comments begin with the characters
--
(two hyphens), indicating that the remaining part of the current line is a comment. - Block comments are delimited by the characters
/*
and*/
, and can span across several lines. All the characters between the delimiters are ignored.
Examples
-- This is a comment
SELECT * FROM dual;
-- Display the number of rows of table dual
SELECT count(*) FROM dual;
SELECT avg(salary) -- Average salary
FROM staff -- From table staff
WHERE age>50; -- Limitation to over 50 years old
/* Any number of comment rows can be
inserted between the delimiters
*/
SELECT * FROM dual;
SELECT /*This section is not evaluated!*/ * FROM dual;
SQL identifier
Each database object has a unique name. Within a SQL statement, object names are referenced as SQL identifiers. Database objects include schemas, tables, views, columns, functions, users, and roles. In Exasol, SQL identifiers can have a maximum length of 128 characters. Identifiers can be regular identifiers (unquoted) or delimited identifiers (quoted).
Regular Identifiers
Regular identifiers are unquoted.
As defined in the SQL standard, Exasol only allows a subset of the Unicode character set in regular identifiers. The first character of an SQL identifier is more restricted than the following ones:
- First character: All letters of Unicode classes Lu (uppercase letters), LI (lowercase letters), Lt (title case letters), Lm (modifier letters), Lo (other letters), and Nl (letter numbers).
- Following characters: All classes allowed for the first character, plus all letters of Unicode classes Mn (non-spacing marks), Mc (spacing combination marks), Nd (decimal numbers), Pc (connectors punctuations), Cf (formatting codes), and the Unicode character U+00B7 (middle dot).
For details about the characters included in each of the specified Unicode classes, see the Unicode standard.
When using the simple ASCII character set, a regular identifier may start with letters of the set {a-z, A-Z} and may further contain letters of set {a-z, A-Z, 0-9,_}.
A reserved word cannot be used as a regular identifier. For more information, see Reserved words.
Regular identifiers are not case-sensitive but are always stored in UPPERCASE letters in the database.
Examples
Identifier | Name in the database |
---|---|
ABC
|
ABC
|
aBc
|
ABC
|
a123
|
A123
|
Delimited identifiers
Delimited identifiers are quoted, which here means that they are enclosed in (delimited by) double quotation marks. All characters except a dot (.) are allowed in a delimited identifier. In the case of users and roles you are allowed to use a dot, for example, in email addresses. This is because users and roles do not conflict with schema-qualified names. For more details, see Schema qualified names.
To use double quotation marks within a name you must include two successive double quotation marks. For example, "ab""c"
indicates the name ab"c
.
Except in users and roles, delimited identifiers are case-sensitive when stored in the database.
Examples
Identifier | Name in the database |
---|---|
"ABC"
|
ABC
|
"abc"
|
abc
|
"_X_"
|
_X_
|
"ab""c"
|
ab"c
|
Schema qualified names
A schema qualified name consists of the schema name and the object name, separated by a dot. Using schema qualified names enables access to objects that are not located in the current schema. If the schema name is omitted, the current schema is implied.
Examples
SELECT my_column FROM my_schema.my_table;
SELECT "my_column" FROM "my_schema"."my_table";
SELECT my_schema.my_function(my_column) from my_schema.my_table;
Reserved words
Reserved words such as the keyword SELECT
cannot be used as regular identifiers. If you want to create a table with the name SELECT
, the name must be delimited with double quotation marks. Note that the name will then be case-sensitive, which means that SELECT
as a table name is different from a table name such as Select
or seLect
.
A list of the reserved words in Exasol can be found in the EXA_SQL_KEYWORDS system table.
Regular expressions
Regular expressions, often shortened as regex
or regexp
, can be used to filter or replace strings with certain patterns. Exasol supports the Perl Compatible Regular Expressions (PCRE) dialect, which has a powerful syntax and exceeds the functionality of dialects like POSIX Basic Regular Expressions (POSIX BRE) or POSIX Extended Regular Expressions (POSIX ERE).
The following describes the basic functionality of regular expressions in Exasol. Detailed information about the PCRE dialect can be found on www.pcre.org.
Regular expressions can be used in the following functions and predicates:
The compiled code for PCRE by default uses an offset of 2 bytes stored in big-endian order. These are used, for example, to link from the start of a subpattern to its alternatives and its end. The use of 2 bytes per offset limits the size of the compiled pattern to 64K.
Examples
In the description of the corresponding scalar functions and predicates, you can find examples of regular expressions within SQL statements. The following examples demonstrate the general expressions:
Regular Expressions | Meaning |
---|---|
3[47][0-9]{13}
|
American Express credit card number (15 digits and starting with 34 or 37) |
\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}
|
IP addresses like 192.168.0.1 |
[+-]?[0-9]*\.?[0-9]+([+-]?[eE][0-9]+)?
|
Floating point numbers like -1.23E-6 or 40 |
(19|20)\d\d-(0[1-9]|1[012])-(0[1-9]|[12][0-9]|3[01])
|
Date values with format YYYY-MM-DD, for example, 2010-12-01 (restricted to 1900-2099) |
(?i)[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,4}
|
Email addresses like jane.doe@example.com |
select regexp_substr(',match1,,match2', '(?<=,)[^,]*', 1,2);
|
Empty strings are not valid. For example, the result in this example would be match2. |
The expressions in these examples are simplified to provide clarity. Expressions used in a production environment, for example, an expression for identifying valid email addresses, are much more complex.
Pattern Elements
You can use the following elements in regular expressions: