Basic Language Elements
Comments in SQL
Comments are primarily used to facilitate the clarity and maintainability of SQL scripts. They can be added at the discretion of the user anywhere in the SQL script.
Comments in Exasol can be of two types:
- Line comments begin with the characters “--” (two hyphens), indicating that the remaining part of the current line is a comment.
- Block comments are delimited by the characters “/*” and “*/”, and can span across several lines. All the characters between the delimiters are ignored.
Examples
-- This is a comment
SELECT * FROM dual;
-- Display the number of rows of table dual
SELECT count(*) FROM dual;
SELECT avg(salary) -- Average salary
FROM staff -- From table staff
WHERE age>50; -- Limitation to over 50 years old
/* Any number of comment rows can be
inserted between the delimiters
*/
SELECT * FROM dual;
SELECT /*This section is not evaluated!*/ * FROM dual;
SQL Identifier
Each database object has a unique name. Within an SQL statement, these names are referenced as SQL identifiers. These database objects include schemas, tables, views, columns, functions, users, and roles. In Exasol, SQL identifiers can have a maximum length of 128 characters. One differentiates between regular identifiers and identifiers in quotation marks.
Regular Identifiers
Regular identifiers are stated without quotation marks. As defined in the SQL standard, Exasol only allows a subset of the unicode characters. The first letter is more restricted than the following ones. Detailed lists of characters in each of the specified classes are defined in the Unicode standard.
- First Letter: All letters of unicode classes Lu (upper-case letters), LI (lower-case letters), Lt (title-case letters), Lm (modifier letters), Lo (other letters), and Nl (letter numbers).
- Following Letters: The classes of the first letter (aforementioned) and additionally all letters of unicode classes Mn (non-spacing marks), Mc (spacing combination marks), Nd (decimal numbers), Pc (connectors punctuations), Cf (formatting codes), and the unicode character U+00B7 (middle dot).
Regarding the simple ASCII character set, these rules denote that a regular identifier may start with letters of the set {a-z, A-Z} and may further contain letters of set {a-z, A-Z, 0-9,_}.
A further restriction is that a reserved word cannot be used as a regular identifier.
If you want to use characters that are prohibited for regular identifiers, you can use Delimited Identifiers. For example, if you want to use the word table as identifier, you have to quote it ("table"), since it is a reserved word.
Regular identifiers are always stored in upper case. They are not case-sensitive. Therefore, in the below example, the two identifiers ABC and abc are equal.
Regular identifiers are stored in upper case.
Examples
Identifier | Name in the DB |
---|---|
ABC | ABC |
aBc | ABC |
a123 | A123 |
Delimited Identifiers
These identifiers are names enclosed in double quotation marks. Any characters can be used within the quotation marks.
When you want to use double quotation marks in the name, you must include two double quotation marks next to one another. For example, "ab""c" indicates the name ab"c.
Note: Except in users and roles, delimited identifiers in quotation marks are always case-sensitive when stored in the database.
Examples
Identifier | Name in the DB |
---|---|
"ABC" | ABC |
"abc" | abc |
"_X_" | _X_ |
"ab""c" | ab"c |
Schema-Qualified Names
Database objects, when called via schema-qualified names, the names are separated with a period. This enables access to schema objects which are not located in the current schema.
Examples
SELECT my_column FROM my_schema.my_table;
SELECT "my_column" FROM "my_schema"."my_table";
SELECT my_schema.my_function(my_column) from my_schema.my_table;
Reserved Words
Some reserved words cannot be used as regular identifiers. For example, the keyword SELECT. It is a reserved word. If you need to create a table with the name SELECT, this is only possible if the name is within double quotation marks. Therefore, SELECT as a table name is different from a table name such as Select or seLect.
A list of reserved words in Exasol can be found in the EXA_SQL_KEYWORDS system table.
Regular Expressions
Regular expressions can be used to filter or replace strings with certain patterns. Exasol supports the Perl Compatible Regular Expressions (PCRE) dialect, which has a powerful syntax and exceeds the functionality of dialects like POSIX Basic Regular Expressions (POSIX BRE) or POSIX Extended Regular Expressions (POSIX ERE). This chapter describes the basic functionality of regular expressions in Exasol. Detailed information about the PCRE dialect can be found on www.pcre.org.
Regular expressions can be used in the following functions and predicates:
The compiled code for PCRE uses an offset of 2-bytes by default (always stored in big-endian order). These are used, for example, to link from the start of a subpattern to its alternatives and its end. The use of 2 bytes per offset limits the size of the compiled pattern to 64K.
Examples
In the description of the corresponding scalar functions and predicates, you can find examples of regular expressions within SQL statements. The following examples demonstrate the general expressions:
Regular Expressions | Meaning |
---|---|
3[47][0-9]{13} | American Express credit card number (15 digits and starting with 34 or 37) |
\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3} | IP addresses like 192.168.0.1 |
[+-]?[0-9]*\.?[0-9]+([+-]?[eE][0-9]+)? | Floating point numbers like -1.23E-6 or 40 |
(19|20)\d\d-(0[1-9]|1[012])-(0[1-9]|[12][0-9]|3[01]) | Date values with format YYYY-MM-DD, for example, 2010-12-01 (restricted to 1900-2099) |
(?i)[a-z0-9._%+-]+@[a-z0-9.-]+\.[a-z]{2,4} | Email addresses like hello.world@yahoo.com |
select regexp_substr(',match1,,match2', '(?<=,)[^,]*', 1,2); | Empty strings are not valid. For example, the result in this example would be match2. |
Note: These example expressions are simplified to provide clarity. The productive expressions, for example, identifying valid email addresses are much more complex.
Pattern Elements
You can use the following elements in regular expressions: