JSON_EXTRACT
Purpose
This function extracts one or more SQL values from a valid JSON string. For each SQL value, JSON_EXTRACT requires a JSON path expression that specifies the value to extract.
Syntax
json_extract::=
json_input_clause::=
Usage Notes
- The parameter
json_exprdefines the input JSON string for the function. Exasol accepts an arbitrary expression of type(VAR)CHAR. To return an SQL value,json_exprhas to follow the syntax rules on JSON.org. JSON_EXTRACTis based on the Exasol UDF framework and behaves like a SCALAR EMITS UDF. Thus, the same restrictions apply (for example, there is no Virtual Schema pushdown). AlthoughJSON_EXTRACTrelies on the UDF framework, internally, it uses a fast native JSON processing framework.- Each
json_path_exprparameter defines a single JSON path forJSON_EXTRACT. The function accepts an arbitrary expression of type(VAR)CHARfor each path. For the path syntax rules in Exasol, refer to the JSON Path Expressions section. JSON_EXTRACTsupports an arbitrary number ofjson_path_exprparameters. For eachjson_path_expr, the EMITS clause has to define acolumn_nameand adata_typefor the JSON values it returns. The order of the columns in the EMITS clause corresponds to the order of thejson_path_exprparameters. Thus, the firstcolumn_nameanddata_typebelongs to the firstjson_path_expr.- If any
json_path_exprdoes not identify an SQL value,JSON_EXTRACTthrows an error. Thus,ERROR ON EMPTYis the default for eachjson_path_expr. - If any error occurs during the JSON processing,
JSON_EXTRACTthrows an error by default. Thus,ERROR ON ERRORis the default for eachjson_path_expr. - If the last
json_path_exprparameter is the string literal'$.error()', the error behavior ofJSON_EXTRACTchanges toLOG ON EMPTYandLOG ON ERRORfor alljson_path_exprparameters. - If
'$.error()'is present, the EMITS clause requires an extra'error column'withdata_type (VAR)CHARto log all occurring errors. It is possible to freely choose thecolumn_nameand(VAR)CHARsize for the'error column'in the EMITS clause. - To specify the
ON EMPTYandON ERRORbehavior for each path separately, refer to the JSON Path Expressions section. - If the error behavior for a path is
LOG ON EMPTYand the path does not identify an SQL value,JSON_EXTRACTreturnsNULLfor the path and writes the problem into the'error column'. - If the error behavior for a path is
LOG ON ERRORand an error occurs during the JSON processing,JSON_EXTRACTreturnsNULLfor the path and writes the problem into the'error column'. - To learn more about the different possible errors, refer to the JSON Error Handling section.
- With
JSON_EXTRACT, it is possible to extract several rows from a singlejson_expr. This makes it possible to create an extra row for each element in a JSON array. To extract several rows, thejson_path_exprhas to contain a'#'character to split up all expressions identified by a nested path into several rows. - All nested path prefixes ending with a
'#'character must be a prefix of the longest nested path.
Example
CREATE OR REPLACE TABLE json_input(id INT, json VARCHAR(2000000));
INSERT INTO json_input VALUES(1, '{"firstname" : "Ann", "surname" : "Smith", "age" : 29}');
INSERT INTO json_input VALUES(2, '{"firstname" : "Sue", "surname" : "Smith", "age" : 42}');INSERT INTO json_input VALUES(3, '{"firstname" : "Claire", "surname" : "Doe", "age" : "Five"}');
INSERT INTO json_input VALUES(4, '{"firstname" : "John", "name" : "Doe", "age" : 50}');
SELECT id,
JSON_EXTRACT (json,
'$.firstname',
'$.surname',
'$.age',
'$.error()'
)
EMITS(
forename VARCHAR(100),
surname VARCHAR(100),
age INT,
error_column VARCHAR(2000000)
)
FROM json_input;
Result
| ID | FIRSTNAME | SURNAME | AGE | ERROR_COLUMN |
| 1 | Ann | Smith | 29 | NULL |
| 2 | Sue | Smith | 42 | NULL |
| 3 | Claire | Doe | NULL | [{ "column" : "AGE", "error" : "cast error" }] |
| 4 | John | NULL | 50 | [{ "column" : "SURNAME", "error" : "no SQL/JSON item" }] |
CREATE OR REPLACE TABLE json_input2(id INT, json VARCHAR(2000000));
INSERT INTO json_input2 VALUES(1, '{"data" : [{"country" : "United Kingdom", "customers" : [{"name" : "John Doe", "phone numbers" : [{"type" : "private", "number" : "+4412345678"}]}]}]}');
INSERT INTO json_input2 VALUES(2, '{"data" : [{"country" : "Germany", "customers" : [{"name" : "Hans Meier", "phone numbers" : [{"type" : "private", "number" : "+4912345678"}, {"type" : "mobile", "number" : "+49987654321"}]} ,{"name" : "Achim Schmidt", "phone numbers" : [{"type" : "private", "number" : "+49214365879"}, {"type" : "mobile", "number" : "+4989674523"}]}]}]}');
SELECT id,
JSON_EXTRACT(json,
'$.data.country',
'$.data.customers#.name',
'$.data.customers#."phone numbers"#.type',
'$.data.customers#."phone numbers"#.number'
)
EMITS(
country VARCHAR(100),
name VARCHAR(100),
phone_number_type VARCHAR(100),
phone_number VARCHAR(100)
)
FROM json_input2
ORDER BY id, name, phone_number;
Result
| ID | COUNTRY | NAME | PHONE_NUMBER_TYPE | PHONE_NUMBER |
| 1 | United Kingdom | John Doe | private | +4412345678 |
| 2 | Germany | Achim Schmidt | private | +49214365879 |
| 2 | Germany | Achim Schmidt | mobile | +4989674523 |
| 2 | Germany | Hans Meier | private | +4912345678 |
| 2 | Germany | Hans Meier | mobile | +49987654321 |