JSON_EXTRACT
Purpose
This function extracts one or more SQL values from a valid JSON string. For each SQL value, JSON_EXTRACT
requires a JSON path expression that specifies the value to extract.
Syntax
json_extract::=
json_input_clause::=
Usage Notes
- The parameter
json_expr
defines the input JSON string for the function. Exasol accepts an arbitrary expression of type(VAR)CHAR
. To return an SQL value,json_expr
has to follow the syntax rules on JSON.org. JSON_EXTRACT
is based on the Exasol UDF framework and behaves like a SCALAR EMITS UDF. Thus, the same restrictions apply (for example, there is no Virtual Schema pushdown). AlthoughJSON_EXTRACT
relies on the UDF framework, internally, it uses a fast native JSON processing framework.- Each
json_path_expr
parameter defines a single JSON path forJSON_EXTRACT
. The function accepts an arbitrary expression of type(VAR)CHAR
for each path. For the path syntax rules in Exasol, refer to the JSON Path Expressions section. JSON_EXTRACT
supports an arbitrary number ofjson_path_expr
parameters. For eachjson_path_expr
, the EMITS clause has to define acolumn_name
and adata_type
for the JSON values it returns. The order of the columns in the EMITS clause corresponds to the order of thejson_path_expr
parameters. Thus, the firstcolumn_name
anddata_type
belongs to the firstjson_path_expr
.- If any
json_path_expr
does not identify an SQL value,JSON_EXTRACT
throws an error. Thus,ERROR ON EMPTY
is the default for eachjson_path_expr
. - If any error occurs during the JSON processing,
JSON_EXTRACT
throws an error by default. Thus,ERROR ON ERROR
is the default for eachjson_path_expr
. - If the last
json_path_expr
parameter is the string literal'$.error()'
, the error behavior ofJSON_EXTRACT
changes toLOG ON EMPTY
andLOG ON ERROR
for alljson_path_expr
parameters. - If
'$.error()'
is present, the EMITS clause requires an extra'error column'
withdata_type (VAR)CHAR
to log all occurring errors. It is possible to freely choose thecolumn_name
and(VAR)CHAR
size for the'error column'
in the EMITS clause. - To specify the
ON EMPTY
andON ERROR
behavior for each path separately, refer to the JSON Path Expressions section. - If the error behavior for a path is
LOG ON EMPTY
and the path does not identify an SQL value,JSON_EXTRACT
returnsNULL
for the path and writes the problem into the'error column'
. - If the error behavior for a path is
LOG ON ERROR
and an error occurs during the JSON processing,JSON_EXTRACT
returnsNULL
for the path and writes the problem into the'error column'
. - To learn more about the different possible errors, refer to the JSON Error Handling section.
- With
JSON_EXTRACT
, it is possible to extract several rows from a singlejson_expr
. This makes it possible to create an extra row for each element in a JSON array. To extract several rows, thejson_path_expr
has to contain a'#'
character to split up all expressions identified by a nested path into several rows. - All nested path prefixes ending with a
'#'
character must be a prefix of the longest nested path.
Example
CREATE OR REPLACE TABLE json_input(id INT, json VARCHAR(2000000));
INSERT INTO json_input VALUES(1, '{"firstname" : "Ann", "surname" : "Smith", "age" : 29}');
INSERT INTO json_input VALUES(2, '{"firstname" : "Sue", "surname" : "Smith", "age" : 42}');INSERT INTO json_input VALUES(3, '{"firstname" : "Claire", "surname" : "Doe", "age" : "Five"}');
INSERT INTO json_input VALUES(4, '{"firstname" : "John", "name" : "Doe", "age" : 50}');
SELECT id,
JSON_EXTRACT (json,
'$.firstname',
'$.surname',
'$.age',
'$.error()'
)
EMITS(
forename VARCHAR(100),
surname VARCHAR(100),
age INT,
error_column VARCHAR(2000000)
)
FROM json_input;
Result
ID | FIRSTNAME | SURNAME | AGE | ERROR_COLUMN |
1 | Ann | Smith | 29 | NULL |
2 | Sue | Smith | 42 | NULL |
3 | Claire | Doe | NULL | [{ "column" : "AGE", "error" : "cast error" }] |
4 | John | NULL | 50 | [{ "column" : "SURNAME", "error" : "no SQL/JSON item" }] |
CREATE OR REPLACE TABLE json_input2(id INT, json VARCHAR(2000000));
INSERT INTO json_input2 VALUES(1, '{"data" : [{"country" : "United Kingdom", "customers" : [{"name" : "John Doe", "phone numbers" : [{"type" : "private", "number" : "+4412345678"}]}]}]}');
INSERT INTO json_input2 VALUES(2, '{"data" : [{"country" : "Germany", "customers" : [{"name" : "Hans Meier", "phone numbers" : [{"type" : "private", "number" : "+4912345678"}, {"type" : "mobile", "number" : "+49987654321"}]} ,{"name" : "Achim Schmidt", "phone numbers" : [{"type" : "private", "number" : "+49214365879"}, {"type" : "mobile", "number" : "+4989674523"}]}]}]}');
SELECT id,
JSON_EXTRACT(json,
'$.data.country',
'$.data.customers#.name',
'$.data.customers#."phone numbers"#.type',
'$.data.customers#."phone numbers"#.number'
)
EMITS(
country VARCHAR(100),
name VARCHAR(100),
phone_number_type VARCHAR(100),
phone_number VARCHAR(100)
)
FROM json_input2
ORDER BY id, name, phone_number;
Result
ID | COUNTRY | NAME | PHONE_NUMBER_TYPE | PHONE_NUMBER |
1 | United Kingdom | John Doe | private | +4412345678 |
2 | Germany | Achim Schmidt | private | +49214365879 |
2 | Germany | Achim Schmidt | mobile | +4989674523 |
2 | Germany | Hans Meier | private | +4912345678 |
2 | Germany | Hans Meier | mobile | +49987654321 |