How Virtual Schemas Work
This article explains step by step how virtual schemas access data on data sources.
When you access the data of a virtual schema that uses a Java or Python-based adapter, a container of the corresponding language is started on one node. The code of the adapter script is loaded in this container. Exasol interacts with the adapter script using a request-response protocol encoded in JSON. The database takes the active part sending the request by invoking a callback method.
When using a Lua-based adapter, a container approach is not used.
The following is an example of accessing a virtual table:
In this example, the following activities take place in the backend:
-
Exasol determines that a virtual table is involved, looks up the corresponding adapter script, and starts the language container on a single node in the Exasol cluster.
-
Exasol sends a request to the adapter, asking for the capabilities of the adapter.
-
The adapter returns a response including the supported capabilities – for example, whether it supports specific
WHERE
clause filters or specific scalar functions. -
Exasol sends an appropriate pushdown request by considering the specific adapter capabilities. For example, the information for column projections (in the example above, only the single column name is necessary) or filter conditions is included.
-
The adapter processes this request and sends back a certain SQL query in Exasol syntax which will be executed afterward. This query is an IMPORT statement or a SELECT statement including a row-emitting UDF script that handles the data processing.
-
The received data is directly integrated into the overall query execution of Exasol.
The example could be transformed into the following two alternatives, using IMPORT
or SELECT
:
Example using IMPORT
In this alternative, the adapter can handle filter conditions and creates an IMPORT command including a statement that is sent to the external system.
SELECT name FROM (
IMPORT FROM JDBC AT ...
STATEMENT 'SELECT name from remoteschema.users WHERE name LIKE ''A%''' );
Example using SELECT
In this alternative, a UDF script is used with two parameters handing over the address of the data source and the column projection, but not using any logic for the filter condition. This would then be processed by Exasol rather than by the data source.
SELECT name FROM (
SELECT GET_FROM_MY_DATA_SOURCE('data-source-address', 'required_column=name') )
WHERE name LIKE 'A%';
The examples show the fully transformed query, while only the inner statements are created by the adapter.
For more information, see also the Exasol Virtual Schema repository on GitHub.