How it works

Every time you access data of a virtual schema on a node of a cluster, a container of the corresponding language is started (for example, a JVM or a Python container). In this container, the code of the adapter script is loaded. Exasol interacts with the adapter script using a request-response protocol encoded in JSON. The database takes the active part sending the request by invoking a callback method.

Here is an example of accessing a virtual table.

SELECT name FROM VIRTUAL_SCHEMA_TEST.users WHERE name like 'A%';

In the above example, following activities go on in the back-end.

  1. Exasol determines that a virtual table is involved, looks up the corresponding adapter script and starts the language container on a single node in the Exasol cluster.
  2. Exasol sends a request to the adapter, asking for the capabilities of the adapter.
  3. The adapter returns a response including the supported capabilities, for example, whether it supports specific WHERE clause filters or specific scalar functions.
  4. Exasol sends an appropriate pushdown request by considering the specific adapter capabilities. For example, the information for column projections (in the example above, only the single column name is necessary) or filter conditions is included.
  5. The adapter processes this request and sends back a certain SQL query in Exasol syntax which will be executed afterward. This query is an IMPORT statement or a SELECT statement including an row-emitting UDF script which cares about the data processing.
  6. The example above could be transformed into these two alternatives (IMPORT and SELECT):

    • SELECT name FROM ( IMPORT FROM JDBC AT ... STATEMENT 'SELECT name from remoteschema.users WHERE name LIKE "A%"' );
    • SELECT name FROM ( SELECT GET_FROM_MY_DATA_SOURCE('data-source-address', 'required_column=name') ) WHERE name LIKE 'A%';

    In the first alternative, the adapter can handle filter conditions and creates an IMPORT command including a statement which is sent to the external system.

    In the second alternative, a UDF script is used with two parameters handing over the address of the data source and the column projection, but not using any logic for the filter condition. This would then be processed by Exasol rather than the data source.

    The examples show the fully transformed query while only the inner statements are created by the adapter.

  7. The received data is directly integrated into the overall query execution of Exasol.

To know more about the virtual schema, see the Exasol Virtual Schema repository on GitHub.

If you want to enhance existing adapter or create a new adapter, use the EXPLAIN VIRTUAL statement to analyze what the adapter is pushing down to the underlying system.