How Virtual Schemas Work
Every time you access the data of a virtual schema, a container of the corresponding language is started (for example, a JVM or a Python container) on one node. In this container, the code of the adapter script is loaded. Exasol interacts with the adapter script using a request-response protocol encoded in JSON. The database takes the active part sending the request by invoking a callback method.
Here is an example of accessing a virtual table.
In the above example, following activities go on in the back-end.
- Exasol determines that a virtual table is involved, looks up the corresponding adapter script and starts the language container on a single node in the Exasol cluster.
- Exasol sends a request to the adapter, asking for the capabilities of the adapter.
- The adapter returns a response including the supported capabilities, for example, whether it supports specific WHERE clause filters or specific scalar functions.
- Exasol sends an appropriate pushdown request by considering the specific adapter capabilities. For example, the information for column projections (in the example above, only the single column name is necessary) or filter conditions is included.
- The adapter processes this request and sends back a certain SQL query in Exasol syntax which will be executed afterward. This query is an IMPORT statement or a SELECT statement including a row-emitting UDF script which handles the data processing.
- The received data is directly integrated into the overall query execution of Exasol.
The example above could be transformed into these two alternatives (IMPORT and SELECT):
In the first alternative, the adapter can handle filter conditions and creates an IMPORT command including
a statement which is sent to the external system.
In the second alternative, a UDF script is used with two
parameters handing over the address of the data source and the column projection, but not using any logic
for the filter condition. This would then be processed by Exasol rather than the data source.
The examples show the fully transformed query while only the inner statements are created by the adapter.
To learn more about the virtual schema, see the Exasol Virtual Schema repository on GitHub.
If you want to enhance existing adapter or create a new adapter, use the EXPLAIN VIRTUAL statement to analyze what the adapter is pushing down to the underlying system.