Virtual Schema User Guide

The JDBC adapter for virtual schemas allows you to connect to JDBC data sources like Hive, Oracle, Teradata, Exasol, or any other data source supporting JDBC. It uses the IMPORT FROM JDBC statement in the background to obtain the requested data when running a query on a virtual table. A JDBC adapter also serves as the reference adapter for the Exasol virtual schema framework. Exasol provides many Supported Data Sources that you can use to create virtual schemas. You can use the generic dialect to work with any JDBC driver that is not listed in Supported Data Sources.

The user guides for all dialects are provided in their respective repositories. You can check them from Supported Data Sources.

Creating a virtual schema requires the following steps:

Deploy JDBC Driver Files

You need to upload JDBC driver twice, once in EXAoperation and once in BucketFS. The driver uploaded in EXAoperation is used by the EXALoader to run the IMPORT statements generated by the virtual schema adapter. The driver uploaded to BucketFS is used by the virtual schema adapter to read metadata from the data source. Do the following to deploy the JDBC driver files:

  • Upload the JDBC driver files in a bucket so that it is accessible for the adapter script. To know how to create a bucket and upload the driver into a bucket, see BucketFS Setup.
  • Upload the JDBC driver files through EXAoperation. To know how to upload JDBC drivers through EXAoperation, see Manage JDBC Drivers.

Some JDBC drivers may contain multiple files and you need to upload all of them. For more information, see Supported Data Sources.

Install Adapter Script

Run the following statement and create a schema to hold the adapter script.

CREATE SCHEMA SCHEMA_FOR_VS_SCRIPT;

Create an adapter script by running the following script and replace all the placeholders with the paths to your virtual schema libraries (JAR files) and JDBC driver. The exact syntax for your dialect can be found in the respective dialect user guides in Github.

--/
CREATE JAVA ADAPTER SCRIPT SCHEMA_FOR_VS_SCRIPT.JDBC_ADAPTER_SCRIPT AS
  %scriptclass com.exasol.adapter.RequestDispatcher;
  %jar /buckets/your-bucket-fs/your-bucket/virtual-schema-dist-<version>-<dialect>-<version>.jar;
  %jar /buckets/your-bucket-fs/your-bucket/<JDBC driver>.jar;
/

If you notice any issue with your SQL client not identifying where a script ends or truncating scripts, see the SQL Client Troubleshooting section for solutions.

Define a JDBC Connection

Run the following statement and create a JDBC connection.

CREATE CONNECTION JDBC_CONNECTION TO '<jdbc connection string>' USER '<username>' IDENTIFIED BY '<password>';

Create a Virtual Schema

Run the following statement to create the Virtual Schema. The adapter retrieves the metadata through JDBC and maps them to virtual tables. The metadata (virtual tables, columns, and data types) are then cached in Exasol.

CREATE VIRTUAL SCHEMA VS_TEST
USING SCHEMA_FOR_VS_SCRIPT.JDBC_ADAPTER_SCRIPT
WITH
  CONNECTION_NAME = 'JDBC_CONNECTION'
  SCHEMA_NAME     = '<SCHEMA_NAME>';

For a list of all properties supported by the JDBC adapter, see Virtual Schema Properties.

Using the Adapter

The following table lists the commands you can use to manage the adapter: 

Command Description Example
CREATE SCHEMA Creates a virtual schema.
CREATE VIRTUAL SCHEMA VS_TEST USING SCHEMA_FOR_VS_SCRIPT.JDBC_ADAPTER_SCRIPT WITH
  CONNECTION_NAME = 'JDBC_CONNECTION'
  SCHEMA_NAME     = '<SCHEMA_NAME>';
DROP SCHEMA Deletes a virtual schema and all of its virtual tables.
DROP VIRTUAL SCHEMA VS_TEST;
ALTER SCHEMA Modifies the properties of a virtual schema or refreshes the metadata using REFRESH option.
ALTER VIRTUAL SCHEMA VS_TEST REFRESH;
ALTER VIRTUAL SCHEMA VS_TEST REFRESH TABLES T1 T2; 
-- refresh only these tables
ALTER VIRTUAL SCHEMA VS_TEST SET TABLE_FILTER='MY_TABLE, CLICKS';
ALTER VIRTUAL SCHEMA VS_TEST SET TABLE_FILTER=null;
EXPLAIN VIRTUAL Analyzes the resulting queries for external system that are created by the Exasol compiler.
EXPLAIN VIRTUAL SELECT * FROM MY_TABLE;
OPEN SCHEMA and SELECT Explore the tables in the virtual schema, just like a regular schema.
OPEN SCHEMA VS_TEST;
SELECT * FROM MY_TABLE;
SELECT count(*) FROM MY_TABLE;
SELECT DISTINCT USER_ID FROM MY_TABLE;

Virtual Schema Limitations

When developing custom virtual schema dialects, keep the following limits in mind:

Request Limits

The limitations of a virtual schema request are as follow:

  • SQL statements are limited to 64 MiB. This affects the size of CREATE VIRTUAL SCHEMA including all its properties.

  • A single property is limited to two million characters (Exasol's VARCHAR limit).

Response Limits

The limitations of a virtual schema response are as follow:

  • Virtual Schema responses are limited to 2 GiB by protocol buffer.

  • Adapter Notes are limited to two million characters.

  • Value lists are limited by the SQL compiler run time. In general, value lists should not exceed 100.000 entries.

Exasol recommends using IMPORT statement for virtual schema results instead of using value lists. This allows parallel import.

SQL Client Troubleshooting

DBeaver

Problem: If a script contains multiple semicolons or newlines, DBeaver has problem with identifying where a script ends.

Solution: The simplest way to get around that problem is to highlight the whole script and execute it as a single step.

DbVisualizer

Problem: DbVisualizer has problem with delimiting Scripts.

Solution: To tell DbVisualizer that a part of a script should be handled as a single statement, you can insert an SQL block begin-identifier just before the block and an end-identifier after the block.
The delimiter must be the only text on the line. The default begin-identifier consists of two dashes followed by a forward slash (--/) and for the End Identifier it is a single slash (/).