Load data from Google BigQuery

This article explains how to connect and load data from Google BigQuery using Simba BigQuery JDBC driver .

For more information about data type mappings and how to migrate data, see the BigQuery to Exasol migration scripts in our GitHub repository.

Prerequisites

Procedure

Configuring Google BigQuery to connect with Exasol includes the following steps.

Step 1: Create service account

  1. Log in to Google Cloud Console and create a new project or select an existing project.

  2. From the navigation menu, select IAM & Admin > Service accounts.

  3. Click Create Service Account for your project.

  4. Enter the service account details and grant the corresponding BigQuery roles from the Role dropdown list.

  5. Select the checkbox next to Furnish a new private key and ensure JSON is selected as Key type.

  6. Save the JSON file to your computer.

  7. Click Create.

For additional information, refer to Creating and managing service accounts.

Step 2: Add the JDBC driver

To add the JDBC driver, create a configuration file called settings.cfg with the following settings:

DRIVERNAME=BIGQUERY
PREFIX=jdbc:bigquery:
FETCHSIZE=100000
INSERTSIZE=-1
NOSECURITY=YES

Follow the procedure described in Add JDBC Driver to upload the settings.cfg configuration and JDBC driver files to Exasol.

To upload all files in your current directory to BucketFS, run the following command:

find -name '*' -type f -exec curl --insecure -X PUT -T {} https://w:$WRITE_PW@$DATABASE_NODE_IP:2581/default/drivers/jdbc/bigquery/{} \;

Replace WRITE_PW and DATABASE_NODE_IP in the example with the appropriate values.

The option --insecure or -k tells curl to bypass the TLS certificate check. This option allows you to connect to a HTTPS server that does not have a valid certificate. Only use this option if certificate verification is not possible and you trust the server.

Step 3: Upload JSON key file to BucketFS.

Upload the saved JSON key file to the same bucket you uploaded the JDBC driver to. For more information, see Manage Buckets and Files in BucketFS.

Step 4: Create database connection

Create the database connection using the following pattern:

CREATE CONNECTION BQ_CON TO 'jdbc:bigquery://https://www.googleapis.com/bigquery/v2:443;ProjectId=<your-project-id>;OAuthType=0;Timeout=10000;OAuthServiceAcctEmail=<your-service-account>;OAuthPvtKeyPath=/exa/data/bucketfs/<bucketfs-service>/.dest/<bucket-name>/<your-account-keyfile>;Location=<location>[;EnableSession=1]';

The Location parameter defines where data can be stored and processed. The location can be either a region (a specific geographic place, such as Frankfurt) or a multi-region (EU or US). For more information, see BigQuery locations.

The EnableSession=1 property is only required when using Simba BigQuery JDBC Driver version 1.3.0.1001 or later. For more information, see BigQuery: Transaction control statements are supported only in scripts or sessions.

Examples

Connection details:

  • Project ID: exa-migration
  • Service account: migration-test@exa-migration.iam.gserviceaccount.com
  • BucketFS service: bfsdefault
  • Bucket name: default
  • JSON key-file: /drivers/jdbc/bigquery/my-key.json
  • Location: EU (multi-region)

The resulting connection statement:

CREATE CONNECTION BQ_CON TO 'jdbc:bigquery://https://www.googleapis.com/bigquery/v2:443;ProjectId=exa-migration;OAuthType=0;Timeout=10000;OAuthServiceAcctEmail=migration-test@exa-migration.iam.gserviceaccount.com;OAuthPvtKeyPath=/exa/data/bucketfs/bfsdefault/.dest/default/drivers/jdbc/bigquery/my-key.json;Location=EU;EnableSession=1';

Test your connection using this statement:

IMPORT FROM JDBC AT BQ_CON STATEMENT 'SELECT 1';

Load data

Use IMPORT to load data from a table or SQL statement using the connection that you created.