Load data from Apache Hive

This article explains how to connect to Apache Hive and load data from it using the Cloudera Hive JDBC driver.

Prerequisites

  • Hive must be reachable from the Exasol system.

  • The user credentials in the connection must be valid.

Download driver

Download the compatible Cloudera Hive JDBC driver.

If you want to use another driver, contact Support.

Configure driver in EXAoperation

Follow these steps to configure the driver in EXAoperation. For more information, refer to the Manage JDBC Drivers section.

  1. Log in to the EXAoperation user interface as an Administrator user.
  2. Select Configuration > Software and click the JDBC Drivers tab.
  3. Click Add to add the JDBC driver details.
  4. Enter the following details for the JDBC properties:
    • Driver Name: HiveCloudera
    • Main Class: com.cloudera.hive.jdbc41.HS2Driver
    • Prefix: jdbc:hive2:
    • Disable Security Manager: Depending on the driver version, you may need to select this option to bypass restrictions that would otherwise prevent the driver from functioning properly.
    • Comment: This is an optional field.
  5. Click Add to save the settings.
  6. Select the radio button next to the driver in the list of JDBC drivers.
  7. Click Choose File to locate the downloaded driver and click Upload to upload the JDBC driver.

Create connection

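To create the connection, use one of the following statements. Each statement defines the same connection name, hive_conn, for a different Hadoop distribution, so run only the one that matches your environment. Replace the connection string and credentials as needed.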

-- Connecting EXAloader (the native bulk loader in Exasol) via Cloudera Hive driver
-- to Cloudera, MapR, and Hortonworks Hadoop distributions

-- cloudera-quickstart-vm-5.13.0-0-vmware
create or replace connection hive_conn
    to 'jdbc:hive2://192.168.42.133:10000'
    user 'cloudera' identified by 'cloudera';

-- MapR-Sandbox-For-Hadoop-6.1.0-vmware
create or replace connection hive_conn
    to 'jdbc:hive2://192.168.42.134:10000'
    user 'mapr' identified by 'mapr';

-- Azure Hortonworks Sandbox with HDP 2.6.4
create or replace connection hive_conn
    to 'jdbc:hive2://192.168.42.1:10000'
    user 'raj_ops' identified by 'raj_ops';

To test the connection, use the following statements:

-- EXPORT test for Cloudera driver
export exa_syscat
    into jdbc driver = 'HiveCloudera' at hive_conn table exa_syscat
    created by 'create table exa_syscat (schema_name varchar(128), object_name varchar(128),
                object_type varchar(15), object_comment varchar(2000))'
    replace;

-- IMPORT test for Cloudera driver
import into (schema_name varchar(128), object_name varchar(128),
             object_type varchar(15), object_comment varchar(2000))
    from jdbc driver = 'HiveCloudera' at hive_conn table exa_syscat;

Load data

Use IMPORT to load data from a table or SQL statement using the connection that you created.
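
The two forms can be sketched as follows, assuming the HiveCloudera driver and the hive_conn connection configured earlier in this article. The Exasol schema my_schema (assumed to exist already), the target table hive_sales, and the Hive table sales with its columns are hypothetical names used only for illustration; substitute your own objects and make sure the column types match the Hive source.

```sql
-- Hypothetical target table in Exasol (assumes schema my_schema already exists)
create table my_schema.hive_sales (
    sale_id  decimal(18,0),
    customer varchar(128),
    amount   double precision
);

-- Form 1: load a complete Hive table (the Hive table name "sales" is an assumption)
import into my_schema.hive_sales
    from jdbc driver = 'HiveCloudera' at hive_conn
    table sales;

-- Form 2: load the result of a query that is executed by Hive
import into my_schema.hive_sales
    from jdbc driver = 'HiveCloudera' at hive_conn
    statement 'select sale_id, customer, amount from sales where amount > 100';
```

The statement form is usually preferable when only a subset of rows or columns is needed, because the filtering happens in Hive before the data is transferred.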

The IMPORT statement also supports Kerberos authentication, configured through the connection string. For more details, see the example in connection_def.
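
As a rough sketch only, a Kerberos-based import might look like the statement below. It assumes the ExaAuthType=Kerberos credential format described in connection_def and the Kerberos URL properties of the Cloudera Hive driver (AuthMech, KrbRealm, KrbHostFQDN, KrbServiceName). Every value in angle brackets is a placeholder, and the table names are hypothetical; verify the exact format against connection_def and the driver documentation before use.

```sql
-- Sketch under stated assumptions: replace every <...> placeholder before running.
-- my_schema.hive_sales and sales are hypothetical table names.
import into my_schema.hive_sales
    from jdbc driver = 'HiveCloudera'
    at 'jdbc:hive2://<hive_host>:10000;AuthMech=1;KrbRealm=<realm>;KrbHostFQDN=<hive_host_fqdn>;KrbServiceName=hive'
    user '<kerberos_principal>'
    identified by 'ExaAuthType=Kerberos;<base64_krb_conf>;<base64_keytab>'
    table sales;
```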