Load data from local or remote files

Learn how to load data into Exasol from files stored on a local or remote system.

This article explains how to use the IMPORT statement to load data into Exasol from files stored on a local or a remote system. When importing files from a remote system, the FTP/FTPS, SFTP, and HTTP/HTTPS protocols are supported.

Files can be imported in parallel across nodes, enabling higher throughput for larger datasets. Single files can be read using multiple parallel operations, which results in faster and more efficient data read operations.

To import from a local file you must use the Exasol JDBC driver. During import, the driver starts an internal HTTP/HTTPS server to stream data into the cluster.

Supported formats

Exasol supports importing CSV, FBV, and Apache Parquet files. The Apache Iceberg table format is supported in Parquet files.

All Parquet import features are fully supported in Exasol 2026.1.0 and later, with partial support in earlier versions. For more details about which features are supported in your Exasol version, see the Release notes.

Create connection to remote system

To prevent connection details from being exposed in logs and audit tables, we recommend that you always create a connection object using CREATE CONNECTION and authenticate using the saved connection instead of providing the details openly in the IMPORT/EXPORTstatements.

Example:
Copy
CREATE CONNECTION my_conn
    TO 'ftp://<my_server>/path_to_file/'
    USER '<my_username>' 
    IDENTIFIED BY '<my_password>';

Import CSV/FVB files

  1. Create a table in Exasol with column names and types that match the source.

  2. Use one of the following methods depending on the source file location:

    Local file

    Copy
    IMPORT INTO <target_table> 
        FROM LOCAL CSV FILE '/path/to/filename.csv';

    Remote file

    Copy
    IMPORT INTO <target_table>
        FROM CSV AT '<connection_string>' 
        FILE '<path_to_file';

File options

You can specify options with IMPORT such as column separator/delimiter, row separator (line break character), and character encoding. You can also specify rows that should be omitted in the import (for example, a header row), and whether to remove spaces before/after columns in a CSV file.

To learn about the available file options for the IMPORT command, see file_opts.

Example:
Copy
-- CSV file
IMPORT INTO my_table 
    FROM LOCAL CSV FILE '/path/to/filename.csv' 
    ENCODING = 'UTF-8'
    SKIP = 1
;

Specify data format

If data in the imported file does not match the default format for the data type (such as date and timestamp formats), you can either change the default format for the session or specify the format for a column explicitly in the IMPORT statement.

To learn more about data formats, see Format models.

Example 1 - change the default date format for the current session:
Copy
ALTER SESSION SET NLS_DATE_FORMAT='DD.MM.YYYY';
Example 2 - specify the date format for column 6 in the IMPORT statement:
Copy
... 
FILE 'myfile.csv' (1..5,6 format='DD.MM.YYYY',7..12
...

Import Parquet files

You can use IMPORT to load data from Apache Parquet files that are stored locally or on a remote server using FTP/FTPS, SFTP, and HTTP/HTTPS.

Authentication and encryption methods when importing Parquet files are the same as when loading data from a file in CSV/FBV format. This includes SSL/TLS, certificate validation, user/password, and public key auth.

This section describes how to import Parquet files from local or remote systems. For more details about data formatting and options when importing Parquet files, see Importing Parquet files.

All Parquet import features are fully supported in Exasol 2026.1.0 and later, with partial support in earlier versions. For more details about which features are supported in your Exasol version, see the Release notes.

Import from local file is only supported with the Exasol JDBC driver.

Syntax

Copy
-- import local file
IMPORT INTO <target_table>
    FROM LOCAL PARQUET 
    FILE '<path_to_file>';

-- import file from remote system
IMPORT INTO <target_table>
    FROM PARQUET AT '<connection_string>' 
    FILE '<path_to_file>';

Examples

Import local parquet file:
Copy
IMPORT INTO my_schema.my_table
    FROM LOCAL PARQUET 
    FILE '/home/user/test.parquet'
;
Import file from remote system:
Copy
-- Create connection object
CREATE CONNECTION my_conn
    TO 'ftp://example.com/path_to_file/'
    USER 'my_username' 
    IDENTIFIED BY 'my_password'
;

-- Import file
IMPORT INTO my_schema.my_table
    FROM PARQUET AT 'my_conn' 
    FILE 'test.parquet'
;

Import multiple Parquet files

You can import multiple Parquet files in a single statement.

Example:
Copy
IMPORT INTO my_schema.my_table
    FROM LOCAL PARQUET
    FILE '/home/user/test.parquet'
    FILE '/home/user/test2.parquet'
    FILE '/home/user/test3.parquet'
;

Configuration parameters

When importing local or remote Parquet files you can specify the following configuration parameters :

Parameter Description Validation
MaxConnections

Number of Parquet files imported in parallel on a single node.

Must be > 0
MaxConcurrentReads

Number of parallel read operations per file.

Set to 1 for sequential reads.

0 is not allowed.

Set to 1 when importing local files.

Must be > 0
MaxBatchFetchSize Maximum simultaneous buffers fetched during import. Must be > 0
MaxRows Maximum number of rows to read in a batch. Must be > 0
MaxRetry Maximum number of attempts to connect. Must be ≥ 0
SkipCols

Skip specific column indexes during import.

Example: SkipCols=1,3..8,11 excludes columns 1, 3, 4, 5, 6, 7, 8, and 11. All other columns remain part of the imported dataset.

Only integers (0to 9), commas (,) to separate values, and ranges with two dots (..) are allowed.

A range must have exactly two integers, one before and one after the two dots (..). For example, 3..5 is valid, but 3.. or ..5 are invalid.

The starting value must be less than or equal to the ending value. For example, 3..5 is valid, but 5..3 is invalid.

For remote systems the configuration parameters must be added to the connection string. For local import you must specify the parameters using the CONFIG clause.

Examples:
Copy
-- local file
IMPORT INTO my_schema.my_table
    FROM LOCAL PARQUET
    CONFIG 'MaxConnections=2;MaxBatchFetchSize=1;MaxConcurrentReads=1'
    FILE '/home/user/test.parquet'
;

-- remote file (using saved connection)
IMPORT INTO my_schema.my_table
    FROM PARQUET
    AT 'my_conn;MaxConnections=2;MaxBatchFetchSize=1'
    FILE '/home/user/test.parquet'
;

MaxConcurrentReads must be 1 for local Parquet imports. Other values may cause connection or read failures.

Remote server requirements

For optimal performance, the remote server must support HEAD and range requests.

Range request support enables selective row group fetching and parallel processing. If range requests are not supported the whole file must be downloaded, which means that the buffer must be large enough to contain the file. Setting an adequate buffer size for the import is therefore crucial. If the buffer is too small, an error is thrown.

HEAD requests are used to retrieve metadata about a file before downloading it. If HEAD requests are not supported by the server, Exasol cannot determine the file size and will download the whole file using a single thread, regardless of the configuration settings.

Advantages when range requests are supported:

  • Only the required file segments (row groups and metadata) are downloaded

  • Imports require less bandwith and memory usage

  • Allows selective data reading

  • Enables parallel row group fetching

  • Speeds up queries

Advantages when HEAD requests are supported:

  • Allows Exasol to determine the file size before downloading

  • Required for range-based fetching strategy

Fallback behavior when range and/or HEAD requests are not supported:

  • The entire file is downloaded into memory (limited by buffer size)

  • If the file size exceeds the buffer size the file will be downloaded twice

  • Buffer size configuration is crucial

  • Only one thread processes the file, meaning slower throughput

Server support

Behavior

Parallel processing

Selective reading

Performance

Range and HEAD

Range-based downloading

Optimal - fastest, lowest bandwidth

HEAD only

Whole file

Moderate - full file download required

Range only

Whole file (fallback)

Moderate – fallback to full file download because Exasol cannot determine file size

Neither

Whole file

Low - full file download required

When full file download is required, the maximum parallel reads per file is automatically set to 1 regardless of the database or connection configuration.

Setting the buffer size

You can specify the maximum buffer size for the import by using the option MaxBufferSize in the connection string. If this parameter is not set, the value set in the global database parameter etlParquetMaxBufferSize is used.

To avoid large downloads, you can enforce range-based fetching by setting a negative buffer size such as -1. If range requests are not supported by the server, the file will not be imported.

Parameter

Purpose

MaxBufferSize

Override buffer size for a single import

0 = Unlimited buffer size

> 0 = Limited buffer size in bytes. For example: MaxBufferSize=1024

< 0 = Enforces range-based fetching.

Example - increasing the per-query buffer:
Copy
IMPORT INTO large_dataset
    FROM PARQUET AT 'my_conn;MaxBufferSize=1073741824'
    FILE 'huge_file.parquet';
Example - enforcing range-based fetching:
Copy
IMPORT INTO my_table
    FROM PARQUET AT 'my_conn;MaxBufferSize=-1'
    FILE 'data.parquet';

Best practices

  • For optimal speed and selective reading, only use servers that support range and HEAD requests.

  • Set an adequate buffer size in the connection string when necessary.

  • For strict data egress control, set MaxBufferSize=-1 to disallow downloads from servers that do not support range requests.

  • Use secure protocols (HTTPS, FTPS, SFTP) for sensitive data.

Error handling

Errors can be written to a table in the database or to a local CSV file. For more details, see error_clause.

Write to error table:
Copy
IMPORT INTO my_schema.my_table
    FROM LOCAL CSV 
    FILE '/home/user/test.csv'
    ERRORS INTO my_schema.error_table
    REJECT LIMIT 1 ERRORS
;
Write to local CSV file:
Copy
IMPORT INTO my_schema.my_table
    FROM LOCAL CSV 
    FILE '/home/user/test.csv'
    ERRORS INTO LOCAL CSV 
    FILE '/home/user/error/error.csv'
    REJECT LIMIT 1 ERRORS
;

All WITH clauses in a statement must be defined before the ERROR clause.

Monitoring

To monitor the progress of the import process, you can open a new session in your SQL client and connect to the database, then select the information from the corresponding system table. The number of rows that have been read till now are displayed in the column ACTIVITY.

Copy
SELECT * FROM EXA_DBA_SESSIONS;

Examples

The following example imports data from a local CSV file, applies custom data and timestamp formats for two columns, defines the encoding and separator/delimiter characters, and specifies that the first row should be omitted in the import.

The REJECT LIMIT 0 clause specifies that no invalid rows in the source file are allowed. For more information, see error_clause.

Copy
IMPORT INTO DWH.STATE_DIM
    FROM LOCAL CSV 
    FILE 'C:\Work\SampleData\STATE_DIM.csv' (1..3,4 FORMAT = 'DD/MM/YYYY HH24:MI:SS.FF6', 5 FORMAT = 'DD/MM/YYYY HH24:MI:SS.FF6')
    ENCODING = 'UTF-8'
    ROW SEPARATOR = 'CRLF'
    COLUMN SEPARATOR = ','
    COLUMN DELIMITER = '"'
    SKIP = 1
    REJECT LIMIT 0
;

For more examples, see IMPORT.