Load data from local or remote files
Learn how to load data into Exasol from files stored on a local or remote system.
This article explains how to use the IMPORT statement to load data into Exasol from files stored on a local or a remote system. When importing files from a remote system, the FTP/FTPS, SFTP, and HTTP/HTTPS protocols are supported.
Files can be imported in parallel across nodes, enabling higher throughput for larger datasets. Single files can be read using multiple parallel operations, which results in faster and more efficient data read operations.
To import from a local file you must use the Exasol JDBC driver. During import, the driver starts an internal HTTP/HTTPS server to stream data into the cluster.
Supported formats
Exasol supports importing CSV, FBV, and Apache Parquet files. The Apache Iceberg table format is supported in Parquet files.
All Parquet import features are fully supported in Exasol 2026.1.0 and later, with partial support in earlier versions. For more details about which features are supported in your Exasol version, see the Release notes.
-
To learn about data formatting in CSV and FVB files, see CSV and FBV file formats and details.
-
To learn about the options and considerations for Parquet files, see Importing Parquet files.
-
For more details about how to use the IMPORT command, see IMPORT.
Create connection to remote system
To prevent connection details from being exposed in logs and audit tables, we recommend that you always create a connection object using CREATE CONNECTION and authenticate using the saved connection instead of providing the details openly in the IMPORT/EXPORTstatements.
Example:
CREATE CONNECTION my_conn
TO 'ftp://<my_server>/path_to_file/'
USER '<my_username>'
IDENTIFIED BY '<my_password>';
Import CSV/FVB files
-
Create a table in Exasol with column names and types that match the source.
-
Use one of the following methods depending on the source file location:
Local file
CopyIMPORT INTO <target_table>
FROM LOCAL CSV FILE '/path/to/filename.csv';Remote file
CopyIMPORT INTO <target_table>
FROM CSV AT '<connection_string>'
FILE '<path_to_file';
File options
You can specify options with IMPORT such as column separator/delimiter, row separator (line break character), and character encoding. You can also specify rows that should be omitted in the import (for example, a header row), and whether to remove spaces before/after columns in a CSV file.
To learn about the available file options for the IMPORT command, see file_opts.
Example:
-- CSV file
IMPORT INTO my_table
FROM LOCAL CSV FILE '/path/to/filename.csv'
ENCODING = 'UTF-8'
SKIP = 1
;
Specify data format
If data in the imported file does not match the default format for the data type (such as date and timestamp formats), you can either change the default format for the session or specify the format for a column explicitly in the IMPORT statement.
To learn more about data formats, see Format models.
Example 1 - change the default date format for the current session:
ALTER SESSION SET NLS_DATE_FORMAT='DD.MM.YYYY';
Example 2 - specify the date format for column 6 in the IMPORT statement:
...
FILE 'myfile.csv' (1..5,6 format='DD.MM.YYYY',7..12)
...
Import Parquet files
You can use IMPORT to load data from Apache Parquet files that are stored locally or on a remote server using FTP/FTPS, SFTP, and HTTP/HTTPS.
Authentication and encryption methods when importing Parquet files are the same as when loading data from a file in CSV/FBV format. This includes SSL/TLS, certificate validation, user/password, and public key auth.
This section describes how to import Parquet files from local or remote systems. For more details about data formatting and options when importing Parquet files, see Importing Parquet files.
All Parquet import features are fully supported in Exasol 2026.1.0 and later, with partial support in earlier versions. For more details about which features are supported in your Exasol version, see the Release notes.
Import from local file is only supported with the Exasol JDBC driver.
Syntax
-- import local file
IMPORT INTO <target_table>
FROM LOCAL PARQUET
FILE '<path_to_file>';
-- import file from remote system
IMPORT INTO <target_table>
FROM PARQUET AT '<connection_string>'
FILE '<path_to_file>';
Examples
Import local parquet file:
IMPORT INTO my_schema.my_table
FROM LOCAL PARQUET
FILE '/home/user/test.parquet'
;
Import file from remote system:
-- Create connection object
CREATE CONNECTION my_conn
TO 'ftp://example.com/path_to_file/'
USER 'my_username'
IDENTIFIED BY 'my_password'
;
-- Import file
IMPORT INTO my_schema.my_table
FROM PARQUET AT 'my_conn'
FILE 'test.parquet'
;
Import multiple Parquet files
You can import multiple Parquet files in a single statement.
Example:
IMPORT INTO my_schema.my_table
FROM LOCAL PARQUET
FILE '/home/user/test.parquet'
FILE '/home/user/test2.parquet'
FILE '/home/user/test3.parquet'
;
Configuration parameters
When importing local or remote Parquet files you can specify the following configuration parameters :
| Parameter | Description | Validation |
|---|---|---|
MaxConnections
|
Number of Parquet files imported in parallel on a single node. |
Must be > 0 |
MaxConcurrentReads
|
Number of parallel read operations per file. Set to
Set to |
Must be > 0 |
MaxBatchFetchSize
|
Maximum simultaneous buffers fetched during import. | Must be > 0 |
MaxRows
|
Maximum number of rows to read in a batch. | Must be > 0 |
MaxRetry
|
Maximum number of attempts to connect. | Must be ≥ 0 |
SkipCols
|
Skip specific column indexes during import. Example: |
Only integers (
A range must have exactly two integers, one before and one after the two dots (
The starting value must be less than or equal to the ending value. For example, |
For remote systems the configuration parameters must be added to the connection string. For local import you must specify the parameters using the CONFIG clause.
Examples:
-- local file
IMPORT INTO my_schema.my_table
FROM LOCAL PARQUET
CONFIG 'MaxConnections=2;MaxBatchFetchSize=1;MaxConcurrentReads=1'
FILE '/home/user/test.parquet'
;
-- remote file (using saved connection)
IMPORT INTO my_schema.my_table
FROM PARQUET
AT 'my_conn;MaxConnections=2;MaxBatchFetchSize=1'
FILE '/home/user/test.parquet'
;
MaxConcurrentReads must be 1 for local Parquet imports. Other values may cause connection or read failures.
Remote server requirements
For optimal performance, the remote server must support HEAD and range requests.
Range request support enables selective row group fetching and parallel processing. If range requests are not supported the whole file must be downloaded, which means that the buffer must be large enough to contain the file. Setting an adequate buffer size for the import is therefore crucial. If the buffer is too small, an error is thrown.
HEAD requests are used to retrieve metadata about a file before downloading it. If HEAD requests are not supported by the server, Exasol cannot determine the file size and will download the whole file using a single thread, regardless of the configuration settings.
Advantages when range requests are supported:
-
Only the required file segments (row groups and metadata) are downloaded
-
Imports require less bandwith and memory usage
-
Allows selective data reading
-
Enables parallel row group fetching
-
Speeds up queries
Advantages when HEAD requests are supported:
-
Allows Exasol to determine the file size before downloading
-
Required for range-based fetching strategy
Fallback behavior when range and/or HEAD requests are not supported:
-
The entire file is downloaded into memory (limited by buffer size)
-
If the file size exceeds the buffer size the file will be downloaded twice
-
Buffer size configuration is crucial
-
Only one thread processes the file, meaning slower throughput
|
Server support |
Behavior |
Parallel processing |
Selective reading |
Performance |
|---|---|---|---|---|
|
Range and HEAD |
Range-based downloading |
✅ |
✅ |
Optimal - fastest, lowest bandwidth |
|
HEAD only |
Whole file |
❌ |
❌ |
Moderate - full file download required |
|
Range only |
Whole file (fallback) |
❌ |
❌ |
Moderate – fallback to full file download because Exasol cannot determine file size |
|
Neither |
Whole file |
❌ |
❌ |
Low - full file download required |
When full file download is required, the maximum parallel reads per file is automatically set to 1 regardless of the database or connection configuration.
Setting the buffer size
You can specify the maximum buffer size for the import by using the option MaxBufferSize in the connection string. If this parameter is not set, the value set in the global database parameter etlParquetMaxBufferSize is used.
To avoid large downloads, you can enforce range-based fetching by setting a negative buffer size such as -1. If range requests are not supported by the server, the file will not be imported.
|
Parameter |
Purpose |
|---|---|
|
|
Override buffer size for a single import 0 = Unlimited buffer size > 0 = Limited buffer size in bytes. For example: < 0 = Enforces range-based fetching. |
Example - increasing the per-query buffer:
IMPORT INTO large_dataset
FROM PARQUET AT 'my_conn;MaxBufferSize=1073741824'
FILE 'huge_file.parquet';
Example - enforcing range-based fetching:
IMPORT INTO my_table
FROM PARQUET AT 'my_conn;MaxBufferSize=-1'
FILE 'data.parquet';
Best practices
-
For optimal speed and selective reading, only use servers that support range and HEAD requests.
-
Set an adequate buffer size in the connection string when necessary.
-
For strict data egress control, set
MaxBufferSize=-1to disallow downloads from servers that do not support range requests. -
Use secure protocols (HTTPS, FTPS, SFTP) for sensitive data.
Error handling
Errors can be written to a table in the database or to a local CSV file. For more details, see error_clause.
Write to error table:
IMPORT INTO my_schema.my_table
FROM LOCAL CSV
FILE '/home/user/test.csv'
ERRORS INTO my_schema.error_table
REJECT LIMIT 1 ERRORS
;
Write to local CSV file:
IMPORT INTO my_schema.my_table
FROM LOCAL CSV
FILE '/home/user/test.csv'
ERRORS INTO LOCAL CSV
FILE '/home/user/error/error.csv'
REJECT LIMIT 1 ERRORS
;
All WITH clauses in a statement must be defined before the ERROR clause.
Monitoring
To monitor the progress of the import process, you can open a new session in your SQL client and connect to the database, then select the information from the corresponding system table. The number of rows that have been read till now are displayed in the column ACTIVITY.
SELECT * FROM EXA_DBA_SESSIONS;
Examples
The following example imports data from a local CSV file, applies custom data and timestamp formats for two columns, defines the encoding and separator/delimiter characters, and specifies that the first row should be omitted in the import.
The REJECT LIMIT 0 clause specifies that no invalid rows in the source file are allowed. For more information, see error_clause.
IMPORT INTO DWH.STATE_DIM
FROM LOCAL CSV
FILE 'C:\Work\SampleData\STATE_DIM.csv' (1..3,4 FORMAT = 'DD/MM/YYYY HH24:MI:SS.FF6', 5 FORMAT = 'DD/MM/YYYY HH24:MI:SS.FF6')
ENCODING = 'UTF-8'
ROW SEPARATOR = 'CRLF'
COLUMN SEPARATOR = ','
COLUMN DELIMITER = '"'
SKIP = 1
REJECT LIMIT 0
;
For more examples, see IMPORT.