Install Exasol - step by step
This article explains how to install Exasol as an application on Linux hosts.
Overview
You can install Exasol as an application on Linux hosts running either on hardware or on VM instances on a cloud service such as AWS, Azure, or Google Cloud Platform (GCP). The system requirements and procedures for installation and administration are essentially the same regardless of which platform the Linux host is running on.
If your system will be hosted on a cloud service, refer to the documentation from the cloud service provider for information about how to configure network and security settings for your Exasol deployment on that platform.
The installation procedure can be broken down into three major stages: Preparation, where you prepare the installation environment; Installation, where you download the deployment tool and run the installation on the hosts; and Post-installation, where you connect to the cluster, upload a license, and carry out any additional actions to prepare the database for use.
The following diagram is a schematic overview of the installation procedure.
You can run the installation process from a separate Linux system (jump host) or from one of the database host systems. In both cases the host running the installation will require access to the other hosts over SSH.
If you install Exasol on a single host system without using a jump host, SSH is not required for the installation process. We recommend setting up SSH anyway, as this enables passwordless login and makes it easier to extend the installation at a later stage by adding more hosts.
Prerequisites
The database hosts must be pre-installed with one of the recommended Linux distributions and meet the minimum system requirements for an Exasol installation. The following requirements must be checked:
- The nodes must run on a supported CPU type
- Storage devices must be of a supported type with RAID configuration as required
- Each node must have enough storage space for installation and updates
- The nodes must have a supported operating system version installed and configured as required
- All software dependencies must be fulfilled for the database nodes (and optional jump host)
For details about the requirements for installing Exasol, see System Requirements.
The following procedure describes how to install Exasol with a user having root privileges on the database hosts. To install Exasol for a non-root user, additional configuration steps are required. For more information, see Rootless Installation.
Preparation
Step 1: Configure the network
IP addresses
The database hosts should be assigned private static IPv4 addresses in the same subnet. For example: 10.0.0.11, 10.0.0.12, 10.0.0.13, 10.0.0.14. DHCP can be used in the network if each host always receives the same IP address.
You can additionally configure a public network to allow direct access to the deployment from outside of the private network. A public network is optional and not required for installation.
For information about how to add private and public IP addresses to the configuration, see Step 6: Create a configuration file.
Internet connection
An internet connection from the jump host is recommended but not required. If the host running the installation is not connected to the internet, the software must be downloaded using another system and copied to the installation host.
Internet connectivity from the database hosts is not required.
Firewall configuration
To operate the Exasol database after installation, traffic must be allowed through the firewall on the following ports:
Service | Default port | Protocol |
---|---|---|
SQL client connections to the database | 8563 | TCP |
SSH access to all cluster nodes | 20002 | TCP |
HTTPS access to the Administration API | 4444 | TCP |
NTP | 123 | TCP/UDP |
DNS | 53 | TCP/UDP |
LDAP (optional) | 389 | TCP/UDP |
For more information about the default ports used by Exasol, see Firewall and Port Settings.
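As an illustration of the table above, the following sketch prints the commands that would open the three default TCP ports for inbound Exasol traffic. It assumes that the hosts use firewalld (`firewall-cmd`), which this article does not state; other firewalls need different commands. The function name `print_firewall_commands` is hypothetical, and the script only prints the commands so that they can be reviewed before anything is changed.

```shell
#!/usr/bin/env bash
# Default TCP ports from the table above:
# SQL client connections, SSH access to the cluster nodes, Administration API.
EXASOL_TCP_PORTS="8563 20002 4444"

# Print (do not execute) one firewall-cmd call per port.
# Assumption: the hosts use firewalld.
print_firewall_commands() {
  for port in $EXASOL_TCP_PORTS; do
    echo "sudo firewall-cmd --permanent --add-port=${port}/tcp"
  done
  echo "sudo firewall-cmd --reload"
}

print_firewall_commands
```

NTP, DNS, and LDAP are left out of the sketch because they are typically services that the hosts reach on other systems. Adjust the port list if your deployment uses non-default ports.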
Step 2: Create the installation user
On each database host, create a dedicated user for the Exasol software installation. The user must have sudo privileges and a system shell that allows access over SSH. The name of the installation user can be set freely within the restrictions of the operating system, but must be identical on all hosts. In the following examples, the installation user is called `exasol`.
To install Exasol for a non-root user, additional configuration steps are required. For more information, see Rootless Installation.
In the following examples the installation user is created by a user that has sudo privileges. If you carry out this operation as root, the sudo command should be omitted.
- Create the user `exasol` with a home directory `/home/exasol`:
- Add the user to the `sudoers` group:
- Assign a password for the user:
- Log out (or create a new session) and log in as the user `exasol`.
- Verify that the user has sudo privileges using `sudo whoami`. The command should return `root`.
- Repeat the above steps on all the database hosts.
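The first three steps above can be sketched with standard Linux user management tools. This is a minimal sketch, not the exact commands from the original article: the helper names `run` and `create_install_user` are hypothetical, and the name of the group that grants sudo privileges is an assumption that depends on the distribution (commonly `sudo` on Debian and Ubuntu, `wheel` on RHEL-based systems). With the default `DRY_RUN=1` the script only prints the commands.

```shell
#!/usr/bin/env bash
# DRY_RUN=1 (default) prints each command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"
run() { if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi; }

# create_install_user USER ADMIN_GROUP
# ADMIN_GROUP is the group that grants sudo privileges on this distribution
# (an assumption: "sudo" on Debian/Ubuntu, "wheel" on RHEL-based systems).
create_install_user() {
  user="$1"
  admin_group="$2"
  run sudo useradd --create-home --shell /bin/bash "$user"   # creates /home/$user
  run sudo usermod --append --groups "$admin_group" "$user"  # grant sudo via group
  run sudo passwd "$user"                                    # prompts for a password
}

create_install_user exasol sudo
```

To execute the commands instead of printing them, run the script with `DRY_RUN=0` on each database host.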
Step 3: Set up SSH authentication
If you install Exasol 8 on a single host system without using a jump host, SSH is not required for the installation process.
Generate an SSH key pair
- On the host that will run the installation, open a terminal or command prompt.
- Run the command `ssh-keygen -t rsa` to generate a new RSA key pair.
- Choose a location to save the key pair (the default location is normally `~/.ssh/id_rsa`).
For added security you can provide a passphrase for the key pair.
Copy the public key to the remote hosts
- Run the command `ssh-copy-id <username>@<remote-node-ip>`. Replace `<username>` and `<remote-node-ip>` with the actual username of the installation user and the IP address of the remote host. Enter the password for the user on the remote host when prompted.
- Repeat this step for each host by substituting `<remote-node-ip>` with the IP address of the respective host. The user should be the same on all hosts.
Test the key-based authentication
- Run the command `ssh <user>@<remote-node-ip>` to initiate an SSH connection to a remote host. If everything is correctly configured, you should be able to log in without entering a password.
- Repeat this step for each remote host to verify that all nodes can be accessed during the installation.
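The key generation, distribution, and test steps above can be combined in a small loop. The following is a sketch under stated assumptions: the helper names `run` and `setup_ssh` are hypothetical, the user name and IP addresses are the example values used in this article, and the `BatchMode=yes` option is an addition that makes `ssh` fail instead of prompting if key-based login does not work. With the default `DRY_RUN=1` the commands are only printed.

```shell
#!/usr/bin/env bash
# DRY_RUN=1 (default) prints each command instead of executing it.
DRY_RUN="${DRY_RUN:-1}"
run() { if [ "$DRY_RUN" = "1" ]; then echo "$*"; else "$@"; fi; }

INSTALL_USER="exasol"                                    # example user from this article
HOSTS="10.0.0.11 10.0.0.12 10.0.0.13 10.0.0.14"          # example private IP addresses

setup_ssh() {
  # Generate the key pair once on the installation host.
  run ssh-keygen -t rsa
  for host in $HOSTS; do
    # Copy the public key; prompts for the user's password on the remote host.
    run ssh-copy-id "${INSTALL_USER}@${host}"
    # Test key-based login; BatchMode=yes fails instead of asking for a password.
    run ssh -o BatchMode=yes "${INSTALL_USER}@${host}" true
  done
}

setup_ssh
```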
Step 4: Prepare storage devices
Block storage
Data in an Exasol database is stored on volumes, which are assigned to storage devices (disks). The storage devices must be prepared before you continue with the installation.
In the following configuration step (Step 6: Create a configuration file), you specify the disks in the parameter `CCC_HOST_DATADISK`. The installation process will then automatically create the necessary volumes on those disks. You can create additional volumes manually after the installation if needed. For more information, see Storage Management.
Do not create a file system on the storage devices. The database stores persistent data using block storage with a specific structure for Exasol databases.
Do not use the device where the operating system resides for database storage, as this could potentially lead to a system crash if the device runs out of disk space.
The naming and order of the block storage devices must be identical on all nodes.
Supported storage types/technologies
- Sparse file devices hosted on a filesystem like ext4 or XFS (NFS is not supported)
- Block devices (local storage SAS, SSD, NVMe, virtual disks, or remote storage iSCSI/SAN)
- LVM2
- LUKS
Storage device requirements
- Use at least 4 storage drives with minimum 250 MBps read/write capacity per drive. Actual performance depends on the number of disks used as well as the speed of the individual disks.
- OS and storage disks should have RAID 1 (or similar fault tolerance).
- OS disks must have at least 150 GiB free disk space after installation.
- Swap partition: use the size recommended by the OS vendor.
For information about how to calculate the required size for the storage devices, see Sizing Considerations.
Installation directory
Exasol is installed in the home directory of the installation user on each database node. For example, if the username is `exasol`, Exasol is installed under `/home/exasol/`.
The partition where the home directory of the installation user is mounted must have at least 20 GiB free space available for the installation.
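The free space requirement can be checked before the installation with `df`. The following sketch uses two hypothetical helper functions, `free_gib` and `has_free_gib`, and assumes that it is run as the installation user, so that `$HOME` points to the directory where Exasol will be installed.

```shell
#!/usr/bin/env bash
# Print the free space, in whole GiB, of the filesystem that holds path $1.
# df -P guarantees one output line per filesystem; -k reports 1024-byte blocks.
free_gib() {
  df -Pk "$1" | awk 'NR == 2 { print int($4 / 1048576) }'
}

# Succeed if the filesystem holding path $1 has at least $2 GiB free.
has_free_gib() {
  [ "$(free_gib "$1")" -ge "$2" ]
}

# Check the partition that holds the current user's home directory.
if has_free_gib "${HOME:-/}" 20; then
  echo "At least 20 GiB free for the installation"
else
  echo "Less than 20 GiB free for the installation"
fi
```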
Logical volume manager
We recommend using a logical volume manager to manage the storage devices used for block storage. To check if an existing system is using a logical volume manager, use the command `lsblk` to list all block devices. If the output shows devices that are named `sd*` (`sda`, `sdb`, `sdc`, and so on), the system is not using a logical volume manager.
For example:
# this system is not using a volume manager:
lsblk -p
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
/dev/sda 8:0 0 388.5M 1 disk
/dev/sdb 8:16 0 4G 0 disk [SWAP]
/dev/sdc 8:32 0 256G 0 disk /mnt/wslg/distro
# this system is using a logical volume manager:
lsblk -p
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
/dev/loop1 7:1 0 91.9M 1 loop
/dev/nvme0n1 259:0 0 100G 0 disk
|-/dev/nvme0n1p1 259:4 0 99.9G 0 part /etc/ssh
|-/dev/nvme0n1p14 259:5 0 4M 0 part
`-/dev/nvme0n1p15 259:6 0 106M 0 part
/dev/nvme3n1 259:1 0 50G 0 disk
If you install Exasol for a non-root user, the data disks must be writeable for that user and additional configuration of the storage devices may be required. For more information, see Rootless Installation.
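As a complement to reading the device names, the `TYPE` column of `lsblk` can be inspected, since `lsblk` reports LVM logical volumes with the type `lvm`. The following sketch is an alternative check, not part of the original procedure. The function name `has_lvm_device` is hypothetical, and the sample input stands in for real `lsblk` output.

```shell
#!/usr/bin/env bash
# Read "NAME TYPE" pairs on stdin and succeed if any device has the type "lvm".
has_lvm_device() {
  awk '$2 == "lvm" { found = 1 } END { exit (found ? 0 : 1) }'
}

# Usage on a real host (not executed here):
#   lsblk -p -l -n -o NAME,TYPE | has_lvm_device && echo "LVM in use"

# Demonstration with sample input instead of real lsblk output:
printf '%s\n' '/dev/sda disk' '/dev/mapper/vg0-data lvm' \
  | has_lvm_device && echo "LVM in use"
```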
Installation
Step 5: Download and install c4
Exasol Deployment Tool (c4) is a command-line application that is used to install Exasol on a single host or multiple hosts in a network. The c4 application can run either on a separate system (jump host) or on one of the database hosts.
Database and c4 version dependency
The version of c4 used for installation must be compatible with the Exasol database version that you install. The latest version of the database is always compatible with the latest version of c4.
If you want to install an earlier version of Exasol, refer to the following table to find out which version of c4 to use:
Exasol database version | Required c4 version |
---|---|
8.31.0 | 4.20.0 |
8.29.2 | 4.19.0 |
8.29.1 | 4.19.0 |
8.29.0 | 4.19.0 |
8.28.4 | 4.17.5 |
8.28.2 | 4.17.4 |
8.27.0 | 4.16.0 |
8.26.1 | 4.15.0 |
8.26.0 | 4.15.0 |
8.25.0 | 4.14.1 |
8.24.0 | 4.13.0 |
8.23.4 | 0.4.12 |
8.22.0 | 0.4.11 |
8.21.0 | 0.4.11 |
8.20.0 | 0.4.10 |
8.18.1 | 0.4.10 |
8.18.0 | 0.4.10 |
Alternative 1: Installation host has internet access
- On the installation host, download c4 from the Exasol Download Portal. You can also use the following command in a Linux terminal:
  Replace `<version>` with the desired c4 version, for example, `4.20.0`.
  For information about the latest release of Exasol Deployment Tool (c4), see c4 Release Notes.
  For more information about how to install and use c4, see Exasol Deployment Tool (c4).
- The Exasol 8 installation package will be downloaded by c4 during the installation process.
Alternative 2: Installation host does not have internet access
If the host used to run the installation is not connected to the internet, download the resources using another machine and then copy them to the installation host.
- On a machine with internet access, download c4 from the Exasol Download Portal or using the command line. For more information, see Install c4.
- Download the latest version of Exasol from the Exasol Download Portal.
- Copy c4 and the Exasol installation package to the same directory on the host that will run the installation.
You must make the c4 binary executable for all users on the host by using `chmod +x c4`. Otherwise the application will not be able to run the installation. For more information, see Install c4.
Step 6: Create a configuration file
On the host that will be used to run the installation, create a file with the filename `config` in the same directory as the c4 binary. For example, if you are using the nano text editor, run `nano config`.
The configuration file should define the following parameters:
Parameter | Data type | Default | Description |
---|---|---|---|
`CCC_HOST_ADDRS` | string | [empty] | The IP addresses of the database hosts on the private network, separated by spaces. The number of addresses in this parameter determines the number of nodes that will be installed. |
`CCC_HOST_EXTERNAL_ADDRS` (optional) | string | [empty] | Public IP addresses of the database hosts, separated by spaces. This parameter is only needed if you want to allow direct access to the deployment from outside of the private network. |
`CCC_HOST_DATADISK` | string | [empty] | Comma-separated list of block devices to be used for the data volume. If block devices are not specified in this parameter, limited file-based storage is used. To find the names of the available block devices on the host, use the command `lsblk`. The devices used should have persistent block device names. Exasol recommends using volume management with LVM2. See also System Requirements. |
`CCC_HOST_IMAGE_USER` | string | [empty] | Username that will be used to log in to the instances over SSH. The user must have sudo privileges on the instances. |
`CCC_HOST_IMAGE_PASSWORD` | string | [empty] | Password for the user if required for sudo. The password is passed in plaintext to the instances. |
`CCC_HOST_KEY_PAIR_FILE` | string | [empty] | Name of the file that contains the private SSH key required to access host instances. |
`CCC_PLAY_WORKING_COPY` | string | [empty] | Specifies the Exasol package to install, using the format `@exasol-<version>`. For example: `@exasol-8.32.0` |
`CCC_PLAY_DB_PASSWORD` | string | `aX1234567` | Password for the database sys user. |
`CCC_PLAY_ROOTLESS` (optional) | boolean | false | Use rootless deployment mode. Rootless installation requires additional system configuration. For more information, see Rootless Installation. |
`CCC_PLAY_ADMIN_PASSWORD` | string | `aX1234567` | Password for the system administration user admin in COS. |
`CCC_PLAY_RESERVE_NODES` (optional) | integer | [empty] | The number of hosts to use as reserve nodes. Reserve nodes are inactive nodes that can automatically take over from an active node in case of failure. For more information about the failover mechanism, see Fail Safety (On-Prem). The reserve nodes are part of the total number of nodes. For example, deploying with 4 nodes and `CCC_PLAY_RESERVE_NODES=1` results in 3 active nodes and 1 reserve node. |
The username for the installation user can be any name allowed by the operating system. In the following examples, the user has the username `exasol`.
Example configuration
The following configuration file will result in a deployment with 3 database nodes and one reserve node.
CCC_HOST_ADDRS="10.0.0.11 10.0.0.12 10.0.0.13 10.0.0.14"
CCC_HOST_EXTERNAL_ADDRS="203.0.113.11 203.0.113.12 203.0.113.13 203.0.113.14"
CCC_HOST_DATADISK=/dev/mapper/exasol_disk_1,/dev/mapper/exasol_disk_2
CCC_HOST_IMAGE_USER=exasol
CCC_HOST_IMAGE_PASSWORD=exasol123
CCC_HOST_KEY_PAIR_FILE=id_rsa
CCC_PLAY_WORKING_COPY=@exasol-8.32.0
CCC_PLAY_DB_PASSWORD=exasol456
CCC_PLAY_RESERVE_NODES=1
Always replace the default passwords by setting unique, secure passwords in your configuration file. Never use the passwords that are used in the examples in this documentation.
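Because the reserve nodes are part of the total number of nodes, the number of active nodes follows from the values of `CCC_HOST_ADDRS` and `CCC_PLAY_RESERVE_NODES`. The following sketch shows the arithmetic for the example configuration above. The function name `count_active_nodes` is hypothetical and is not part of c4.

```shell
#!/usr/bin/env bash
# count_active_nodes "<space-separated addresses>" [reserve_nodes]
# Prints the number of active nodes: total addresses minus reserve nodes.
count_active_nodes() {
  reserve="${2:-0}"
  # Intentional word splitting: one positional parameter per address.
  set -- $1
  echo $(( $# - reserve ))
}

# Values from the example configuration: 4 addresses, 1 reserve node.
count_active_nodes "10.0.0.11 10.0.0.12 10.0.0.13 10.0.0.14" 1
```

With the example values the function prints `3`, which matches the deployment with 3 database nodes and one reserve node described above.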
Run diagnostic tool (optional)
Before you start the installation you can run a diagnostic tool on your configuration to detect issues in advance. The diagnostic tool checks, for example, SSH accessibility to the hosts, sudo password correctness (if the `CCC_HOST_IMAGE_PASSWORD` parameter is set), and missing required parameters.
To run the diagnostic tool, use `c4 host diag -i <path to configuration>`. For example: `c4 host diag -i config`.
For more information about the diagnostic tool, use `c4 host diag --help`.
Step 7: Deploy to hosts
- On the installation host, run the following command:
  The `-i` option tells c4 to use a specific configuration file. By default, c4 reads the configuration from the file `./config` (in the current directory). If the configuration is stored in another file, specify the path to this file as an argument on the command line:
  Rootless install
  If you are installing Exasol for a non-root user, you must add `--ccc-play-rootless true` to the command:
  If you install Exasol for a non-root user, some additional configuration steps are required. For more information, see Rootless Installation.
- If the configuration is valid, c4 will show the parameter values that will be used and ask you to either proceed with this configuration or cancel the installation.
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Exasol installation procedure is about to be started.
During this procedure, Exasol software will be installed to remote hosts.
It will take some time (several minutes).
The installation is finished when every node reaches stage 'd' (see 'c4 ps').
After the installation is finished, you can connect to the Database or COS.
During the installation, you can login to the hosts via SSH,
and watch the process using:
sudo journalctl -f
After the installation finished, you can connect to COS using:
ssh -p 20002 root@$IP
IP addresses of the systems:
* 203.0.113.11
* 203.0.113.12
* 203.0.113.13
* 203.0.113.14
Exasol version: 8.32.0
Exasol package: @exasol-8.32.0
SSH username : exasol
SSH keyfile : id_rsa
User password : exasol123
Data disk(s) : /dev/mapper/exasol_disk_1,/dev/mapper/exasol_disk_2
Press ENTER to proceed or Ctrl-C to cancel the installation procedure.
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
- Press Enter to start the installation.
  - If the necessary installation packages are present in the current directory, they will be used for the installation. In this case no internet connection is required.
  - If the installation packages are not found in the current directory and the host used to run the installation is connected to the internet, c4 will automatically download the necessary packages from the Exasol download portal.
  - If no installation packages are found and c4 is not able to connect to the Exasol download portal, the installation process will be aborted and no changes are made to the system.
  To troubleshoot a failed installation, first make sure that all the steps above have been carried out correctly and that all system requirements are met. To get help from our Support team, create a case.
If the packages are available, the installation process will start. The installation requires no user intervention and can be run unattended. It comprises the following steps:
- Copying Exasol packages to the hosts
- Initial OS preparation
- Verifying OS configuration
- Configuring OS on the hosts
- Extracting packages
- Installing c4 on the hosts
- Syncing time between hosts
- Triggering remote installation finalization
The installation typically takes 20 to 90 minutes to complete, depending on the number of hosts and the location of the installation files. Because many other factors also play a role, the installation may take longer.
- When the installation has finished, a confirmation message is shown:
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
The final steps of the Exasol installation procedure were successfully
started on remote hosts now.
It will take yet some time to complete (several minutes).
After the installation is finished, you can connect to the Database or COS.
During the installation, you can login to the hosts via SSH,
and watch the process using:
sudo journalctl -f
After the installation finished, you can connect to COS using:
ssh -p 20002 root@$IP
IP addresses of the systems:
* 203.0.113.11
* 203.0.113.12
* 203.0.113.13
* 203.0.113.14
Exasol version: 8.32.0
Exasol package: @exasol-8.32.0
Happy Exasolling!
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Monitoring the installation process
To monitor the installation process, connect to one of the nodes over SSH and use the `c4 ps` command. The output shows details about each node, including the current deployment stage. The installation is finished when all database nodes have reached stage `d`.
For more details about the `c4 ps` command, see How to use c4.
Example
ssh -i KEY_FILE user@203.0.113.11
...
user@ip-10-0-0-11:~$ ./c4 ps
N PLAY_ID NODE MEDIUM INSTANCE DB_VERSION EXTERNAL_IP INTERNAL_IP STAGE STATE UPTIME TTL
┌─ 1 c3275f84 11 host - 8.32.0 203.0.113.11 10.0.0.11 d - 03:50:12 +∞
│ 1 c3275f84 12 host - 8.32.0 203.0.113.12 10.0.0.12 d - 03:50:13 +∞
│ 1 c3275f84 13 host - 8.32.0 203.0.113.13 10.0.0.13 d - 03:50:13 +∞
└─ 1 c3275f84 14 host - 8.32.0 203.0.113.14 10.0.0.14 d - 03:50:13 +∞
2 c3275f84 14 local - 8.32.0 - 10.0.0.14 d - 03:50:13 +∞
To get more detailed information about the installation process, connect to one of the nodes over SSH and use the command `sudo journalctl -f` to monitor the installation in real time.
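The check for stage `d` can also be scripted. The following is a sketch that assumes the `c4 ps` output keeps the column layout shown in the example above, with a header line first and `STAGE` as the fourth column from the right. The function name `all_nodes_in_stage_d` is hypothetical and the output format may differ between c4 versions.

```shell
#!/usr/bin/env bash
# Read `c4 ps` output on stdin and succeed only if every listed node is in stage "d".
# Assumption: the first line is the header and STAGE is the fourth column from the
# right (STAGE STATE UPTIME TTL), as in the example output in this article.
all_nodes_in_stage_d() {
  awk 'NR > 1 && NF >= 4 { seen = 1; if ($(NF - 3) != "d") bad = 1 }
       END { exit ((seen && !bad) ? 0 : 1) }'
}

# Usage on a node (not executed here):
#   ./c4 ps | all_nodes_in_stage_d && echo "Installation finished"

# Demonstration with two lines in the format of the example output:
printf '%s\n' \
  'N PLAY_ID NODE MEDIUM INSTANCE DB_VERSION EXTERNAL_IP INTERNAL_IP STAGE STATE UPTIME TTL' \
  '1 c3275f84 11 host - 8.32.0 203.0.113.11 10.0.0.11 d - 03:50:12 +∞' \
  | all_nodes_in_stage_d && echo "Installation finished"
```

Counting the column from the right keeps the check independent of the tree-drawing characters that prefix some of the lines.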
Errors during installation
If an error occurs during the installation process, the process is automatically interrupted and the installation is rolled back. The last `INFO` message on the screen provides information about the error. For example:
INFO[2023-09-02 09:22:44]: Extracting packages...
cp: cannot create regular file '/var/lib/ccc/bin/': No such file or directory
In this example, an error occurred during the package extraction step. When you have located and resolved the error, run the installation again.
If you need help with the installation process, create a support case.
Post-installation
When the installation has completed, you can connect to the database using a SQL client, access the nodes over SSH, and use the different administrative interfaces to carry out additional configuration tasks.
Upload a license
The new Exasol system is installed with a license that allows you to load 10 GiB of raw data for testing purposes. For larger data sizes, you must upload a license to the database.
For information about how to upload a license, see Upload a License.
Connect to Exasol
Once the database is up and running you can connect to it using a database client and start to load data. To create the connection, use the following details:
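Setting | Value |
---|---|
Hostname | Comma-separated list of public IP addresses of the active database nodes. For example: `203.0.113.11,203.0.113.12,203.0.113.13`. The public IP addresses of the nodes are shown in the EXTERNAL_IP column of the `c4 ps` output. |
Port | Default: `8563` |
Username | `sys` |
Password | Value of the `CCC_PLAY_DB_PASSWORD` parameter. Default: `aX1234567` |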
If the connection uses a TLS certificate and a valid certification path is not found, you may have to provide the certificate fingerprint. For more information, refer to the documentation for the database client.
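The comma-separated host list for the connection can be built from the public IP addresses of the active nodes. The following is a minimal sketch: the function name `join_hosts` is hypothetical, and it assumes that the first three example addresses belong to the active nodes and the fourth to the reserve node.

```shell
#!/usr/bin/env bash
# Join all arguments with commas. The subshell body keeps the IFS change local.
join_hosts() (
  IFS=,
  echo "$*"
)

# Assumption: the first three example addresses are the active nodes.
join_hosts 203.0.113.11 203.0.113.12 203.0.113.13
```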
For more information about how to connect to Exasol and load data, see the following sections:
Configure backups
Backups can be scheduled or created manually. Backups are run in the background while the database is running, and the process normally has a minimal impact on database performance. How long the backup process takes depends on several factors, such as the size of the database and the location of the archive volume.
You can only create a backup of the entire database, not of individual schemas or tables. The backup contains the consistent state of the database at the time when the backup was started, and it includes only completed transactions that were committed at that time.
Backups are stored on archive volumes in a compressed format. The archive volumes can be configured locally within the cluster (local archive), or on a location outside of the cluster (remote archive).
The following example explains how to create a local archive volume and a backup schedule.
The following examples use ConfD through the command-line tool confd_client, which is available on all database nodes. For more information, see ConfD.
Create a local archive volume
- On one of the nodes in the deployment, use `c4 connect -t <DEPLOYMENT>[.<NODE>]/cos` to connect to EXAClusterOS (COS). For example:
  In most cases it does not matter which node you connect to. If you do not specify a node, c4 will connect to the first active node in the deployment. The command prompt in COS indicates which node you are connected to. For example, if you are connected as root to node 11:
  For more information about how to use `c4 connect`, see How to use c4.
- To create a local archive volume, use the ConfD job st_volume_create with the parameters described in the following table.
  Some parameter values for the new archive volume must match the corresponding values for the data volume. To find out the values used by the data volume, use the ConfD jobs db_info and st_node_list.
  Example output from st_node_list:
confd_client st_node_list
'0':
hdd_free_space:
disk1: 93.5566 GiB
hdds:
name: /dev/nvme2n1
type: disk1
...
name: /dev/nvme3n1
type: disk1
...
name: n11
...
'1':
hdd_free_space:
disk1: 93.5566 GiB
hdds:
name: /dev/nvme2n1
type: disk1
...
name: /dev/nvme3n1
type: disk1
...
name: n12
...
'2':
hdd_free_space:
disk1: 93.5566 GiB
hdds:
name: /dev/nvme2n1
type: disk1
...
name: /dev/nvme3n1
type: disk1
...
name: n13
...
'3':
hdd_free_space:
disk1: 93.5566 GiB
hdds:
name: /dev/nvme2n1
type: disk1
...
name: /dev/nvme3n1
type: disk1
...
name: n14
...
Required parameters
Parameter name | Data type | Value |
---|---|---|
`disk` | string | The disk name in Exasol for the storage disk where the data volume resides. |
`owner` | tuple, list | Owner tuple (or list of tuples) for the data volume. |
`nodes` | list | List of node IDs in the data volume. |
`num_master_nodes` | integer | The number of master nodes (active nodes) in the data volume. |
`name` | string | A name for the new archive volume. |
`redundancy` | integer | The redundancy level of the archive volume. |
`size` | string | Volume size for the archive volume as a string, with unit (MiB, GiB, or TiB). The size value depends on the database size and the backup schedule. |
`partition_size` | string, integer | `4294967296` for volumes < 250 GiB; `34359738368` for volumes ≥ 250 GiB and < 1 TiB; `274877906944` for volumes ≥ 1 TiB |
`shared` | boolean | `true` |
`type` | string | `archive` |
`block_size` | string | `512 KiB` |
`stripe_size` | string | `512 KiB` |
Master nodes
The parameter `num_master_nodes` defines the number of master nodes that the volume will use. The number of master nodes must match the number of active nodes in the cluster. For example, in a cluster with 3 active nodes and 1 reserve node (3+1), the number of master nodes is 3.
In the following example, we create a 1 TiB archive volume with the name `LocalArchiveVolume1` on the storage disk `disk1` with 3 master nodes and redundancy 2. The command returns the volume ID for the new volume (`vid: 3`).
confd_client st_volume_create name: LocalArchiveVolume1 disk: disk1 type: archive size: "1 TiB" num_master_nodes: 3 nodes: [11, 12, 13] redundancy: 2 partition_size: 274877906944 shared: true owner: [500,500]
# ConfD returns the volume ID of the new archive volume:
vid: 3
The ConfD job st_volume_create does not necessarily use the specified `size`, but does internal rounding. To check the actual size of the archive volume after creation to see if it is acceptable, use the ConfD job st_volume_info. If the rounding takes up too much space, contact Support.
- To create the backup schedule, use the ConfD job db_backup_add_schedule.
  If a local archive volume runs out of free space, expired backups will be automatically deleted. Expired backups on remote archive volumes are not deleted by this function.
  A common backup schedule is a weekly backup with an expiration of 10 days, and incremental backups on the first 6 days of the week with an expiration time of 3 days. To set up this configuration, create two backup schedules. For example:
For more information about how to create archive volumes, see Create Local Archive Volume and Create Remote Archive Volume.
For more information about backups, see Backup and Restore.
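The `partition_size` thresholds listed in the parameter table for st_volume_create can be expressed as a small helper. The following is a sketch with a hypothetical function name, `partition_size_for`, that takes the planned archive volume size in whole GiB and prints the matching value from the table (the three values correspond to 4 GiB, 32 GiB, and 256 GiB in bytes).

```shell
#!/usr/bin/env bash
# partition_size_for SIZE_IN_GIB
# Prints the partition_size value from the parameter table for a volume of that size.
partition_size_for() {
  size_gib="$1"
  if [ "$size_gib" -lt 250 ]; then
    echo 4294967296        # volumes < 250 GiB
  elif [ "$size_gib" -lt 1024 ]; then
    echo 34359738368       # volumes >= 250 GiB and < 1 TiB
  else
    echo 274877906944      # volumes >= 1 TiB
  fi
}

# The 1 TiB (1024 GiB) archive volume from the example above:
partition_size_for 1024
```

For the 1 TiB example volume the function prints `274877906944`, the value used in the st_volume_create example.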
Additional administrative tasks
You can use the Administration API and ConfD to carry out additional administrative tasks after the installation, such as setting up a backup schedule and managing access.
For more information, see the respective topics in the Administration section.