Install Exasol 8 - step by step

This section explains how to install Exasol 8 software on Linux hosts using Exasol Deployment Tool (c4).

Overview

The installation procedure can be broken down into three major stages: the Preparation stage, where you prepare the installation environment, the Installation stage, where you download the deployment tool and run the installation on the hosts, and the Post-Installation stage, where you can connect to the cluster, upload a license, and carry out any other additional actions to prepare the database for use.

The following diagram is a schematic overview of the installation procedure.

[Diagram: installation workflow]

You can run the installation process from a separate Linux system (jump host) or from one of the database host systems. In both cases the host running the installation will require access to the other hosts over SSH.

If you install Exasol 8 on a single host system without using a jump host, SSH is not required for the installation process. We recommend setting up SSH anyway, as this enables passwordless login and makes it easier to extend the installation at a later stage by adding more hosts.

Prerequisites

The database hosts must be pre-installed with one of the recommended Linux distributions and meet the minimum system requirements for an Exasol installation. The following requirements must be checked:

  • The nodes must run on a supported CPU type

  • Storage devices must be of a supported type with RAID configuration as required

  • Each node must have enough storage space for installation and updates

  • Each node must have at least two network interfaces for the public and private IPv4 networks

  • The nodes must have a supported operating system version installed and configured as required

  • All software dependencies must be fulfilled for the database nodes (and optional jump host)

For more details about the requirements for installing Exasol 8, see System Requirements.

Preparation

Step 1: Configure network settings

An Exasol cluster is normally configured with two networks using separate physical interfaces: a private network for internal communication between the nodes in the cluster, and a public network that allows connections to the database from outside of the cluster. In a development scenario, you can use the same IP addresses for both networks, although this reduces performance. For a production environment and for performance testing, always use separate networks.

IP address spacing

To enable parallel IMPORT and EXPORT operations, the database hosts must be assigned consecutive and evenly spaced static IPv4 addresses in the same subnet. For example: 198.51.100.11, 198.51.100.12, 198.51.100.13, 198.51.100.14.
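The spacing rule can be checked before installation with a few lines of shell. The following is a minimal sketch, not part of the Exasol tooling: the function names ip_to_int and check_ip_spacing are hypothetical, and the check only covers ascending order and constant spacing. It does not verify that the addresses are in the same subnet.

```shell
# Convert a dotted IPv4 address to a single integer.
ip_to_int() {
  local a b c d
  IFS=. read -r a b c d <<EOF
$1
EOF
  echo $(( ($a << 24) + ($b << 16) + ($c << 8) + $d ))
}

# Return 0 if the addresses are strictly ascending with a constant step.
check_ip_spacing() {
  local prev cur diff step ip
  prev=$(ip_to_int "$1")
  shift
  step=""
  for ip in "$@"; do
    cur=$(ip_to_int "$ip")
    diff=$(( $cur - $prev ))
    # A duplicate or descending address gives a step of zero or less.
    [ "$diff" -gt 0 ] || return 1
    if [ -z "$step" ]; then
      step=$diff
    elif [ "$diff" -ne "$step" ]; then
      return 1
    fi
    prev=$cur
  done
  return 0
}

if check_ip_spacing 198.51.100.11 198.51.100.12 198.51.100.13 198.51.100.14; then
  echo "addresses are evenly spaced"
else
  echo "addresses are NOT evenly spaced"
fi
```

A list that contains the same address twice, or that skips an address, makes check_ip_spacing return a non-zero status.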

Internet connection

An internet connection from the jump host is optional but recommended. If the host running the installation is not connected to the internet, the software must be downloaded using another system and copied to the installation host. Internet connectivity from the database hosts is not required for the installation.

Firewall configuration

To operate the Exasol database after installation, traffic must be allowed through the firewall on the following ports:

Service                                   Default port   Protocol
SQL client connections to the database    8563           TCP
SSH access to all cluster nodes           20002          TCP
HTTPS access to the Administration API    4444           TCP
NTP                                       123            TCP/UDP
DNS                                       53             TCP/UDP
LDAP (optional)                           389            TCP/UDP

For more information about the default ports used by Exasol, see Firewall and Port Settings.
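Once the database is running, you may want to confirm from a client machine that the three TCP service ports listed above are reachable through the firewall. The sketch below is one possible way to do this and is not an Exasol tool. It assumes that bash and the coreutils timeout command are available on the client, and the function names port_open and check_exasol_ports are hypothetical.

```shell
# Try to open a TCP connection to host $1 on port $2 (uses bash's /dev/tcp).
port_open() {
  timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Probe the SQL (8563), SSH (20002), and Administration API (4444) ports.
# Set PROBE to another command name to replace the probe, e.g. for testing.
check_exasol_ports() {
  local host=$1 port status=0
  for port in 8563 20002 4444; do
    if "${PROBE:-port_open}" "$host" "$port"; then
      echo "$port open"
    else
      echo "$port closed"
      status=1
    fi
  done
  return "$status"
}

# Usage on a client machine (replace the address with one of your nodes):
#   check_exasol_ports 198.51.100.11
```

The function returns a non-zero status if any of the three ports cannot be reached. NTP, DNS, and LDAP are not probed here because they are connections from the nodes to other servers.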

Step 2: Create the installation user

A dedicated user for the Exasol software installation must exist on all the database hosts. The user must have sudo privileges and a system shell that allows access over SSH.

In this example, the installation user is created by a user that has sudo privileges. If you are carrying out this operation as root, the sudo command should be omitted.

The name of the installation user can be set freely within the restrictions of the operating system. In this example, the installation user is called exasol.

  1. Create the user exasol with the home directory /home/exasol:

    sudo adduser -m exasol
  2. Add the user to the sudo group (on RHEL-based distributions, the equivalent group is typically named wheel):

    sudo usermod -aG sudo exasol
  3. Assign a password for the user:

    sudo passwd exasol
  4. Log out (or create a new session) and log in as the user exasol.

  5. Verify that the user has sudo privileges using sudo whoami. The command should return root.

    sudo whoami
    root

Step 3: Set up SSH authentication

If you install Exasol 8 on a single host system without using a jump host, SSH is not required for the installation process.

Generate an SSH key pair

  1. Open a terminal or command prompt on the host that will run the installation.

  2. Run the command ssh-keygen -t rsa to generate a new RSA key pair.

  3. Choose a location to save the key pair (the default location is normally ~/.ssh/id_rsa).

For added security you can provide a passphrase for the key pair.

Copy the public key to the remote hosts

  1. Run the command ssh-copy-id <username>@<remote-node-ip>. Replace <username> and <remote-node-ip> with the actual username of the installation user and the IP address of the remote host.

    Enter the password for the user on the remote host when prompted.

  2. Repeat this step for each host by substituting <remote-node-ip> with the IP address of the respective host. The user should be the same on all hosts.

Test the key-based authentication

  1. Run the command ssh <username>@<remote-node-ip> to initiate an SSH connection to a remote host.

    If everything is correctly configured, you should be able to log in without entering a password.

  2. Repeat this step for each remote host to verify that all nodes can be accessed during the installation.
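With more than a few hosts, the login test above can be run in a loop. The following sketch uses the OpenSSH option BatchMode=yes, which makes ssh fail instead of prompting for a password, so a host without working key-based authentication is reported as FAIL. The function name check_ssh_hosts and the SSH_BIN override are hypothetical helpers, not part of c4.

```shell
# Attempt a non-interactive login as user $1 on every remaining argument.
check_ssh_hosts() {
  local user=$1 host failed=0
  shift
  for host in "$@"; do
    # "true" is run on the remote host, so a successful login exits at once.
    if "${SSH_BIN:-ssh}" -o BatchMode=yes -o ConnectTimeout=5 "${user}@${host}" true; then
      echo "OK   ${host}"
    else
      echo "FAIL ${host}"
      failed=1
    fi
  done
  return "$failed"
}

# Usage (installation user and node addresses are examples):
#   check_ssh_hosts exasol 10.0.0.11 10.0.0.12 10.0.0.13
```

If the private key is protected by a passphrase, load it into ssh-agent first. Otherwise BatchMode prevents the passphrase prompt and every host is reported as FAIL.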

Step 4: Configure storage

Block storage

The installation process will automatically create the necessary volumes on the storage devices that you have defined in the configuration. You can create additional volumes as needed after installation. For more information, see Storage Management.

Do not create a file system on the storage device. The database stores persistent data using block storage with a specific structure for Exasol databases.

Do not use the device where the operating system resides for database storage, as this could potentially lead to a system crash if the device runs out of disk space.

The naming and order of the block storage devices must be identical on all nodes.

Supported storage types/technologies

  • Sparse file devices hosted on a filesystem like ext4 or XFS (NFS is not supported)
  • Block devices (local storage SAS, SSD, NVMe, virtual disks, or remote storage iSCSI/SAN)
  • LVM2
  • LUKS

Storage device requirements

  • Use at least 4 storage drives with a minimum of 250 MB/s read/write throughput per drive.

    Actual performance depends on the number of disks used as well as the speed of the individual disks.

  • OS and storage disks should have RAID 1 (or similar fault tolerance).

  • OS disks must have at least 150 GiB free disk space after installation.

  • Swap partition - use the size recommended by the OS vendor.

For information about how to calculate the required size for the storage devices, see Sizing Considerations.

Installation directory

Exasol is installed in the home directory of the installation user on each database node. For example, if the username is exasol, Exasol is installed under /home/exasol/.

The partition where the home directory of the installation user is mounted must have at least 20 GiB free space available for the installation.
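The free-space requirement can be checked on each node before starting the installation. This is a minimal sketch that uses the POSIX output format of df. The function names free_gib and has_free_gib are hypothetical, and the 20 GiB threshold is the value stated above.

```shell
# Print the free space, in whole GiB, of the filesystem that holds path $1.
free_gib() {
  # df -Pk reports available space in 1 KiB blocks in the fourth column.
  df -Pk "$1" | awk 'NR == 2 { print int($4 / 1048576) }'
}

# Return 0 if path $1 has at least $2 GiB of free space.
has_free_gib() {
  local avail
  avail=$(free_gib "$1")
  [ "$avail" -ge "$2" ]
}

if has_free_gib "${HOME:-/}" 20; then
  echo "enough free space for the installation"
else
  echo "less than 20 GiB free in ${HOME:-/}"
fi
```

Run the check as the installation user so that the home directory being tested is the one where Exasol will be installed.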

Logical volume manager

We recommend using a logical volume manager to manage the storage devices used for block storage. To check if an existing system is using a logical volume manager, you can use the command lsblk to list all block devices. If the output shows devices that are named sd* (sda, sdb, sdc, and so on), the system is not using a logical volume manager.

For example:

# this system is not using a volume manager:
user@host:~$ lsblk -p
NAME     MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
/dev/sda   8:0    0 388.5M  1 disk
/dev/sdb   8:16   0     4G  0 disk [SWAP]
/dev/sdc   8:32   0   256G  0 disk /mnt/wslg/distro

# this system is using a logical volume manager:
user@host:~$ lsblk -p
NAME              MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
/dev/loop1          7:1    0 91.9M  1 loop
/dev/nvme0n1      259:0    0  100G  0 disk
|-/dev/nvme0n1p1  259:4    0 99.9G  0 part /etc/ssh
|-/dev/nvme0n1p14 259:5    0    4M  0 part
`-/dev/nvme0n1p15 259:6    0  106M  0 part
/dev/nvme3n1      259:1    0   50G  0 disk
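Device names alone can be ambiguous, so it may help to also look at the TYPE column of lsblk, which reports logical volumes managed by LVM2 as lvm. The following sketch filters that column. The function name list_lvm_volumes and the sample device names are hypothetical.

```shell
# Read "NAME TYPE" pairs from stdin, as printed by `lsblk -rno NAME,TYPE`,
# and print the names of the devices whose type is "lvm".
list_lvm_volumes() {
  awk '$2 == "lvm" { print $1 }'
}

# On a real host:
#   lsblk -rno NAME,TYPE | list_lvm_volumes
# Demonstration with sample input:
printf '%s\n' 'sda disk' 'sda1 part' 'vg_exa-disk1 lvm' | list_lvm_volumes
```

If the command prints nothing on a host, no LVM logical volumes are active on that system.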

Installation

Step 5: Download c4

Exasol Deployment Tool (c4) is a command-line application that is used to install Exasol on a single host or multiple hosts in a network. The c4 application can run either on a separate system (jump host) or on one of the database hosts.

Database and c4 version dependency

The version of c4 used for installation must be compatible with the Exasol database version that you install. The following table describes the version of c4 required to install a specific Exasol database version:

Exasol database version   Required c4 version
8.28.2                    4.17.4
8.27.0                    4.16.0
8.26.1                    4.15.0
8.26.0                    4.15.0
8.25.0                    4.14.1
8.24.0                    4.13.0
8.23.4                    0.4.12
8.22.0                    0.4.11
8.21.0                    0.4.11
8.20.0                    0.4.10
8.18.1                    0.4.10
8.18.0                    0.4.10
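If you script the download, the version table can be encoded as a small lookup so that a mismatch is caught early. This is only a sketch: the function name c4_version_for is hypothetical, and the mapping contains exactly the versions listed in the table, so it must be extended for other releases.

```shell
# Print the c4 version required for Exasol database version $1.
c4_version_for() {
  case "$1" in
    8.28.2)                echo 4.17.4 ;;
    8.27.0)                echo 4.16.0 ;;
    8.26.0|8.26.1)         echo 4.15.0 ;;
    8.25.0)                echo 4.14.1 ;;
    8.24.0)                echo 4.13.0 ;;
    8.23.4)                echo 0.4.12 ;;
    8.21.0|8.22.0)         echo 0.4.11 ;;
    8.18.0|8.18.1|8.20.0)  echo 0.4.10 ;;
    *) echo "no c4 version listed for Exasol $1" >&2; return 1 ;;
  esac
}

c4_version_for 8.23.4   # prints 0.4.12
```

For a version that is not in the table, the function prints a message to standard error and returns a non-zero status.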

Install c4 on a host with internet access

If the host used to run the installation is connected to the internet:

  • Download c4 from the Exasol Download Portal or using the command line. For more information, see Install c4.

  • The Exasol 8 installation package will be downloaded by c4 during the installation process.

Install c4 on a host without internet access

If the host used to run the installation is not connected to the internet, download the resources using another machine and then copy them to the installation host.

  1. Download c4 from the Exasol Download Portal or using the command line. For more information, see Install c4.

  2. Download the latest version of Exasol 8 from the Exasol Download Portal.

  3. Copy c4 and the Exasol 8 installation package to the same directory on the host that you will use to run the installation.

You must make the c4 binary executable for all users on the host by using chmod +x c4. Otherwise the application will not be able to run the installation. For more information, see Install c4.

Step 6: Create a c4 configuration

On the host used to run the installation, create a file with the filename config in the same directory as the c4 binary. The configuration file should define the following parameters:

Each entry lists the configuration parameter, its description, and its default value.

CCC_HOST_ADDRS
    Description: The IP addresses of the database hosts, separated by spaces. The number of addresses in this parameter effectively determines the number of nodes that will be installed.
    Default value: [empty]

CCC_HOST_EXTERNAL_ADDRS
    Description: The public IP addresses of the database hosts, separated by spaces. This parameter is optional and only required if you do not have routing to the internal IP addresses specified in CCC_HOST_ADDRS.
    Default value: [empty]

CCC_HOST_DATADISK
    Description: Comma-separated list of block devices to be used for the data volume. If not specified, limited file-based storage is used. The devices used should have persistent block device names. Exasol recommends using volume management with LVM2. See also System Requirements.
    Default value: [empty]

CCC_HOST_IMAGE_USER
    Description: Username that will be used to log in to the SSH instances. The user must have sudo privileges on the instances.
    Default value: [empty]

CCC_HOST_IMAGE_PASSWORD
    Description: Password for the user if required for sudo. The password is passed in plaintext to the instances.
    Default value: [empty]

CCC_HOST_KEY_PAIR_FILE
    Description: Name of the file that contains the private SSH key required to access host instances.
    Default value: [empty]

CCC_PLAY_WORKING_COPY
    Description: Specifies the Exasol package to install, using the format @exasol-<version>. For example: @exasol-8.22.0
    Default value: [empty]

CCC_PLAY_DB_PASSWORD
    Description: Password for Exasol database authentication (user: sys).
    Default value: aX1234567

The username for the installation user can be any name allowed by the operating system. In the following examples, the user has the username exasol.

Example:
CCC_HOST_ADDRS="10.0.0.11 10.0.0.12 10.0.0.13"
CCC_HOST_EXTERNAL_ADDRS="198.51.100.11 198.51.100.12 198.51.100.13"
CCC_HOST_DATADISK=/dev/mapper/exasol_disk_1,/dev/mapper/exasol_disk_2
CCC_HOST_IMAGE_USER=exasol
CCC_HOST_IMAGE_PASSWORD=exasol123
CCC_HOST_KEY_PAIR_FILE=id_rsa
CCC_PLAY_WORKING_COPY=@exasol-8.23.4
CCC_PLAY_DB_PASSWORD=exasol456

Always replace the default passwords by setting unique, secure passwords in your configuration file. Never use the passwords that are used in the examples in the documentation.
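One way to avoid reusing example passwords is to generate the configuration file from a script. The sketch below writes the parameters described above, takes the sudo password of the installation user as an argument, and generates a random password for the database user sys. The function names random_password and write_c4_config are hypothetical, the addresses and device names are the example values from this section, and the alphanumeric password format is an assumption about what the database accepts. Restricting the file to mode 600 is general hygiene for a file that holds plaintext passwords and is not a documented c4 requirement.

```shell
# Print 20 random alphanumeric characters from the kernel random source.
random_password() {
  LC_ALL=C tr -dc 'A-Za-z0-9' < /dev/urandom | head -c 20
}

# Write a c4 configuration to file $1, using sudo password $2.
write_c4_config() {
  local file=$1 os_password db_password
  os_password=${2:?sudo password of the installation user is required}
  db_password=$(random_password)
  cat > "$file" <<EOF
CCC_HOST_ADDRS="10.0.0.11 10.0.0.12 10.0.0.13"
CCC_HOST_DATADISK=/dev/mapper/exasol_disk_1,/dev/mapper/exasol_disk_2
CCC_HOST_IMAGE_USER=exasol
CCC_HOST_IMAGE_PASSWORD=${os_password}
CCC_HOST_KEY_PAIR_FILE=id_rsa
CCC_PLAY_WORKING_COPY=@exasol-8.23.4
CCC_PLAY_DB_PASSWORD=${db_password}
EOF
  # The file contains plaintext passwords: make it readable by the owner only.
  chmod 600 "$file"
}

# Usage:
#   write_c4_config ./config 'sudo-password-of-the-installation-user'
```

The generated database password is stored only in the configuration file, so record it in your password manager before removing the file.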

Diagnostic tool

Before you start the installation you can run a diagnostic tool on your configuration. This may allow you to detect some issues before starting the installation. The diagnostic tool checks, for example, SSH access to the hosts, whether the sudo password is correct (if CCC_HOST_IMAGE_PASSWORD is set), and whether any required parameters are missing.

To run the diagnostic tool, use c4 host diag -i <path to configuration>. For example:

./c4 host diag -i /path_to_config_file/myconfig
OK check_disks
OK check_external_dependencies
OK check_internal_dependencies
OK check_required_params
OK check_sudo

For more information about the diagnostic tool, use c4 host diag --help.
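In an automated setup, the diagnostic output can be used as a gate before the deployment. The sketch below treats every non-empty line that does not start with OK as a failed check. This is an assumption based on the sample output above, since the format of failing checks is not shown here. The function name diag_failures and the FAIL line in the usage note are hypothetical.

```shell
# Read diagnostic output on stdin, print every non-empty line whose first
# word is not "OK", and return non-zero if at least one such line was found.
diag_failures() {
  awk 'NF && $1 != "OK" { print; bad = 1 } END { exit bad }'
}

# On the installation host:
#   ./c4 host diag -i config | diag_failures && ./c4 host play -i config
# Demonstration with sample output:
printf '%s\n' 'OK check_disks' 'OK check_sudo' | diag_failures \
  && echo "all checks passed"
```

With an input line such as "FAIL check_sudo", the function would print that line and return a non-zero status, so the deployment command after && would not run.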

Step 7: Deploy to hosts

  1. To start the installation, run:

    ./c4 host play -i config

    The -i option tells c4 to use a specific configuration file. By default, c4 reads the configuration from the configuration file ./config in the current directory. If the configuration is stored in another file, specify the path to this file as an argument on the command line:

    ./c4 host play -i /path_to_config_file/myconfig
  2. If the configuration is valid, c4 will show the parameter values that will be used and ask you to either proceed with this configuration or cancel the installation.

    |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

      Exasol installation procedure is about to be started.
      During this procedure, Exasol software will be installed to remote hosts.
      It will take some time (several minutes).
      The installation is finished when every node reaches stage 'd' (see 'c4 ps').
      After the installation is finished, you can connect to the Database or COS.

      During the installation, you can login to the hosts via SSH,
      and watch the process using:

        sudo journalctl -f

      After the installation finished, you can connect to COS using:

        ssh -p 20002 root@$IP

      IP addresses of the systems:

        * 198.51.100.11
        * 198.51.100.12
        * 198.51.100.13

      Exasol version: 8.23.4
      Exasol package: @exasol-8.23.4
      SSH username  : exasol
      SSH keyfile   : id_rsa
      User password : exasol123
      Data disk(s)  : /dev/mapper/exasol_disk_1,/dev/mapper/exasol_disk_2

      Press ENTER to proceed or Ctrl-C to cancel the installation procedure.

    |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
  3. Press Enter to start the installation.

    • If the necessary installation packages are present in the current directory they will be used for the installation. In this case no internet connection is required.
    • If the installation packages are not found in the current directory and the host used to run the installation is connected to the internet, c4 will automatically download the necessary packages from the Exasol download portal.
    • If no installation packages are found and c4 is not able to connect to the Exasol download portal, the installation will fail.

    If the packages are available, the installation process will start. The installation requires no user intervention and can be run unattended. It comprises the following steps:

    • Copying Exasol packages to the hosts
    • Initial OS preparation
    • Verifying OS configuration
    • Configuring OS on the hosts
    • Extracting packages
    • Installing c4 on the hosts
    • Syncing time between hosts
    • Triggering remote installation finalization

    The installation typically takes 20 to 90 minutes to complete, depending on factors such as the number of hosts and the location of the installation files. In some environments it may take longer.

  4. When the installation has finished, a confirmation message is shown:

    |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

      The final steps of the Exasol installation procedure were successfully
      started on remote hosts now.
      It will take yet some time to complete (several minutes).
      After the installation is finished, you can connect to the Database or COS.

      During the installation, you can login to the hosts via SSH,
      and watch the process using:

        sudo journalctl -f

      After the installation finished, you can connect to COS using:

        ssh -p 20002 root@$IP

      IP addresses of the systems:

        * 198.51.100.11
        * 198.51.100.12
        * 198.51.100.13

      Exasol version: 8.23.4
      Exasol package: @exasol-8.23.4

      Happy Exasolling!

    |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

Monitoring the installation process

You can use the c4 command c4 ps to monitor the status of the deployment. The output shows details about each node, including the current deployment stage and state. The deployment is finished when all database nodes have reached stage d and are in the running state.

For more information about the c4 ps command, use c4 ps --help.

$ c4 ps
      
      N  PLAY_ID   NODE  MEDIUM  INSTANCE     DB_VERSION  EXTERNAL_IP     INTERNAL_IP  STAGE  STATE    UPTIME    TTL  FEATURES
  ┌─  1  c3275f84  11    awscf   r5d.large    8.27.0      203.0.113.11    10.0.0.11    d      running  03:50:12  +∞   ♻️
  │   1  c3275f84  12    awscf   r5d.large    8.27.0      203.0.113.12    10.0.0.12    d      running  03:50:13  +∞   ♻️
  └─  1  c3275f84  13    awscf   r5d.large    8.27.0      203.0.113.13    10.0.0.13    d      running  03:50:13  +∞   ♻️

To get more detailed information about the installation process, you can log in to a database node over SSH and use the command sudo journalctl -f to monitor the installation in real time.
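To wait for the deployment to finish without watching the terminal, the output of c4 ps can be polled in a loop. The sketch below assumes that the output keeps the layout shown above, with the STAGE column directly followed by the STATE column. The function name all_nodes_ready is hypothetical.

```shell
# Read `c4 ps` output on stdin. Return 0 only if there is at least one node
# row and every node row shows stage "d" directly followed by "running".
all_nodes_ready() {
  awk '
    /PLAY_ID/ || !NF { next }                    # skip header and blank lines
    { rows++ }
    !/[ \t]d[ \t]+running([ \t]|$)/ { pending++ }
    END { exit !(rows > 0 && pending == 0) }
  '
}

# Usage on the installation host (polls every 30 seconds):
#   until ./c4 ps | all_nodes_ready; do sleep 30; done
#   echo "all nodes have reached stage d and are running"
```

Because the check relies on the textual layout of the output, verify it against the output of your c4 version before using it in automation.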

Errors during installation

If an error occurs during the installation process, the process is automatically interrupted and the installation is rolled back. The last INFO message on the screen provides information about the error. For example:

INFO[2023-09-02 09:22:44]: Extracting packages...
cp: cannot create regular file '/var/lib/ccc/bin/': No such file or directory

In this example, an error occurred during the package extraction step. When you have located and resolved the error, run the installation again.

If you need help with the installation process, create a support case.

Post-installation

When the installation has completed, you can connect to the database using a SQL client, access the nodes over SSH, and use the different administrative interfaces to carry out additional configuration tasks.

Upload a license

The new Exasol system is installed with a license that allows you to load 10 GB of raw data for testing purposes. For larger data sizes, you must upload a license to the database.

For information about how to upload a license, see Upload a License.

Connect to Exasol

Once the database is up and running, you can connect to it using a SQL client and start to load data.


Configure backups

Backups can be scheduled or created manually. Backups are run in the background while the database is running, and the process normally has a minimal impact on database performance. How long the backup process takes depends on several factors, such as the size of the database and the location of the archive volume.

You can only create a backup of the entire database, not of individual schemas or tables. The backup contains the consistent state of the database at the time when the backup was started, and it includes only completed transactions that were committed at that time.

Backups are stored on archive volumes in a compressed format. The archive volumes can be configured locally in the cluster (local archive), or on a remote location (remote archive).

The following example explains how to create a local archive volume and a backup schedule.

The following examples use ConfD through the command-line tool confd_client, which is available on all database nodes. For more information, see ConfD.

Create a local archive volume

  1. Connect to EXAClusterOS (COS) using c4 connect -t <DEPLOYMENT>[.<NODE>]/cos. For example:

    c4 connect -t 1/cos

    In most cases it does not matter on which node you access ConfD. If you do not specify a node, c4 will connect to the first active node in the deployment. The command prompt in COS indicates which node you are connected to:

    [root@n11 ~]#

    For more information about how to use c4 connect, see How to use c4.
  2. To create a local archive volume, use the ConfD job st_volume_create with the parameters described in the following table.

    Some parameter values for the new archive volume must match the corresponding values for the data volume. To find out the values used by the data volume, use the ConfD jobs db_info and st_node_list.

    Required parameters

    Parameter name     Data type         Value
    disk               string            The disk name in Exasol for the storage disk where the data volume resides.
    owner              tuple, list       Owner tuple (or list of tuples) for the data volume.
    nodes              list              List of node IDs in the data volume.
    num_master_nodes   integer           The number of master nodes (active nodes) in the data volume.

    name               string            A name for the new archive volume.
    redundancy         integer           The redundancy level of the archive volume.
    size               string            Volume size for the archive volume as a string, with unit (MiB, GiB, or TiB). The size value depends on the database size and the backup schedule.
    partition_size     string, integer   4294967296 for volumes < 250 GiB; 34359738368 for volumes ≥ 250 GiB and < 1 TiB; 274877906944 for volumes ≥ 1 TiB
    shared             boolean           true
    type               string            archive
    block_size         string            512 KiB
    stripe_size        string            512 KiB

    Master nodes

    The parameter num_master_nodes defines the number of master nodes that the volume will use. The number of master nodes must match the number of active nodes in the cluster. For example: in a cluster that will have 4 active nodes and 1 reserve node (4+1), the number of master nodes is 4.

    In the following example, we create a 1 TiB archive volume with the name LocalArchiveVolume1 on the storage disk disk1 with 4 master nodes and redundancy 2. The command returns the volume ID for the new volume (vid: 3).

    confd_client -c st_volume_create -a '{"name": "LocalArchiveVolume1", "disk": "disk1", "type": "archive", "size": "1 TiB", "num_master_nodes": 4, "nodes": [11, 12, 13, 14], "redundancy": 2, "partition_size": 274877906944, "shared": true, "owner": [500,500]}'
    # ConfD will return the volume ID of the new archive volume:
    vid: 3

    The ConfD job st_volume_create does not necessarily use the specified size, but does internal rounding. Use the ConfD job st_volume_info to check the actual size of the archive volume after creation to see if it is acceptable. If the rounding takes up too much space, contact Support.

  3. To create the backup schedule, use the ConfD job db_backup_add_schedule.

    If a local archive volume runs out of free space, expired backups will be automatically deleted. Expired backups on remote archive volumes will not be deleted by this function.

    A common backup schedule is a weekly full backup that expires after 10 days, combined with incremental backups from Monday to Saturday that expire after 3 days. To set up this configuration, create two backup schedules. For example:

    confd_client -c db_backup_add_schedule -a '{db_name: DATABASE_NAME, backup_name: weekly_full_backup, backup_volume_name: VOLUME_NAME, enabled: True, level: 0, expire: "1w 3d",  minute: "0", hour: "0", day: "*", month: "*", weekday: "0"}'
    confd_client -c db_backup_add_schedule -a '{db_name: DATABASE_NAME, backup_name: daily_incremental, backup_volume_name: VOLUME_NAME, enabled: True, level: 1, expire: "3d",  minute: "0", hour: "0", day: "*", month: "*", weekday: "1,2,3,4,5,6"}'
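The partition_size values listed for st_volume_create correspond to 4 GiB, 32 GiB, and 256 GiB expressed in bytes. When archive volumes are created from a script, the value can be derived from the planned volume size. This is a minimal sketch and the function name partition_size_for_gib is hypothetical.

```shell
# Print the partition_size (in bytes) for an archive volume of $1 GiB,
# following the thresholds given for st_volume_create.
partition_size_for_gib() {
  local size_gib=$1
  if [ "$size_gib" -lt 250 ]; then
    echo 4294967296        # 4 GiB, for volumes < 250 GiB
  elif [ "$size_gib" -lt 1024 ]; then
    echo 34359738368       # 32 GiB, for volumes >= 250 GiB and < 1 TiB
  else
    echo 274877906944      # 256 GiB, for volumes >= 1 TiB
  fi
}

partition_size_for_gib 1024   # prints 274877906944
```

For the 1 TiB volume in the st_volume_create example, the function prints the same partition_size that is passed to ConfD there.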

For more information about how to create archive volumes, see Create Local Archive Volume and Create Remote Archive Volume.

For more information about backups, see Backup and Restore.

Additional administrative tasks

You can use the Administration API and ConfD to carry out additional administrative tasks after the installation, such as setting up a backup schedule and managing access.

For more information, see the respective topics in the Administration (On-Prem) section.