Stop and Start Nodes

This article explains how to safely stop and start a node in an on-premises installation.

You may need to stop a database node temporarily to perform maintenance. Before you stop a node, you must stop the database and the Exasol services. When you start the node again, the Exasol services and the database should start automatically.

Exasol 8 does not include out-of-band management functionality for restarting a host when the OS is not reachable. For information on how to set up an IPMI service such as iDRAC, iLO, or ILOM, refer to the documentation for the respective platform.

Prerequisites

The user performing this procedure must be either root or a user with sudo access.

Procedure

The following examples use the command-line tool confd_client, which is available on all database nodes. For more information, see ConfD.

Placeholder values are indicated with UPPERCASE characters. Replace the placeholders with your own values.

Connect to COS

Connect to EXAClusterOS (COS) on the cluster using c4 connect -t <DEPLOYMENT>[.<NODE>]/cos. For example:

./c4 connect -t 1.11/cos

If you do not specify a node, c4 will connect to the first active node in the deployment.

For more information about how to use c4 connect, see How to use c4.

Stop a node

  1. Stop the database using the ConfD job db_stop:

    confd_client db_stop db_name: DB_NAME

    To avoid data loss, always stop the database before stopping a node. Otherwise, the database may be left in a corrupted state.

  2. Stop the c4 services on the node using the systemctl stop command. If you are carrying out this operation as root, omit the sudo command.

    sudo systemctl stop c4_cloud_command
    sudo systemctl stop c4

  3. Verify that the services have stopped using the systemctl status command. For example:

    systemctl status c4
    ● c4.service - c4 server
         Loaded: loaded (/etc/systemd/system/c4.service; enabled; vendor preset: enabled)
         Active: inactive (dead) since Wed 2024-05-29 18:45:57 UTC; 29s ago
        Process: 26651 ExecStart=/var/lib/ccc/etc/c4 (code=killed, signal=TERM)
       Main PID: 26651 (code=killed, signal=TERM)
       
    systemctl status c4_cloud_command
    ● c4_cloud_command.service - c4 init service
         Loaded: loaded (/etc/systemd/system/c4_cloud_command.service; enabled; vendor preset: enabled)
         Active: inactive (dead) since Wed 2024-05-29 18:46:51 UTC; 5s ago
        Process: 25832 ExecStart=/var/lib/ccc/etc/c4_cloud_command (code=killed, signal=TERM)
    ...

    You can now safely shut down the host system.
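
The service steps above (stop, then verify) can be sketched as a small shell function to run on the node after the database has been stopped. This is a minimal sketch, not part of the Exasol tooling: the function name stop_c4_services is hypothetical, and it assumes that checking the unit state with systemctl is-active is an acceptable substitute for reading the systemctl status output.

```shell
#!/usr/bin/env bash
# Sketch only: stop the c4 services in the documented order and confirm
# that each one is inactive. stop_c4_services is an illustrative name.

stop_c4_services() {
    local prefix=""
    local svc

    # Use sudo only when not already running as root.
    if [ "$(id -u)" -ne 0 ]; then
        prefix="sudo"
    fi

    # Same order as in the procedure: c4_cloud_command first, then c4.
    for svc in c4_cloud_command c4; do
        $prefix systemctl stop "$svc" || return 1
        # is-active exits with 0 only while the unit is active.
        if systemctl is-active --quiet "$svc"; then
            echo "$svc is still active" >&2
            return 1
        fi
        echo "$svc stopped"
    done
}
```

Calling stop_c4_services && echo "safe to shut down" prints the final message only if both services were stopped and neither is still reported as active.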

Start a node

  1. Start the host system. The Exasol services and the database should then start automatically.

  2. To verify that the services have started, you can use systemctl status. For example:

    systemctl status c4
    ● c4.service - c4 server
         Loaded: loaded (/etc/systemd/system/c4.service; enabled; vendor preset: enabled)
         Active: active (running) since Wed 2024-05-29 18:48:51 UTC; 2min 47s ago
    ...

    If the c4 services are not running, start them now using the systemctl start command:

    sudo systemctl start c4
    sudo systemctl start c4_cloud_command

    Verify that the services are running using systemctl status as described above.

  3. Check that the database is running using c4 ps. For example:

    ./c4 ps
          
          N  PLAY_ID   NODE  MEDIUM  INSTANCE     DB_VERSION  EXTERNAL_IP     INTERNAL_IP  STAGE  STATE    UPTIME    TTL
      ┌─  1  c3275f84  11    awscf   r5d.large    8.27.0      203.0.113.11    10.0.0.11    d      running  03:50:12  +∞
      │   1  c3275f84  12    awscf   r5d.large    8.27.0      203.0.113.12    10.0.0.12    d      running  03:50:13  +∞
      │   1  c3275f84  13    awscf   r5d.large    8.27.0      203.0.113.13    10.0.0.13    d      running  03:50:13  +∞
      └─  1  c3275f84  14    awscf   r5d.large    8.27.0      203.0.113.14    10.0.0.14    d      running  03:50:13  +∞

    The database is running when the nodes are in stage d.

  4. If the database is not running, start it now using the ConfD job db_start. For example:

    confd_client db_start db_name: MY_DATABASE
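
The checks in steps 2 and 3 can be sketched as two shell functions. This is a rough sketch under stated assumptions: the names ensure_c4_services and all_nodes_in_stage_d are hypothetical, the first function is meant to run on the node itself, and the second assumes that c4 ps prints its table in the column layout shown in the example above, with STAGE as the fourth column from the right.

```shell
#!/usr/bin/env bash
# Sketch only: both function names are illustrative, not Exasol tooling.

# Run on the node: start any c4 service that is not active after boot.
ensure_c4_services() {
    local prefix=""
    local svc

    # Use sudo only when not already running as root.
    if [ "$(id -u)" -ne 0 ]; then
        prefix="sudo"
    fi

    # Same start order as in the procedure: c4 first, then c4_cloud_command.
    for svc in c4 c4_cloud_command; do
        if systemctl is-active --quiet "$svc"; then
            echo "$svc already running"
        else
            echo "starting $svc"
            $prefix systemctl start "$svc" || return 1
        fi
    done
}

# Run where c4 is available: read `c4 ps` output on stdin and succeed only
# if at least one node row exists and every node row shows stage d.
# Assumes STAGE is the fourth column from the right, as in the example.
all_nodes_in_stage_d() {
    awk '
        NF < 5             { next }   # skip blank or short lines
        $(NF-3) == "STAGE" { next }   # skip the header row
        { rows++; if ($(NF-3) != "d") bad++ }
        END { if (rows > 0 && bad == 0) exit 0; exit 1 }
    '
}
```

For example, ./c4 ps | all_nodes_in_stage_d && echo "all nodes in stage d" prints the message only when every listed node is in stage d. Counting columns from the right avoids depending on the tree-drawing characters at the start of each row.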

If you need to replace a node, you must perform additional steps to add the new node to the deployment. For more information, see Replace Node.