Cluster Architecture

The Exasol database runs under EXACluster OS, which is a Linux-Derivate. EXACluster OS was developed to accomplish, among other things, two major tasks:

  • Ensure the high availability of the whole system
  • Facilitate the administration of the cluster

The most important components:

Exasol DB is a parallel and distributed cluster database.

EXAStorage is a distributed storage engine. All data is stored inside volumes. EXAStorage also provides a failover mechanism.

EXAClusterOS is a cluster operating system. It’s a CentOS based Linux enhanced to provide cluster services and high availability.

EXAoperation is the web interface for administration and monitoring.

Exasol DB Layer

  • Exasol database runs on active nodes (A)
  • Reserve node (R) is not a part of computations and does not store any data. It will be automatically pushed in, in case an active node fails and gets restored from the secondary segments in the background. The reserve node needs to have exactly the same configuration as active nodes
  • Exasol database can use any number of reserve node.
  • The database connection string includes active and reserve nodes

EXAStorage Layer

  • Data will be stored in four different types of EXAStorage volumes
    • Data Volume
    • Local Archive Volume
    • Temporary Volume
    • Remote Archive Volume
  • If redundancy is set, active computation nodes (M) store their data additionally in secondary segments (S), which reside on other nodes
    • Redundancy is supported for Data and Local Archive Volumes
  • Archive Volumes are used to store database backups

For more information on storage and volumes, refer to the Storage Management section.

EXAClusterOS Layer

  • A management node (LS) is required for cluster administration and boot strapping (cold boot) the data nodes (N)
  • Is used to operate and manage the cluster
  • Build to implement cluster services and fail-safety mechanisms
  • Cluster operations need a quorum
  • Applying software updates

Networking Layer

  • Used for sharing cluster vitality information
  • Booting the cluster
  • Database communication
  • EXAStorage communication
  • EXAClusterOS communication

Shared-Nothing Architecture (MPP processing)

Exasol was developed as a parallel system and is constructed according to the shared-nothing principle. Data is distributed across all nodes in a cluster. When responding to queries, all nodes co-operate and special parallel algorithms ensure that most data is processed locally in each individual node's main memory.

When a query is sent to the system, it is first accepted by the node the client is connected to. The query is then distributed to all nodes. Intelligent algorithms optimize the query, determine the best plan of action and generate needed indexes on the fly. The system then processes the partial results based the local datasets. This processing paradigm is also known as SPMD (single program multiple data). All cluster nodes operate on an equal basis, there is no Master Node. The global query result is delivered back to the user through the original connection.

The data stored in this database is symbolized with A,B,C,D to indicate that each node contains a different part of the database data. The active nodes n11-n14 each host database instances that operate on their part of the database locally in an MPP way. These instances communicate and coordinate over the private network.