Database Essentials

This article describes the essentials of Exasol databases and clusters.

Databases and clusters

An Exasol database comprises one or more clusters that handle all query operations. A cluster is a group of servers, each with its own CPUs and main memory (RAM). The size of a cluster defines its compute power, which is determined by the total number of CPUs and the total amount of RAM across its servers. Using multiple clusters makes it possible to separate workloads between teams or tenants and allows for greater concurrency.

Increasing the compute power in a cluster is referred to as vertical scaling. Adding more clusters is referred to as horizontal scaling.
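The two scaling directions can be illustrated with a small model. The following is a minimal sketch for illustration only, not an Exasol API: the class names, the cluster name, and the server sizes are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Server:
    vcpus: int
    ram_gib: int


@dataclass(frozen=True)
class Cluster:
    name: str
    servers: tuple[Server, ...]

    @property
    def total_vcpus(self) -> int:
        # Compute power of a cluster: sum over all of its servers.
        return sum(s.vcpus for s in self.servers)

    @property
    def total_ram_gib(self) -> int:
        return sum(s.ram_gib for s in self.servers)


def scale_vertically(cluster: Cluster, new_size: Server) -> Cluster:
    """Same cluster, same server count, but each server gets a new size."""
    return Cluster(cluster.name, tuple(new_size for _ in cluster.servers))


def scale_horizontally(clusters: list[Cluster], extra: Cluster) -> list[Cluster]:
    """Same clusters as before, plus one additional cluster."""
    return [*clusters, extra]


# Hypothetical sizes, chosen only to make the arithmetic easy to follow.
main = Cluster("MAIN", (Server(16, 128), Server(16, 128)))
bigger = scale_vertically(main, Server(32, 256))
print(bigger.total_vcpus, bigger.total_ram_gib)  # prints: 64 512
```

In this model, vertical scaling changes the totals of one cluster, while horizontal scaling leaves each cluster unchanged and increases the number of clusters in the database.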

The first cluster of a database is called the main cluster, while all other clusters are called worker clusters. The main cluster has a special role: it communicates with all worker clusters and ensures that they have a consistent view of transactions and metadata. Each cluster has a direct connection with the end users and with the central data store, and only metadata is transferred between the clusters.

Worker clusters can only run if the main cluster is running. Stopping the main cluster will also stop all worker clusters in the database.
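The dependency between the main cluster and the worker clusters can be sketched as a small state model. This is an illustrative assumption about how the rules above interact, not Exasol's implementation; the `Database` class and the cluster names are hypothetical.

```python
from __future__ import annotations


class Database:
    """Toy model of the cluster lifecycle rules; not an Exasol API."""

    def __init__(self, main: str, workers: list[str]) -> None:
        self.main = main
        # Every cluster starts in the stopped state.
        self.running: dict[str, bool] = {name: False for name in [main, *workers]}

    def start(self, cluster: str) -> None:
        # A worker cluster can only run while the main cluster is running.
        if cluster != self.main and not self.running[self.main]:
            raise RuntimeError(f"cannot start {cluster}: main cluster is not running")
        self.running[cluster] = True

    def stop(self, cluster: str) -> None:
        if cluster == self.main:
            # Stopping the main cluster stops every cluster in the database.
            for name in self.running:
                self.running[name] = False
        else:
            self.running[cluster] = False


db = Database("MAIN", ["ETL", "BI"])
db.start("MAIN")
db.start("ETL")
db.stop("MAIN")
print(db.running)  # prints: {'MAIN': False, 'ETL': False, 'BI': False}
```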

Object storage

Data and metadata for the database are stored in a central data store such as an S3 bucket, not in the clusters. All clusters in a database access the same underlying data and metadata. Changes that are committed from one cluster are persisted in the data store and are visible to all other transactions in the database, regardless of which cluster those transactions are connected to.

Database scalability

Scalability is the ability to increase and decrease resources based on business demands. In a native cloud deployment of Exasol on AWS, scaling can entail both vertical scaling (changing cluster resources) and horizontal scaling (changing the number of clusters).

Multi-cluster operation is not supported when Exasol is installed as an application on cloud instances.

Vertical scaling

Vertical scaling refers to changing the compute power and RAM of a cluster to optimize concurrency. When you scale up or down, you adjust the number of vCPUs and the amount of RAM allocated to the cluster. Use vertical scaling when you want to:

  • Speed up the queries you are executing
  • Run large, complex queries or support bigger datasets without impacting performance
  • Add more users or concurrency without affecting performance

In a multi-cluster Exasol deployment on AWS, vertical scaling by changing instance types only requires that you shut down a single worker cluster, not the entire database.

For information about how to change instance types in an existing cluster, see Scale a Cluster.

For information about choosing instance types, see Sizing Considerations.

Horizontal scaling

Horizontal scaling means adding more clusters to optimize concurrency and manage higher workloads (scale out), or removing clusters that are no longer needed to optimize cost efficiency (scale in). When using multiple clusters, you can isolate different workloads between teams and ensure that resources used in queries by one team do not impact other teams.

For more information about how to add and remove clusters, see Cluster Management.

Increased concurrency can be achieved by both horizontal and vertical scaling. If there are no separate workloads with separate groups of users, we recommend starting with vertical scaling. If this is no longer sufficient, or if there are different workloads, we recommend creating dedicated clusters.
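The recommendation above can be written as a simple decision rule. This is a minimal sketch of the guidance as stated, with a hypothetical function name and parameters; actual sizing decisions depend on the workload.

```python
def recommend_scaling(separate_workloads: bool, vertical_is_sufficient: bool) -> str:
    """Return "vertical" or "horizontal" following the guidance in this article."""
    if separate_workloads:
        # Different workloads or user groups: give each a dedicated cluster.
        return "horizontal"
    if vertical_is_sufficient:
        # One shared workload that a larger cluster can still handle.
        return "vertical"
    # Vertical scaling is no longer sufficient: add clusters.
    return "horizontal"


print(recommend_scaling(separate_workloads=False, vertical_is_sufficient=True))  # prints: vertical
```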