Zonemaps is a new performance feature in Exasol 8 that allows the database to decrease I/O and computation cost of some queries.
Exasol distributes table data across nodes, where each node logically splits the table into multiple contiguous-rows objects. Each such object (consisting of a partial span of the total table rows) is called a segment and typically contains data that is inserted around the same time.
A segment is a logical construct, not physical. Data is still physically stored on disk in a columnar fashion.
A zonemap is a metadata layer on these segments. With zonemaps enabled on a column, each segment has an associated zone record for that column, which contains some summary information about that column-segment including the minimum and maximum values contained within.
Using the zone record, the database determines if data in a segment for a particular column cannot satisfy a predicate, for example
column_date > 2023-01-01. In this case, data processing can skip the whole segment, avoiding I/O and CPU costs. This can provide performance improvements on many queries.
Exasol automatically enables zonemaps on each partition column in the database. Zonemaps promise the best performance when similar data is grouped into segments, and this is the function of Exasol partitioning. Whenever a query contains a supported predicate on a column with zone records (in this automatic case, any partition column), they are used to exclude segments containing data wholly unable to satisfy the predicate from the table scan.
Zonemaps may also be used without partitioning. For this, you must enable zonemaps manually on a column using this syntax:
A table is not limited to one column with zonemaps. If you wish to have zonemaps on additional columns of a table, you may enforce zonemap on each subsequent column:
To remove a zonemap from a column, use the command:
A zonemap has to maintain the metadata contained in the zone records of each segment. This has a small cost during DML operations. Altering the row ordering of a table (by re-partitioning, redistributing, or reorganizing the table) also requires computation of revised zone records for each segment.
Zonemaps are best used on columns that have frequent predicate use, and where the underlying table shows data locality on this column. Data locality is explicit on partitioned tables. It may also occur where there is, for example, a monotonically increasing data field such as a date or timestamp. If this field will be used often in predicates, then it is a good candidate for zonemaps.
A column under zonemapping will show
TRUE in the
COLUMN_IS_ZONEMAPPED column of the
When using the
DESC[RIBE] statement on a table,
ZONEMAPPED will be shown as
TRUE for columns that are zonemapped:
COLUMN_NAME SQL_TYPE NULLABLE DISTRIBUTION_KEY PARTITION_KEY ZONEMAPPED
--------------- ------------- -------- ---------------- ------------- ----------
x INT TRUE TRUE FALSE FALSE
y BOOLEAN TRUE FALSE FALSE TRUE
In profiling, an operation that utilized the zone records will show
WITH ZONEMAP in the
CREATE TABLE my_table ( x int, y boolean);
ENFORCE ZONEMAP ON my_table(x);
INSERT INTO my_table(x) VALUES BETWEEN 1 AND 1000000;
ALTER SESSION SET PROFILE='ON';
SELECT x,y FROM my_table WHERE x<=5;
ALTER SESSION SET PROFILE='OFF';
SELECT * FROM EXA_USER_PROFILE_LAST_DAY WHERE SESSION_ID = CURRENT_SESSION;
All comparison predicates are supported, as well as a subset of special predicates.
IS [NOT] NULL
Supported Data Types
The database can use zonemaps on all the supported predicates in columns of the following types:
INTERVAL YEAR TO MONTH
The database can use zonemaps for a subset of the supported predicates in columns of the following type:
INTERVAL DAY TO SECOND- only supports the
Unsupported data types
For columns of the following types there is no automatic creation of zonemaps, and zonemaps cannot be enforced.