Zone maps
This article explains how to use zone maps for partition pruning in Exasol 8.
Zone maps is a new performance feature in Exasol 8 that allows the database to decrease the I/O and computation cost of certain queries. This feature is also known as partition pruning.
Exasol distributes table data across nodes, where each node logically splits the table into multiple contiguous-rows objects. Each such object, consisting of a partial span of the total table rows, is called a segment. Each segment typically contains data that was inserted around the same time.
A segment is a logical construct, not physical. Data is still physically stored on disk in a columnar fashion.
A zone map is a metadata layer on these segments. With zone maps enabled on a column, each segment has an associated zone record for that column, which contains some summary information about that column-segment including the minimum and maximum values contained within.
Using the zone record, the database determines if data in a segment for a particular column cannot satisfy a predicate, for example, column_date > 2023-01-01
. In this case, data processing can skip the whole segment, avoiding I/O and CPU costs. This can improve performance on many queries.
Exasol automatically enables zone maps on each partition column in the database. Zone maps promise the best performance when similar data is grouped into segments, and this is the function of Exasol partitioning. Whenever a query contains a supported predicate on a column with zone records (in this automatic case, any partition column), they are used to exclude segments containing data wholly unable to satisfy the predicate from the table scan.
Zone maps may also be used without partitioning. For this, you must manually enable zone maps on a column using this syntax:
A table is not limited to one column with zone maps. If you wish to have zone maps on additional columns of a table, you may enforce zone map on each subsequent column:
To remove a zone map from a column, use the command:
A zone map has to maintain the metadata contained in the zone records of each segment. This has a small cost during DML operations. Altering the row ordering of a table (by re-partitioning, redistributing, or reorganizing the table) also requires computation of revised zone records for each segment.
Zone maps are best used on columns that have frequent predicate use, and where the underlying table shows data locality on this column. Data locality is explicit on partitioned tables. It may also occur where there is, for example, a monotonically increasing data field such as a date or timestamp. If this field will be used often in predicates, then it is a good candidate for zone maps.
Inspect zone maps
A column under zone mapping will show TRUE
in the COLUMN_IS_ZONEMAPPED
column of the EXA_{DBA,USER,ALL}_COLUMNS
tables.
When using the DESC[RIBE]
statement on a table, ZONEMAPPED
will be shown as TRUE
for columns that are zone mapped:
COLUMN_NAME SQL_TYPE NULLABLE DISTRIBUTION_KEY PARTITION_KEY ZONEMAPPED
--------------- ------------- -------- ---------------- ------------- ----------
x INT TRUE TRUE FALSE FALSE
y BOOLEAN TRUE FALSE FALSE TRUE
In profiling, an operation that utilized the zone records will show WITH ZONEMAP
in the PART_INFO
field:
CREATE TABLE my_table ( x int, y boolean);
ENFORCE ZONEMAP ON my_table(x);
INSERT INTO my_table(x) VALUES BETWEEN 1 AND 1000000;
ALTER SESSION SET PROFILE='ON';
SELECT x,y FROM my_table WHERE x<=5;
ALTER SESSION SET PROFILE='OFF';
FLUSH STATISTICS;
SELECT * FROM EXA_USER_PROFILE_LAST_DAY WHERE SESSION_ID = CURRENT_SESSION;
Supported predicates
All comparison predicates are supported, as well as a subset of special predicates.
Predicates | Designation |
---|---|
> < = <= >= |
Comparison predicates |
[NOT] BETWEEN , [NOT] IN , IS [NOT] NULL |
Special predicates |
AND
|
Conjunction |
Supported data types
Full support
The database can use zone maps on all the supported predicates in columns of the following types:
BOOLEAN
DATE
DECIMAL
DOUBLE
INTERVAL YEAR TO MONTH
TIMESTAMP
Limited support
The database can use zone maps for a subset of the supported predicates in columns of the following type:
INTERVAL DAY TO SECOND
- only supports the=
(equals) filter
Unsupported data types
For columns of the following types there is no automatic creation of zone maps, and zone maps cannot be enforced.
CHAR
GEOMETRY
HASHTYPE
VARCHAR