Zone maps

This article explains how to use zone maps for partition pruning in Exasol 8.

Zone maps is a new performance feature in Exasol 8 that allows the database to decrease the I/O and computation cost of certain queries. This feature is also known as partition pruning.

Exasol distributes table data across nodes, where each node logically splits the table into multiple contiguous-rows objects. Each such object, consisting of a partial span of the total table rows, is called a segment. Each segment typically contains data that was inserted around the same time.

A segment is a logical construct, not physical. Data is still physically stored on disk in a columnar fashion.

A zone map is a metadata layer on these segments. With zone maps enabled on a column, each segment has an associated zone record for that column, which contains some summary information about that column-segment including the minimum and maximum values contained within.

Using the zone record, the database determines if data in a segment for a particular column cannot satisfy a predicate, for example, column_date > 2023-01-01. In this case, data processing can skip the whole segment, avoiding I/O and CPU costs. This can improve performance on many queries.

Exasol automatically enables zone maps on each partition column in the database. Zone maps promise the best performance when similar data is grouped into segments, and this is the function of Exasol partitioning. Whenever a query contains a supported predicate on a column with zone records (in this automatic case, any partition column), they are used to exclude segments containing data wholly unable to satisfy the predicate from the table scan.

Zone maps may also be used without partitioning. For this, you must manually enable zone maps on a column using this syntax:

CREATE TABLE my_table(x int, y boolean);
ENFORCE ZONEMAP ON my_table(x);

A table is not limited to one column with zone maps. If you wish to have zone maps on additional columns of a table, you may enforce zone map on each subsequent column:

ENFORCE ZONEMAP ON my_table(y);

To remove a zone map from a column, use the command:

DROP ZONEMAP ON my_table(x);

A zone map has to maintain the metadata contained in the zone records of each segment. This has a small cost during DML operations. Altering the row ordering of a table (by re-partitioning, redistributing, or reorganizing the table) also requires computation of revised zone records for each segment.

Zone maps are best used on columns that have frequent predicate use, and where the underlying table shows data locality on this column. Data locality is explicit on partitioned tables. It may also occur where there is, for example, a monotonically increasing data field such as a date or timestamp. If this field will be used often in predicates, then it is a good candidate for zone maps.

Inspect zone maps

A column under zone mapping will show TRUE in the COLUMN_IS_ZONEMAPPED column of the EXA_{DBA,USER,ALL}_COLUMNS tables.

SELECT COLUMN_NAME, COLUMN_IS_ZONEMAPPED FROM EXA_ALL_COLUMNS;
COLUMN_NAME  COLUMN_IS_ZONEMAPPED 
------------ -------------------- 
x            false                
y            true                

When using the DESC[RIBE] statement on a table, ZONEMAPPED will be shown as TRUE for columns that are zone mapped:

DESCRIBE my_table;
COLUMN_NAME     SQL_TYPE      NULLABLE DISTRIBUTION_KEY PARTITION_KEY ZONEMAPPED 
--------------- ------------- -------- ---------------- ------------- ---------- 
x               INT           TRUE     TRUE             FALSE         FALSE      
y               BOOLEAN       TRUE     FALSE            FALSE         TRUE       

In profiling, an operation that utilized the zone records will show WITH ZONEMAP in the PART_INFO field:

CREATE TABLE my_table ( x int, y boolean);
ENFORCE ZONEMAP ON my_table(x);
INSERT INTO my_table(x) VALUES BETWEEN 1 AND 1000000;
ALTER SESSION SET PROFILE='ON';
SELECT x,y FROM my_table WHERE x<=5;
ALTER SESSION SET PROFILE='OFF';
FLUSH STATISTICS;
SELECT * FROM EXA_USER_PROFILE_LAST_DAY WHERE SESSION_ID = CURRENT_SESSION;

profiling

Supported predicates

All comparison predicates are supported, as well as a subset of special predicates.

Predicates Designation
> < = <= >= Comparison predicates
[NOT] BETWEEN, [NOT] IN, IS [NOT] NULL Special predicates
AND Conjunction

Supported data types

Full support

The database can use zone maps on all the supported predicates in columns of the following types:

  • BOOLEAN
  • DATE
  • DECIMAL
  • DOUBLE
  • INTERVAL YEAR TO MONTH
  • TIMESTAMP

Limited support

The database can use zone maps for a subset of the supported predicates in columns of the following type:

  • INTERVAL DAY TO SECOND - only supports the = (equals) filter

Unsupported data types

For columns of the following types there is no automatic creation of zone maps, and zone maps cannot be enforced.

  • CHAR
  • GEOMETRY
  • HASHTYPE
  • VARCHAR