Zonemaps

Zonemaps is a new performance feature in Exasol 8 that allows the database to decrease I/O and computation cost of some queries.

Exasol distributes table data across nodes, where each node logically splits the table into multiple contiguous-rows objects. Each such object (consisting of a partial span of the total table rows) is called a segment and typically contains data that is inserted around the same time.

A segment is a logical construct, not physical. Data is still physically stored on disk in a columnar fashion.

A zonemap is a metadata layer on these segments. With zonemaps enabled on a column, each segment has an associated zone record for that column, which contains some summary information about that column-segment including the minimum and maximum values contained within.

Using the zone record, the database determines if data in a segment for a particular column cannot satisfy a predicate, for example column_date > 2023-01-01. In this case, data processing can skip the whole segment, avoiding I/O and CPU costs. This can provide performance improvements on many queries.

Exasol automatically enables zonemaps on each partition column in the database. Zonemaps promise the best performance when similar data is grouped into segments, and this is the function of Exasol partitioning. Whenever a query contains a supported predicate on a column with zone records (in this automatic case, any partition column), they are used to exclude segments containing data wholly unable to satisfy the predicate from the table scan.

Zonemaps may also be used without partitioning. For this, you must enable zonemaps manually on a column using this syntax:

CREATE TABLE my_table(x int, y boolean);
ENFORCE ZONEMAP ON my_table(x);

A table is not limited to one column with zonemaps. If you wish to have zonemaps on additional columns of a table, you may enforce zonemap on each subsequent column:

ENFORCE ZONEMAP ON my_table(y);

To remove a zonemap from a column, use the command:

DROP ZONEMAP ON my_table(x);

A zonemap has to maintain the metadata contained in the zone records of each segment. This has a small cost during DML operations. Altering the row ordering of a table (by re-partitioning, redistributing, or reorganizing the table) also requires computation of revised zone records for each segment.

Zonemaps are best used on columns that have frequent predicate use, and where the underlying table shows data locality on this column. Data locality is explicit on partitioned tables. It may also occur where there is, for example, a monotonically increasing data field such as a date or timestamp. If this field will be used often in predicates, then it is a good candidate for zonemaps.

Inspecting Zonemaps

A column under zonemapping will show TRUE in the COLUMN_IS_ZONEMAPPED column of the EXA_{DBA,USER,ALL}_COLUMNS tables.

SELECT COLUMN_NAME, COLUMN_IS_ZONEMAPPED FROM EXA_ALL_COLUMNS;
COLUMN_NAME  COLUMN_IS_ZONEMAPPED 
------------ -------------------- 
x            false                
y            true                

When using the DESC[RIBE] statement on a table, ZONEMAPPED will be shown as TRUE for columns that are zonemapped:

DESCRIBE my_table;
COLUMN_NAME     SQL_TYPE      NULLABLE DISTRIBUTION_KEY PARTITION_KEY ZONEMAPPED 
--------------- ------------- -------- ---------------- ------------- ---------- 
x               INT           TRUE     TRUE             FALSE         FALSE      
y               BOOLEAN       TRUE     FALSE            FALSE         TRUE       

In profiling, an operation that utilized the zone records will show WITH ZONEMAP in the PART_INFO field:

CREATE TABLE my_table ( x int, y boolean);
ENFORCE ZONEMAP ON my_table(x);
INSERT INTO my_table(x) VALUES BETWEEN 1 AND 1000000;
ALTER SESSION SET PROFILE='ON';
SELECT x,y FROM my_table WHERE x<=5;
ALTER SESSION SET PROFILE='OFF';
FLUSH STATISTICS;
SELECT * FROM EXA_USER_PROFILE_LAST_DAY WHERE SESSION_ID = CURRENT_SESSION;

profiling

Supported Predicates

All comparison predicates are supported, as well as a subset of special predicates.

Predicates Designation
> < = <= >= Comparison predicates
[NOT] BETWEEN, [NOT] IN, IS [NOT] NULL Special predicates
AND Conjunction

Supported Data Types

Full support

The database can use zonemaps on all the supported predicates in columns of the following types:

  • BOOLEAN
  • DATE
  • DECIMAL
  • DOUBLE
  • INTERVAL YEAR TO MONTH
  • TIMESTAMP

Limited support

The database can use zonemaps for a subset of the supported predicates in columns of the following type:

  • INTERVAL DAY TO SECOND - only supports the = (equals) filter

Unsupported data types

For columns of the following types there is no automatic creation of zonemaps, and zonemaps cannot be enforced.

  • CHAR
  • GEOMETRY
  • HASHTYPE
  • VARCHAR