Configure smart caching in Lakehouse Turbo

This article explains how to accelerate queries by using smart caching in Lakehouse Turbo.

Prerequisites

Lakehouse Turbo must be activated for the database and connected to a catalog in your data lakehouse. To learn how to sign up and get started, see Get started with Lakehouse Turbo.

Configure replication

  1. In the Lakehouse Turbo Replication tab, select the catalog and schema to accelerate:

    [Screenshot: select catalog and schema to replicate]

  2. Select the tables that you want to replicate. To select all tables, select the checkbox at the top of the Replicate column.

    [Screenshot: select tables to replicate]

  3. Configure the cache retention period and caching mode.

    The retention period is defined at the schema level. It sets the interval, in seconds, between two cache invalidations. For example, if the interval is set to 60, the cache is invalidated once every minute.

    The Mode column defines the caching strategy at the table level.

    • INCREMENTAL (default) = The table is cached incrementally: only data that has been added, updated, or removed since the last load is cached.

    • FULL LOAD = The table is truncated before caching. Afterwards, all available data is cached. This mode is recommended for data marts that are overwritten on a regular basis.

    [Screenshot: lakehouse schema configuration]

  4. To save the schema settings and start replicating the schema, select Enabled, and then click Save.
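
Because the retention period in step 3 is entered in seconds, longer intervals have to be converted first. The query below is a minimal sketch of that arithmetic. The column aliases are illustrative only, and it assumes the database accepts a SELECT statement without a FROM clause.

```sql
-- Convert common retention periods to seconds.
-- The aliases are hypothetical names chosen for this example.
SELECT
    1 * 60       AS one_minute_s,       -- 60
    15 * 60      AS fifteen_minutes_s,  -- 900
    60 * 60      AS one_hour_s,         -- 3600
    24 * 60 * 60 AS one_day_s;          -- 86400
```

A shorter interval means the cache is invalidated more often, so the value is a trade-off between data freshness and how frequently the cache has to be rebuilt.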

Monitor caching status

The caching status of each table is shown in the table overview at the schema level.

Example:

[Screenshot: table caching status]

Additionally, Lakehouse Turbo records all caching activities in the table exa_dlhc.replication_log. To view the caching activities with the most recent first, run the following query:

SELECT * FROM exa_dlhc.replication_log
ORDER BY event_ts DESC;
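
On a busy system the full log can become long. The following variant is a sketch that restricts the output to the last 24 hours and to the 50 most recent entries. It assumes that event_ts is a timestamp column and that the database supports the ADD_DAYS function and the LIMIT clause; the 24-hour window and the row limit are arbitrary values chosen for illustration.

```sql
-- Show only recent caching activities, newest first.
-- Assumption: event_ts is of a timestamp type.
SELECT *
FROM exa_dlhc.replication_log
WHERE event_ts >= ADD_DAYS(CURRENT_TIMESTAMP, -1)  -- last 24 hours
ORDER BY event_ts DESC
LIMIT 50;                                          -- cap the result size
```

Adjust the window and the limit to match the configured retention period, so that at least a few invalidation cycles fall inside the result.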