Migration Testing

Testing plays an integral role in the data warehouse migration process, and a well-defined test plan is a key contributor to a successful migration. You can perform data warehouse migration testing in multiple phases throughout the migration process. This helps you identify gaps and incorporate the necessary modifications as early as possible.

Automate testing as much as possible to make each test repeatable and to enable a consistent approach to evaluating results. However, ensure that you do not skip any testing step in the process.

For this section, we assume that you have already run the various queries / jobs in Oracle so that you can use their result sets as a reference.

When testing the data warehouse migration, it is important to ensure the end results between Oracle and Exasol are identical. In addition to this, you also have to consider testing SLAs. Testing SLAs is crucial and must be included as a part of your test procedures whenever possible. For example, consider testing SLAs that define:

  • Extraction time-frames for the source system
  • Availability
  • Backup / recovery
  • Loading frequency
  • Time when the data must be ready for reports
  • How long queries of a report can take
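
Such SLA checks can be scripted as simple threshold comparisons. The sketch below is illustrative only: the SLA names and limits are hypothetical examples, not values from this migration.

```python
# Hypothetical SLA names and limits (in seconds) -- examples only.
SLA_LIMITS = {
    "source_extraction": 1800,   # extraction time-frame for the source system
    "nightly_load": 3600,        # loading window
    "report_query": 30,          # maximum runtime of a report query
}

def check_slas(measured):
    """Return (sla_name, measured_seconds, limit_seconds) for each violated SLA."""
    violations = []
    for name, seconds in measured.items():
        limit = SLA_LIMITS.get(name)
        if limit is not None and seconds > limit:
            violations.append((name, seconds, limit))
    return violations

# Example: the report query exceeded its 30-second SLA.
print(check_slas({"source_extraction": 1200, "report_query": 45}))
# -> [('report_query', 45, 30)]
```

Recording measured runtimes for every scheduled job makes such a check easy to run after each test cycle.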

Sample Test Documentation

While conducting the tests, it is highly advised to define a test documentation template to enforce a complete and standardized testing procedure. This document must cover:

  • Job ID, Job Name, Job Type
  • Test Overview – describing what and how it will be tested
  • Sources and Targets of the Job
    • Structure – Oracle and translated Exasol DDLs
    • Data – Making sure the data in the source and target tables of the query are the same in Oracle and Exasol
      • Queries that loaded the initial state
  • Queries to test – Original Oracle and translated to Exasol
  • Check queries to compare the result sets of the Oracle SQL and translated Exasol SQL
  • Analysis – What is causing the difference?
  • Adjustments / Solution

Generic Test Approach

The Database Migration script in our GitHub repository automatically generates DDLs for tables and constraints in Exasol. This script also creates import statements to load data into Exasol and includes generic test statements.

A generic Oracle to Exasol migration test procedure is structured as follows:

  1. To test a load, a test table is created for each table of the migration. The test table's name has "_MIG_CHK" added as a suffix to the original table name.

  2. The test table has two rows that contain the results of the test queries for Oracle and Exasol. The first column indicates the database, and the second column indicates the overall row count of the tables. Depending on the data type, the rest of the columns contain metrics such as:
    1. Maximum / minimum values / length
    2. Count of distinct values
    3. Count of null values
    4. Distinct values
    5. Average values
  3. Example: 

CREATE OR REPLACE TABLE "C##DB_MIG"."SALES_MIG_CHK" AS 
SELECT    CAST('EXASOL' AS VARCHAR2(20)) AS "DB_SYSTEM",    
        CAST(COUNT(*) AS DECIMAL(36,0)) AS "CNT_SALES",
        CAST(SUM(CASE WHEN "SALES"."SALES_DATE" IS NULL THEN 1 END) AS DECIMAL(36, 0)) AS "SALES_DATE_CNT_NULL",
        CAST(COUNT(DISTINCT "SALES"."SALES_DATE") AS DECIMAL(36, 0)) AS "SALES_DATE_CNT_DST",
        CAST(MIN("SALES"."SALES_DATE") AS TIMESTAMP) AS "SALES_DATE_MIN",
        CAST(MEDIAN("SALES"."SALES_DATE") AS TIMESTAMP) AS "SALES_DATE_MED",
        CAST(MAX("SALES"."SALES_DATE") AS TIMESTAMP) AS "SALES_DATE_MAX",
        ...
FROM    "C##DB_MIG"."SALES"
;

INSERT INTO    "C##DB_MIG"."SALES_MIG_CHK" 
SELECT     *
FROM     (
    IMPORT FROM    JDBC AT ORACLE_JDBC STATEMENT 
    'SELECT    CAST(''Oracle'' AS VARCHAR2(20)) AS "DB_SYSTEM",
            CAST(count(*) AS DECIMAL(36,0)) AS "CNT_SALES",
            CAST(sum(CASE WHEN "SALES"."SALES_DATE" IS NULL THEN 1 END) AS DECIMAL(36,0)) AS "SALES_DATE_CNT_NULL",
            CAST(count(DISTINCT "SALES"."SALES_DATE") AS DECIMAL(36,0)) AS "SALES_DATE_CNT_DST", 
            CAST(min("SALES"."SALES_DATE") AS TIMESTAMP) AS "SALES_DATE_MIN",
            CAST(median("SALES"."SALES_DATE") AS TIMESTAMP) AS "SALES_DATE_MED",
            CAST(max("SALES"."SALES_DATE") AS TIMESTAMP) AS "SALES_DATE_MAX",
                ...
    FROM    "C##DB_MIG"."SALES"'
)
;
  4. A summary table called MIGRATION_CHECK is created in the Database Migration Schema. This summary table contains the columns for schema name, table name, column name of the specific test table, the name of the metric, results for Oracle and Exasol, and lastly, a timestamp. This structure allows for a more straightforward analysis of the differences between the results of Oracle and Exasol.
    Example: 
CREATE OR REPLACE TABLE "DATABASE_MIGRATION"."C##DB_MIG_MIG_CHK" (
        schema_name VARCHAR(128), table_name VARCHAR(128), column_name VARCHAR(128),
        metric_schema VARCHAR(128), metric_table VARCHAR(128), metric_name VARCHAR(128),
        exasol_metric VARCHAR(50), oracle_metric VARCHAR(50),
        check_timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP
)
;


INSERT INTO    "DATABASE_MIGRATION"."C##DB_MIG_MIG_CHK" (
        schema_name, table_name, column_name,
        metric_schema, metric_table, metric_name,
        exasol_metric, oracle_metric
)
WITH Exasol AS (
    SELECT    'C##DB_MIG' AS schema_name, 'SALES' AS table_name, 'MONEY_GIVEN' column_name,
            'DATABASE_MIGRATION' metric_schema, 'SALES_MIG_CHK' metric_table, '"MONEY_GIVEN_AVG"' AS metric_name,
            to_char("MONEY_GIVEN_AVG") AS Exasol_metric
    FROM     "C##DB_MIG"."SALES_MIG_CHK"
    WHERE     DB_SYSTEM = 'Exasol'
    
    UNION ALL
    ...
),
Oracle AS (
    SELECT     'C##DB_MIG' AS schema_name, 'SALES' AS table_name, 'MONEY_GIVEN' column_name,
            'DATABASE_MIGRATION' metric_schema, 'SALES_MIG_CHK' metric_table, '"MONEY_GIVEN_AVG"' AS metric_name,
            to_char("MONEY_GIVEN_AVG") AS Oracle_metric
    FROM    "C##DB_MIG"."SALES_MIG_CHK"
    WHERE    DB_SYSTEM = 'Oracle'
    
    UNION ALL 
    
    ...
)
SELECT    e.schema_name, e.table_name, e.column_name,
        e.metric_schema, e.metric_table, e.metric_name,
        e.exasol_metric, o.oracle_metric
FROM    exasol e JOIN oracle o 
ON    e.schema_name = o.schema_name
AND    e.table_name = o.table_name
AND    e.metric_name = o.metric_name
;
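
The comparison that the final SELECT performs inside the database can also be sketched outside it, which is useful for ad-hoc analysis of the summary table. The snippet below is a minimal illustration of that join logic; the row shape (table_name, metric_name, value) is an assumption for the example.

```python
def diff_metrics(exasol_rows, oracle_rows):
    """Join Exasol and Oracle metric rows on (table_name, metric_name)
    and return the mismatching pairs. Each row is assumed to be a tuple
    (table_name, metric_name, value) -- a simplified stand-in for the
    columns of the summary table."""
    oracle = {(t, m): v for t, m, v in oracle_rows}
    mismatches = []
    for t, m, v in exasol_rows:
        ov = oracle.get((t, m))
        if ov != v:
            mismatches.append((t, m, v, ov))  # (table, metric, exasol, oracle)
    return mismatches

exa = [("SALES", "SALES_DATE_CNT_DST", "365"), ("SALES", "CNT_SALES", "1000")]
ora = [("SALES", "SALES_DATE_CNT_DST", "365"), ("SALES", "CNT_SALES", "1001")]
print(diff_metrics(exa, ora))
# -> [('SALES', 'CNT_SALES', '1000', '1001')]
```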

While the generic test statements might be sufficient for initial loads from Oracle and source loads, ETL/ELT processes might need some adjustments, such as additional metrics and / or filters.
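
When such adjustments are needed, it can help to generate the check statements programmatically rather than editing them by hand. The sketch below mimics the pattern of the generated statements above; the type categories, the `VARCHAR(20)` literal, and the metric naming are assumptions for illustration, not the migration script's actual implementation.

```python
def build_check_select(schema, table, columns, db_system="EXASOL"):
    """Build a generic _MIG_CHK test query in the style of the migration
    script. `columns` maps column name -> type category ('numeric',
    'timestamp', ...); categories and metric names are assumed examples."""
    exprs = [
        f"CAST('{db_system}' AS VARCHAR(20)) AS \"DB_SYSTEM\"",
        f'CAST(COUNT(*) AS DECIMAL(36,0)) AS "CNT_{table}"',
    ]
    for col, category in columns.items():
        q = f'"{table}"."{col}"'
        exprs.append(f'CAST(SUM(CASE WHEN {q} IS NULL THEN 1 ELSE 0 END) '
                     f'AS DECIMAL(36,0)) AS "{col}_CNT_NULL"')
        exprs.append(f'CAST(COUNT(DISTINCT {q}) AS DECIMAL(36,0)) '
                     f'AS "{col}_CNT_DST"')
        if category in ("numeric", "timestamp"):
            exprs.append(f'MIN({q}) AS "{col}_MIN"')
            exprs.append(f'MAX({q}) AS "{col}_MAX"')
    return (f'CREATE OR REPLACE TABLE "{schema}"."{table}_MIG_CHK" AS\n'
            'SELECT ' + ',\n       '.join(exprs)
            + f'\nFROM "{schema}"."{table}";')

print(build_check_select("C##DB_MIG", "SALES", {"SALES_DATE": "timestamp"}))
```

Adding an extra metric or filter then means changing the generator once rather than every generated statement.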

Initial Loads

One of the first technical milestones is to migrate the relevant structures and data from Oracle into Exasol. The generic tests provide a good baseline to check the data immediately after it has been imported from Oracle into Exasol.

There are other aspects that you still need to consider, such as:

  • The right character representation
  • Checking that the correct data is in the correct column (for example, the SHIPPING_DATE column accidentally loaded into ORDER_DATE and vice versa)
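
The second check can be automated with a simple heuristic: if a column's metrics in Exasol match a *different* column's metrics in Oracle, but not its own, the two columns may have been swapped during the load. The metric shape below (min, max, distinct count) is an assumed example.

```python
def find_swapped_columns(exasol_metrics, oracle_metrics):
    """Heuristic check for data loaded into the wrong column. Each metric
    dict maps column name -> a tuple of metrics (here assumed to be
    (min, max, distinct_count)). Returns (exasol_column, oracle_column)
    pairs whose metrics match across different columns."""
    suspects = []
    for col, exa in exasol_metrics.items():
        if oracle_metrics.get(col) == exa:
            continue  # column matches itself: fine
        for other, ora in oracle_metrics.items():
            if other != col and ora == exa:
                suspects.append((col, other))
    return suspects

exa = {"ORDER_DATE": ("2020-01-05", "2020-12-30", 250),
       "SHIPPING_DATE": ("2020-01-01", "2020-12-28", 300)}
ora = {"ORDER_DATE": ("2020-01-01", "2020-12-28", 300),
       "SHIPPING_DATE": ("2020-01-05", "2020-12-30", 250)}
print(find_swapped_columns(exa, ora))
# -> [('ORDER_DATE', 'SHIPPING_DATE'), ('SHIPPING_DATE', 'ORDER_DATE')]
```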

Source Loads

There are two different ways you can load data into the data warehouse.

  • Using an external data provider: When an external data provider pushes data into the data warehouse, the provider needs everything required to export the data into Exasol: drivers, network access, schemas, users with the appropriate roles/rights, and so on.
  • Pulling data into the data warehouse: Alternatively, data can be pulled into the data warehouse, for example by import jobs scheduled in an ETL / ELT tool or by a dedicated job scheduler.

The sources can be, for example, files, other source databases, or even APIs. Even though the different sources may rely on similar metrics, each source type may require a different testing approach due to its restrictions or limitations.

To test the above-mentioned ways to load data, Oracle's data loading procedures must be translated into the Exasol dialect. To create a scenario, you must define the data in the source and target tables. Make sure that the sources and target tables contain the same data. As a best practice, compare the test setup between Oracle and Exasol before and after the execution of the source loads. The testing procedures generated by the migration script are a good baseline to spot obvious differences.

ETL / ELT Processes

ETL / ELT procedures may contain complex SQL constructs, functions, and procedures to load the core data warehouse and compute business logic. The translation of the ETL / ELT processes from Oracle to Exasol is key to ensure the same results in every step through the data warehouse.

The testing procedure for the ETL / ELT processes is similar to source loads; however, the test queries could require more metrics due to the complex nature of these statements. These additional metrics could include uniqueness, referential integrity, various other data cleansing metrics, and data transformation / business logic with their test cases.
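
As an illustration, uniqueness and referential-integrity checks of this kind can be sketched as follows; the row shapes and sample data are hypothetical.

```python
def check_uniqueness(rows, key_index=0):
    """True if the key column at key_index contains no duplicates."""
    keys = [r[key_index] for r in rows]
    return len(keys) == len(set(keys))

def check_referential_integrity(child_rows, parent_keys, fk_index):
    """Return the foreign-key values in child_rows that have no matching
    parent key, i.e. orphaned references."""
    parents = set(parent_keys)
    return [r[fk_index] for r in child_rows if r[fk_index] not in parents]

# Hypothetical sample data: (order_id, customer_id) and known customer keys.
orders = [(1, "C01"), (2, "C02"), (3, "C99")]
customers = ["C01", "C02"]

print(check_uniqueness(orders))                           # -> True
print(check_referential_integrity(orders, customers, 1))  # -> ['C99']
```

In practice, the same checks are usually expressed as SQL aggregations in the _MIG_CHK test tables, so that both databases compute them identically.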

Roles / Users / Rights

In addition to structures, data, and processes, testing users' rights and roles is a crucial part of the whole migration. While roles and users can be easily migrated from Oracle to Exasol and tested, mapping and testing access rights requires some work.

To identify the missing rights for a user while testing, you must execute all commands in Exasol with the same user as in Oracle. Additionally, you can define use cases to focus on testing the operations a user/role is not allowed to perform. This is to ensure the users have the correct privileges. You can also include scenarios to test row-level security.
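
A negative privilege test of this kind can be sketched as follows. `run_as` is a stand-in for executing a statement through a connection opened as the given user; the users, grants, and operations are hypothetical.

```python
# Hypothetical grants: user -> set of allowed operations (examples only).
GRANTS = {"report_user": {"SELECT"}, "etl_user": {"SELECT", "INSERT", "DELETE"}}

def run_as(user, operation):
    """Stand-in for executing `operation` over a connection opened as `user`;
    here it only simulates the grant check, purely for illustration."""
    if operation not in GRANTS.get(user, set()):
        raise PermissionError(f"{user} may not {operation}")

# Operations each user must NOT be able to perform.
FORBIDDEN = [("report_user", "DELETE"), ("report_user", "INSERT")]

def test_forbidden_operations():
    """Return every (user, operation) pair that unexpectedly succeeded."""
    failures = []
    for user, op in FORBIDDEN:
        try:
            run_as(user, op)
            failures.append((user, op))  # should have been rejected
        except PermissionError:
            pass  # expected: privilege correctly withheld
    return failures

print(test_forbidden_operations())  # -> [] when all rights are correct
```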

For more details about access rights in Oracle, refer to the Oracle database documentation.

For more details about access management in Exasol, refer to the Exasol documentation.

BI Tools and Other Applications

A data warehouse is usually the source for many different data consumers, such as BI tools, ETL tools, and other applications. One requirement is that Exasol supplies these data consumers with the same data that Oracle would, and that all operations the data consumers perform deliver the same results as before. Therefore, verifying and validating the integration of all relevant external components, such as job schedulers, reports in BI tools, and other applications, is important for the success of the whole migration.