Managed GPU resources for UDFs

This article explains how GPU resources are managed when using GPU acceleration in UDFs.

Introduction

GPUs are managed resources in Exasol. The basic principle is that only one UDF call at a time is granted exclusive use of the available GPUs. This article explains how the GPU resource management mechanism in Exasol works, and provides examples and best practices for preventing resource usage issues in UDFs that use GPU acceleration.

The GPU resource management mechanism is independent of Exasol’s general resource manager and its consumer groups.

How GPU resources are managed

In each select/sub-select statement, only a single UDF call is allowed to use the available GPUs, and it uses all of them exclusively. GPU resources are reserved individually for each UDF call.

When multiple queries that each contain a single GPU-accelerated UDF are executed concurrently, the UDF calls are serialized. Each UDF call waits until the accelerator resources have been released by the other UDF calls and exclusive use is possible.

This means that multiple GPU-accelerated UDF calls cannot run simultaneously within a single select/sub-select statement. In this case, all UDF calls except one will either fail or fall back to CPU usage, depending on the configuration (see UDF Acceleration options).
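
The serialization of concurrent queries described above can be pictured with a small toy model. This is only a sketch of the observable behavior, not Exasol's implementation: the lock, the counters, and the function name gpu_udf_call are all hypothetical and exist only for illustration.

```python
import threading

gpu_lock = threading.Lock()    # hypothetical stand-in for exclusive ownership of all GPUs
state_lock = threading.Lock()  # protects the bookkeeping counters below
active = 0                     # number of calls currently holding the GPUs
max_active = 0                 # highest value of `active` ever observed
completed = []                 # ids of calls that finished


def gpu_udf_call(query_id):
    """Toy stand-in for one GPU-accelerated UDF call running in its own query."""
    global active, max_active
    with gpu_lock:  # wait until no other call holds the GPUs
        with state_lock:
            active += 1
            max_active = max(max_active, active)
        # ... the GPU work of the UDF would happen here ...
        with state_lock:
            active -= 1
            completed.append(query_id)


# Four "concurrent sessions", each issuing one query with one GPU UDF call.
threads = [threading.Thread(target=gpu_udf_call, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(max_active, sorted(completed))
```

In this model every call eventually runs, but never more than one at a time, which mirrors the waiting behavior of separate queries. It does not model the case of several UDF calls inside one statement, where the additional calls fail or fall back to CPU instead of waiting.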

Examples

UDF example – GPU-accelerated UDF

--/
CREATE OR REPLACE PYTHON_GPU SCALAR SCRIPT GPU_VISIBLE_DEVICES() RETURNS VARCHAR(20) AS
%perInstanceRequiredAcceleratorDevices None|GpuNvidia;
import os
def run(ctx):
    return os.getenv('NVIDIA_VISIBLE_DEVICES', "none")
/

Query example – Single GPU-accelerated UDF call in multiple concurrent queries

Queries:

-- Single UDF per Query: Executed query in first concurrent session
select GPU_VISIBLE_DEVICES();
-- Single UDF per Query: Executed query in second concurrent session
select GPU_VISIBLE_DEVICES();

Result for each query:

GPU_VISIBLE_DEVICES()
--------------------- 
all

Query example – Multiple GPU-accelerated UDF calls in a single query (second UDF with CPU fallback)

Query:

-- Multiple UDFs in same query (sub-select)
select GPU_VISIBLE_DEVICES() as UDF1, GPU_VISIBLE_DEVICES() as UDF2;

Result:

UDF1  UDF2 
----  ---- 
all   none

Undefined behavior: It is not predictable which of the two UDF calls will be accelerated.
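
Because it is not predictable which call receives the GPUs, a UDF that may fall back to CPU can check at run time whether devices were actually assigned. The following is a minimal sketch, assuming that a missing NVIDIA_VISIBLE_DEVICES variable (or the values "none" or "void") indicates that no GPU is visible, as in the example UDF above. The helper name gpu_assigned is hypothetical.

```python
import os


def gpu_assigned(environ=None):
    """Return True if NVIDIA_VISIBLE_DEVICES names at least one device.

    Assumption: an unset variable, an empty value, "none", or "void"
    means that no GPU was made visible to this UDF instance.
    """
    env = os.environ if environ is None else environ
    value = env.get("NVIDIA_VISIBLE_DEVICES", "none").strip().lower()
    return value not in ("", "none", "void")
```

Inside run(ctx), a UDF could branch on gpu_assigned() to select a CPU code path, or raise an error if running without acceleration is not acceptable.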

Recommendations

Use only a single UDF per query

Use only a single GPU-accelerated UDF call in the entire query. This avoids the situation where exclusive use of the GPUs is requested more than once in the same select/sub-select.

Use UDF instance limiting

Use the UDF instance limiting option to control the number of GPU accelerators that are used per UDF instance. See also the example GPU usage across UDF instances.

Use query timeouts

To prevent queries that run indefinitely from blocking GPU resources, set query timeouts for queries that contain GPU-accelerated UDFs.

For more information about how to use query timeouts, see the documentation for QUERY_TIMEOUT in ALTER SYSTEM (system level), ALTER SESSION (session level), and Resource manager (consumer group level).
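
As an illustration of the session-level variant, the following sketch sets QUERY_TIMEOUT before running a query from a Python client. The function name run_with_query_timeout is hypothetical, and conn is assumed to be any connection object with an execute(sql) method (a pyexasol connection would be one candidate). The timeout value is in seconds.

```python
def run_with_query_timeout(conn, sql, timeout_seconds):
    """Set a session-level QUERY_TIMEOUT, then run the statement.

    `conn` is a hypothetical connection object that exposes execute(sql).
    """
    seconds = int(timeout_seconds)
    if seconds <= 0:
        raise ValueError("timeout_seconds must be a positive number of seconds")
    # Applies to all subsequent queries in this session.
    conn.execute(f"ALTER SESSION SET QUERY_TIMEOUT = {seconds}")
    return conn.execute(sql)
```

A call such as run_with_query_timeout(conn, "select GPU_VISIBLE_DEVICES()", 300) would limit the GPU query to five minutes. The setting remains active for the rest of the session unless it is changed again.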
