Managed GPU resources for UDFs
This article explains how GPU resources are managed when using GPU acceleration in UDFs.
Introduction
GPUs are managed resources in Exasol. The basic rule of GPU resource management is that the available GPUs are used exclusively by a single UDF call at a time. This article describes how the GPU resource management mechanism in Exasol works, and provides examples and best practices for preventing resource usage issues in UDFs that use GPU acceleration.
The GPU resource management mechanism is independent of Exasol’s general resource manager and its consumer groups.
How GPU resources are managed
In each select/sub-select statement, only a single UDF call is allowed to exclusively use all the available GPUs. The GPU resource management reserves the GPUs individually for each UDF call.
When multiple queries that each contain a single GPU-accelerated UDF are executed concurrently, the executions (UDF calls) are serialized. Each UDF call waits until the GPUs held by other UDF calls have been released and it can use them exclusively.
This means that multiple GPU-accelerated UDF calls in a single select/sub-select statement cannot be executed simultaneously. In this case, all UDF calls except one will either fail or fall back to CPU usage, depending on the configuration (see UDF Acceleration options).
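The two situations above differ in what happens when the GPUs are already reserved: a call from another query waits, while a second call in the same statement does not get them. The following Python sketch models that difference with an ordinary lock. It only illustrates the described behavior and is not how Exasol implements it. `GpuReservation` and all function names are hypothetical, and only the CPU-fallback configuration is modeled.

```python
import threading
import time


class GpuReservation:
    """Toy model of exclusive GPU reservation (hypothetical, not Exasol's code)."""

    def __init__(self):
        self._lock = threading.Lock()

    def call_from_separate_query(self, udf):
        # A UDF call in another query blocks until the GPUs are released.
        with self._lock:
            return udf("all")

    def call_in_same_statement(self, udf):
        # A UDF call in the same statement does not wait: if the GPUs are
        # already reserved, it runs without them (CPU fallback).
        if self._lock.acquire(blocking=False):
            try:
                return udf("all")
            finally:
                self._lock.release()
        return udf("none")


def run_concurrent_queries(reservation, count=4):
    """Run `count` single-UDF queries in parallel threads."""
    results = []
    state = {"active": 0, "max_active": 0}

    def udf(devices):
        # Only ever executed while the reservation is held.
        state["active"] += 1
        state["max_active"] = max(state["max_active"], state["active"])
        time.sleep(0.01)  # stand-in for GPU work
        state["active"] -= 1
        return devices

    def query():
        results.append(reservation.call_from_separate_query(udf))

    threads = [threading.Thread(target=query) for _ in range(count)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results, state["max_active"]


def run_two_calls_in_one_statement(reservation):
    """Start a second UDF call while the first still holds the GPUs."""
    results = []

    def first_udf(devices):
        results.append(devices)
        results.append(reservation.call_in_same_statement(lambda d: d))

    reservation.call_in_same_statement(first_udf)
    return results
```

In this model, every concurrent query eventually sees all GPUs, but never two at the same time, while the second call inside one statement sees none. Unlike the model, where the first call always wins, the real system gives no guarantee about which call in a statement is accelerated.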
Examples
UDF example – GPU-accelerated UDF
--/
CREATE OR REPLACE PYTHON_GPU SCALAR SCRIPT GPU_VISIBLE_DEVICES() RETURNS VARCHAR(20) AS
%perInstanceRequiredAcceleratorDevices None|GpuNvidia;
import os
def run(ctx):
    # Report which GPUs are visible to this UDF instance ("none" if the variable is unset)
    return os.getenv('NVIDIA_VISIBLE_DEVICES', "none")
/
Query example – Single GPU-accelerated UDF call in multiple concurrent queries
Queries:
-- Single UDF per Query: Executed query in first concurrent session
select GPU_VISIBLE_DEVICES();
-- Single UDF per Query: Executed query in second concurrent session
select GPU_VISIBLE_DEVICES();
Result for each query:
GPU_VISIBLE_DEVICES()
---------------------
all
Query example – Multiple GPU-accelerated UDF calls in a single query (second UDF with CPU fallback)
Query:
-- Multiple UDFs in same query (sub-select)
select GPU_VISIBLE_DEVICES() as UDF1, GPU_VISIBLE_DEVICES() as UDF2;
Result:
UDF1 UDF2
---- ----
all  none
Undefined behavior: It is not predictable which of the two UDF calls will be accelerated.
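Because either call may end up without GPUs, UDF code that can run on both should check at run time what it was given rather than assume acceleration. The following is a minimal sketch that reuses the `NVIDIA_VISIBLE_DEVICES` check from the UDF example. The helper names `visible_gpus` and `pick_backend` are hypothetical, and treating an empty value or `void` as "no GPU" follows the NVIDIA container runtime convention, not anything stated in this article.

```python
import os


def visible_gpus(env=None):
    """Return the GPU identifiers visible to this process, or [] if there are none.

    `env` is a mapping that stands in for os.environ, which makes the
    function easy to test; by default the real environment is used.
    """
    source = os.environ if env is None else env
    value = source.get("NVIDIA_VISIBLE_DEVICES", "none").strip()
    # Assumption: "none", "void", and an empty value all mean "no GPU".
    if value in ("", "none", "void"):
        return []
    # "all" or a comma-separated list of device indexes/UUIDs.
    return [item.strip() for item in value.split(",") if item.strip()]


def pick_backend(env=None):
    """Choose "gpu" when at least one device is visible, otherwise "cpu"."""
    return "gpu" if visible_gpus(env) else "cpu"
```

Inside a UDF, `run(ctx)` could call `pick_backend()` once and branch to a GPU or CPU code path, so that the same script returns correct results whichever call receives the GPUs.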
Recommendations
- Use only a single UDF per query
  - Use only a single GPU-accelerated UDF call in the complete query. This avoids conflicts that arise when several UDF calls in the same select/sub-select each require exclusive use of the GPUs.
- Use UDF instance limiting
  - Use the UDF instance limiting option to control the number of GPU accelerators that are used per UDF instance. See also the example GPU usage across UDF instances.
- Use query timeouts
  - To prevent endlessly running queries from blocking GPU resources, set query timeouts for queries that contain GPU-accelerated UDFs.
    For more information about how to use query timeouts, see the documentation for QUERY_TIMEOUT in ALTER SYSTEM (system level), ALTER SESSION (session level), and Resource manager (consumer group level).
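As an illustration of the session-level option, the sketch below sets `QUERY_TIMEOUT` before running a query that contains a GPU-accelerated UDF. It assumes a client connection object with an `execute(sql)` method (for example, a pyexasol connection). The helper names and the 300-second value are hypothetical and chosen only for the example.

```python
def timeout_statement(seconds):
    """Build the session-level statement that limits query run time."""
    # bool is a subclass of int in Python, so it is rejected explicitly.
    if isinstance(seconds, bool) or not isinstance(seconds, int) or seconds < 0:
        raise ValueError("seconds must be a non-negative integer")
    return f"ALTER SESSION SET QUERY_TIMEOUT = {seconds}"


def run_with_timeout(conn, query, seconds=300):
    """Set the session timeout, then run `query` on the same session.

    `conn` is any object that provides execute(sql); the timeout stays in
    effect for later statements in the session until it is changed again.
    """
    conn.execute(timeout_statement(seconds))
    return conn.execute(query)
```

A call such as `run_with_timeout(conn, "select GPU_VISIBLE_DEVICES()")` then aborts the query if it runs longer than the limit, which releases the GPUs for waiting UDF calls.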