Adding New Packages to Existing Script Languages
This article explains how to expand existing script languages with additional packages or libraries.
A script language container (SLC) consists of a Linux container with a complete Linux distribution and all required libraries. You can install the libraries and other dependencies to an SLC with a package manager such as apt or pip. This simplifies installation of packages with many dependencies.
Exasol provides different flavors of SLCs for Python, R, and Java. A flavor is essentially a build description that contains a number of files and a description of how they are connected. Exasol also provides various packages for these languages. You can choose to use one of our pre-built SLCs, or you can extend one of our flavors with additional packages and build your own SLC. How to do this is described in this article.
You can download the current releases of our pre-built SLCs from our GitHub repository, where you will also find additional documentation about the SLCs and their compatibility with different versions of Exasol.
The Lua script language is not expandable and is natively compiled into the Exasol database software.
Use a pre-built script language container (SLC)
Step 1: Download a pre-built SLC
Download the current version of the pre-built SLC from the Releases page.
Step 2: Upload the SLC to the BucketFS in Exasol
For instructions on how to upload the SLC to BucketFS, see also Manage Buckets and Files in BucketFS.
The following example using curl defines an environment for the examples in the subsequent steps:
local_container_file = "path/to/container.tar.gz"
bucketfs_host = "192.168.6.75"
bucketfs_port = "1234"
write_password = "write-password"
bucketfs_name = "bfsdefault"
bucket_name = "default"
path_in_bucket = "path/in/bucket/"
container_file_in_bucket = "container.tar.gz"
container_name = "container"
protocol = "HTTP" # or HTTPS
curl -vX PUT -T "$local_container_file" "$protocol://w:$write_password@$bucketfs_host:$bucketfs_port/$bucket_name/$path_in_bucket$container_file_in_bucket
The uploaded SLC must have the file extension .tar.gz
, .tar.bz2
, or .zip
. Files with other file extensions will not be extracted and the contents will not be available in BucketFS.
The path_in_bucket
value must end with a forward slash /
if it is used. This value is optional and can be left empty.
Step 3: Activate the new SLC
When the SLC is uploaded to BucketFS it will be distributed and extracted to all nodes (this will typically take a few minutes). When the SLC has been installed on all nodes, you can activate and use it.
To activate a new SLC you must edit the session parameter SCRIPT_LANGUAGES
. This parameter defines a list of aliases for script language/container combinations. You can change a script alias as needed, either session-wide using ALTER SESSION or system-wide using ALTER SYSTEM. Changing a script alias is especially useful if you have multiple language versions in parallel, or when migrating from one version to another.
The current session and system parameters are found in the EXA_PARAMETERS system table:
The default values are internal macros to make the parameter compact and to dynamically use the last installed version. For example, the alias for bultin_java
would look like the following:
JAVA=localzmq+protobuf:///bfsdefault/default/EXAClusterOS/ScriptLanguages-release-standard-exasol-<version>/?lang=java#/buckets/bfsdefault/default/EXAClusterOS/ScriptLanguages-release-standard-exasol-<version>/exaudf/exaudfclient
Best practices:
-
Always activate and test the SLC in your session before you activate it system-wide.
-
Add new aliases for non-standard SLCs, and only overwrite the built-in defined aliases if necessary.
Define your own script aliases
For SLCs derived from the pre-built SLCs, you can build the ALTER SESSION statement using the exaslct tool with the command generate-language-activation
and specifying the bucket details. This method is less error-prone than writing the statement from scratch. For more information, see Exasol Script Languages Container Tool.
The SCRIPT_LANGUAGES
parameter is a string that lists all active script aliases separated with spaces. Each script alias consists of an ALIAS_NAME
and either a LANGUAGE_URL
or the builtin_*
macro.
The syntax for the SCRIPT_LANGUAGES
parameter is as follows:
The LANGUAGE_URL
contains the following information for the database:
path_to_linux_container
: BucketFS path to the Linux container that is used for the root file system of the UDF.path_to_client
: BucketFS path to the script client. Usually, this path should be a sub-directory of the path to the Linux container.client_param_list
: Parameter list for the script client.
The complete script alias is as follows:
<alias>=localzmq+protobuf:///<path_to_linux_container>/[?<client_param_list>]#<path_to_client>
For SLCs derived from the pre-built SLCs you can use the following template to create your language URL:
<alias>=localzmq+protobuf:///<bucketfs_name>/<bucket_name>/<path_in_bucket><container_name>/?lang=<language>#buckets/<bucketfs_name>/<bucket_name>/<path_in_bucket><container_name>/exaudf/exaudfclient
The following language
parameter options are supported:
java
r
python
You can also generate the language URL for the alias using exaslct, which is the tool used by Exasol to build the pre-built containers. For more information and downloads, see Exasol Script Languages Container Tool.
The script alias must not contain spaces.
The container_name
is the file name of the SLC in BucketFS, without the file extension. The file extenstion must be omitted since BucketFS extracts archives with the file extensions .tar.gz
, .tar.bz2
, or .zip
.
The path_in_bucket
value must end with a forward slash /
if it is used. This value is optional and can be left empty.
Example
Based on the contents and examples above, we get the following ALTER SESSION
command:
ALTER SESSION SET SCRIPT_LANGUAGES='PYTHON3=builtin_python3 R=builtin_r JAVA=builtin_java MY_PYTHON=localzmq+protobuf:///bfsdefault/default/path/in/bucket/container/?lang=python#buckets
/bfsdefault/default/path/in/bucket/container/exaudf/exaudfclient';
This adds the new script alias MY_PYTHON
to the SCRIPT_LANGUAGES
parameter, defining the container as container.tar.gz
(file endings must be omitted) and the language as python
.
To use the alias, create a new UDF using the CREATE SCRIPT command and specify the new script as MY_PYTHON
:
--/
CREATE MY_PYTHON SCALAR SCRIPT my_python_udf(input_parameter VARCHAR(2000))
RETURNS VARCHAR(2000) AS
def run(ctx):
return ctx.input_parameter
/
Customize an existing SLC
You can build a customized script language container (SLC) with the libraries that you need. The easiest way to do this is by using exaslct, which is the tool used by Exasol to build the pre-built containers. Using exaslct you can build a new container from a build description that we call flavor. A flavor contains a number of Docker files and a description of how they are connected. For all flavors, we also provide a dedicated mechanism to customize them. After you customize and build your container, the rest of the installation procedure is the same as for our pre-built containers.
For more information and downloads, see Exasol Script Languages Container Tool.
Exasol AI Lab
Exasol AI Lab provides a virtual machine image that can be used as a developer sandbox to easily build script language containers. The image is pre-configured with all necessary dependencies installed, and is available in multiple editions involving different technology stacks.
The Exasol AI Lab image contains a Jupyter Notebook environment, which includes a tutorial that will do the following:
-
Download the script-languages-release repositories
-
Customize the packages
-
Build the script language container
-
Test the new package
For more information, see the Exasol AI Lab User Guide .
Adding JAR files to a Java UDF
As an alternative to adding a new script language container, you can specify required JAR files in a Java UDF using their BucketFS path.
For example, if you want to use Google’s library to process telephone numbers, you could upload the file in a
bucket named javalib
. The corresponding local bucket path would look like the following: /buckets/bfsdefault/javalib/libphonenumber-4.2.jar
.
In the following script, you can see how the path is configured to import the library.
--/
CREATE JAVA SCALAR SCRIPT jphone(num VARCHAR(2000))
RETURNS VARCHAR(2000) AS
%jar /buckets/bfsdefault/javalib/libphonenumber-4.2.jar;
import com.google.i18n.phonenumbers.PhoneNumberUtil;
import com.google.i18n.phonenumbers.NumberParseException;
import com.google.i18n.phonenumbers.Phonenumber.PhoneNumber;
class JPHONE {
static String run(ExaMetadata exa, ExaIterator ctx) throws Exception {
PhoneNumberUtil phoneUtil = PhoneNumberUtil.getInstance();
try {
PhoneNumber swissNumberProto = phoneUtil.parse(ctx.getString("num"),
"DE");
return swissNumberProto.toString();
} catch (NumberParseException e) {
System.err.println("NumberParseException thrown: " + e.toString());
}
return "failed";
}
}
/