Protegrity Integration

This article explains how to use Protegrity to enhance data security in Exasol through tokenization.

Tokenization is the process of substituting sensitive data with non-sensitive data in the form of tokens. Protegrity allows you to use tokens and mask data in your Exasol database using user defined functions (UDFs) with the Protegrity Application Protector REST API.

Prerequisites

You must have a working installation of Protegrity. This section does not explain how to install and deploy Protegrity.

Preparation

Setting up a script language alias

Integration of Protegrity with Exasol is accomplished by allowing users to define new scalar SQL user defined functions (UDFs) using a special script language created for that purpose. The script language is included in Exasol but is not active by default. In order to use it, you must define a script language alias using the system or session parameter SCRIPT_LANGUAGES:

Copy

ALTER SYSTEM SET SCRIPT_LANGUAGES='... PROTEGRITY=udfplugin:///applicationprotector_rest_72 ...'

In the statement above, ... represents existing content in the session parameter. The following definitions are included by default in the session parameter:

Copy

R=builtin_r JAVA=builtin_java PYTHON3=builtin_python3

To make Protegrity Application Protector REST UDFs available in your current SQL session, the complete ALTER SESSION statement would therefore be:

Copy

ALTER SESSION SET SCRIPT_LANGUAGES='R=builtin_r JAVA=builtin_java PYTHON3=builtin_python3 PROTEGRITY=udfplugin:///applicationprotector_rest_72'

To make the script alias available for all future sessions, define it using the ALTER SYSTEM statement:

Copy

ALTER SYSTEM SET SCRIPT_LANGUAGES='R=builtin_r JAVA=builtin_java PYTHON3=builtin_python3 PROTEGRITY=udfplugin:///applicationprotector_rest_72'

If your SCRIPT_LANGUAGES variable is different from the default, change the statements in the examples as needed.

You can choose any name for the script alias.

SESSION/SCOPE users

When communicating with the REST API, the database has to decide which ESA user to identify as. There are two options:

The SESSION user is the user who logged into the SQL session.
The SCOPE user is identical to the session user, except for the case where a user defines a view that contains Protegrity UDFs. In this case the owner of the view is sent as user name.

The name of the script language alias defines whether the session user or the scope user is used.

PROTEGRITY=udfplugin:///applicationprotector_rest_72 (as in the examples above) will use the scope user.
PROTEGRITY=udfplugin:///applicationprotector_rest_72_ua will use the session user.

If your data strategy requires both behaviors, you can define multiple language aliases in the SCRIPT_LANGUAGES parameter.

It is not possible to prevent a session user from redefining the SCRIPT_LANGUAGES session variable. A user is effectively able to arbitrarily change the behavior.

This means that the scope user feature cannot be used to limit the access of a user that also is known to ESA in any way, as this user could just switch to session user behavior by redefining the script languages alias.

Methods

Protect/Unprotect

Once you have defined PROTEGRITY as a script language alias, you can create UDFs using this alias.

The script code of these UDFs consists of a number of key-value pairs. Each line contains a key-value pair where the key consists of the beginning of the line up to (and not including) the first blank. The part after the blank until the end of the line is the value.

Example:

Copy

CREATE OR REPLACE PROTEGRITY SCALAR SCRIPT protect_email(din VARCHAR(2000000)) RETURNS VARCHAR(2000000) AS
HOST localhost:9080
DATA_ELEMENT_NAME tok_email
METHOD protect
/

The data element tok_email must be defined in the Protegrity ESA connected to the REST API.

With this definition in place, you can protect email data using the UDF like a scalar function. For example:

Copy

SELECT protect_email(user_email) FROM USERS;

The corresponding unprotect method UDF can be defined as in this example:

Copy

CREATE OR REPLACE PROTEGRITY SCALAR SCRIPT unprotect_email(din VARCHAR(2000000)) RETURNS VARCHAR(2000000) AS
HOST localhost:9080
DATA_ELEMENT_NAME tok_email
METHOD unprotect
/

For a full list of supported keys, see Supported parameters.

Reprotect

Reprotect method UDFs require the specification of two data element names. For example:

Copy

CREATE OR REPLACE PROTEGRITY SCALAR SCRIPT protect_email(din VARCHAR(2000000)) RETURNS VARCHAR(2000000) AS
HOST localhost:9080
OLD_DATA_ELEMENT_NAME tok_email_old
NEW_DATA_ELEMENT_NAME tok_email_new
METHOD reprotect
/

Special Methods

The following special methods can be used to create UDFs that facilitate debugging and diagnostics.

Identity

Use this SQL statement to create a UDF that returns the input data without calling the REST API. This method is mainly intended for debugging and performance measurements. For example:

Copy

CREATE OR REPLACE PROTEGRITY SCALAR SCRIPT identity(din VARCHAR(2000000)) RETURNS VARCHAR(2000000) AS
HOST localhost:9080
METHOD identity
/

Whoami

Use this SQL statement to create a UDF that returns the user name that is currently used in calls to to the REST API.

Copy

CREATE OR REPLACE PROTEGRITY SCALAR SCRIPT whoami() RETURNS VARCHAR(2000000) AS
HOST localhost:9080
METHOD whoami
/

For example, in a session with SESSION user root, this would be the result when using this UDF:

Copy

SELECT whoami()
==> ROOT

You can use the whoami method to demonstrate the difference between session and scope use. Given this view, created as SYS user:

Copy

CREATE VIEW whoami_view AS SELECT whoami() AS wai FROM dual
GRANT SELECT ON whoami_view TO root

For the following queries, executed as user root, the results depend on the sessions settings.

Copy

ALTER SESSION SET SCRIPT_LANGUAGES='protegrity=udfplugin:///applicationprotector_rest_72_ua';
-- the next query returns ROOT
SELECT * FROM whoami_view;

ALTER SESSION SET SCRIPT_LANGUAGES='protegrity=udfplugin:///applicationprotector_rest_72';
-- the next query returns SYS
SELECT * FROM whoami_view;

Version

To create a UDF that returns the version of the Application Protector REST software, use the following SQL statement:

Copy

CREATE OR REPLACE PROTEGRITY SCALAR SCRIPT version() RETURNS VARCHAR(2000000) AS
HOST localhost:9080
METHOD version
/

Supported parameters

This section lists all supported parameters for Application Protector REST.

The following parameters are used in production:

Parameter	Data type
`METHOD`	String
Possible values are `protect`, `unprotect`, `reprotect`, `version`, `whoami`, and `identity`.

`HOST`	String
Possible values are in the format `<host>:<port>` and specify the address of the REST API service. Multiple values can be specified, separated by single space characters. During processing, requests will be randomly distributed among the hosts in the list.

`BASIC_AUTH_CONNECTION`	String
Possible values are any Exasol connection name. This allows to store the address of the REST API service in Exasol connection objects, and facilitates using HTTP basic authentication with the REST API by specifying a user name and password in the Exasol connection object. If both `HOST` and `BASIC_AUTH_CONNECTION` are provided, the hostname in `HOST` is used instead of the one in the connection object.

`DATA_ELEMENT_NAME`	String
Specifies the name of the Protegrity data element to be used by the REST API service. This parameter is required for the methods `protect` and `unprotect`.

`OLD_DATA_ELEMENT_NAME`, `NEW_DATA_ELEMENT_NAME`	String
These parameters specify the elements to be used with the `reprotect` method.

`VERIFY_SSL`	Boolean
Specifies whether or not the REST API service host’s SSL certificate should verified. Possible values are `true` and `false`. The default value is `true`.

`VERIFY_SSL_USE_CUSTOM_CERTIFICATES`	Boolean
Specifies if custom certificate files that are installed in the Exasol cluster should be considered when verifying SSL certificates. Possible values are `true` and `false`. The default value is `true`.

`ATTEMPTS`	Integer
Specifies the number of times the connector should retry to send a message to the REST API if it fails to provide and answer for a request. The default value is 100.

`EXTERNAL_IV`, `EXTERNAL_TWEAK`	String
IV and Tweaks options for the `protect` and `unprotect` methods. For more information, refer to the Protegrity website.

`NEW_EXTERNAL_IV`, `OLD_EXTERNAL_IV`, `NEW_EXTERNAL_TWEAK`, `OLD_EXTERNAL_TWEAK`	String
IV and Tweaks options for the `reprotect` method. For more information about these options, contact Protegrity.

Additional parameters

The following additional parameters can be useful when testing a deployment or debugging problems:

Parameter	Data type
`SEND_RECEIVE_VERBOSE`	Boolean
Possible values are `true` and `false`. The default value is `false`. If the parameter is set to `true`, the connector will print information about the messages it receives. For more information, see Debug UDF Script Output

`MAX_MESSAGE_ENTRIES`	Integer
This parameter limits the number of values sent to the REST API service. The default value is 8192. You can only specify an upper bound. The actual number of values sent to the service can be lower. There is no way to increase the number of values.

`MAX_TOKEN_SIZE_FOR_PROTECT`	Integer
This parameter limits the size of values used in the `protect` method. If the value in the statement is larger than the parameter value, the connector will refuse the value and return an error message. The default value is 4096. This limit is applied before data is sent to the REST API. The limitations of the REST API are not affected by this option.

`MAX_TOKEN_SIZE_FOR_UNPROTECT`	Integer
This parameter limits the size of values used in the `unprotect`, `reprotect`, and `identity` methods. If the value in the statement is larger than the parameter value, the connector will refuse the value and return an error message. The default value is 4096. This limit is applied before data is sent to the REST API. The limitations of the REST API are not affected by this option.

`UTF8_DEBUG`	Boolean
Possible values are `true` and `false`. The default value is `false`. If set to true, the UDF will check the validity of the UTF-8 encoding of every message returned from the REST API service and issue an error message that may help in analyzing eventual problems.

`TIMEOUT`	Integer
The number of milliseconds to set as a timeout for answers from the REST API. For no timeout, set the value to 0 (zero). The default value is 0 (zero).

`RETRY_ON_TIMEOUT`	Boolean
Specifies whether or not to retry to send the message to the REST API after a timeout. Possible values are `true` and `false`. The default value is `true`.

Deployment examples

Deployment on cluster nodes	Deployment using external protection service

Example

The following example session shows the setup of the PEPServer and the REST API:

Copy

$ sudo ./PepServerSetup_Linux_x64_7.2.1.33.sh
Please enter ESA hostname or IP address
[]:
aa.bbb.ccc.dd # <- replace by real address!
Please enter the user name for downloading certificates
[]:
admin
Please enter the password for downloading certificates
[]:
Unpacking...
Extracting files...
Downloading certificates from 3.250.122.75:8443...
% Total
% Received % Xferd Average Speed
Time
Dload Upload
Total
100 30720 100 30720
0
0 24037
0 0:00:01
Time
Time Current
Spent
Left Speed
0:00:01 --:--:-- 24037
Extracting certificates...
Certificates successfully downloaded and stored in /opt/protegrity/defiance_dps/data.
Protegrity PepServer installed in /opt/protegrity/defiance_dps.
$ sudo ./ApplicationProtectorRESTLinux_x64_7.2.1.15.sh -dir /opt/protegrity/
XX: -dir
Unpacking...
Extracting files...
gzip: stdin: unexpected end of file
tar: Child returned status 1
tar: Error is not recoverable: exiting now
Application Protector REST API installed in /opt/protegrity//applicationprotector/rest
$ sudo vim /opt/protegrity/applicationprotector/rest/conf/configuration.json
$ sudo mkdir /opt/protegrity/certs/
$ sudo tar zxf sample-certs.tgz -C /opt/protegrity/certs/
$ sudo apt update
$ apt install openjdk-17-jre-headless
$ sudo /opt/protegrity/defiance_dps/bin/pepsrvctrl start
Protegrity PEP Server
Version 7.2.1.33
Copyright (c) 2004-2020 Protegrity Corporation. All Rights Reserved.
$ nohup java -cp /opt/protegrity/applicationprotector/rest/bin/*:/opt/protegrity/
applicationprotector/rest/lib/* com.protegrity.ap.rest.Api -conf /opt/protegrity/
applicationprotector/rest/conf/configuration.json </dev/null &>/dev/null &
[1] 62833
$ curl https://localhost:9080/rest-v1/protect -H ’Content-Type: application/json’ --data
’{"protect": {"policyusername": "SYS","dataelementname": "simple","bulk":{"id": 1,"
data": [{"id": 1,"content":"YWJjZGVmZ2hpamtsbW5vcHFyc3R1dnh5eg=="},{"id": 2,"content
": "MDEyMzQ1Njc4OQ=="}]}}}’ --insecure
{"protect":{"bulk":{"id":1,"returntype":"success","data":[{"id":1,"returncode":"/rest-v1/
returncodes/id/6","returntype":"success","content":"
SWlhTGlFMlBjQ1RZOGdwQ3JoWDB0emMzRQ=="},{"id":2,"returncode":"/rest-v1/returncodes/id
/6","returntype":"success","content":"QnAyWUIyUnJibA=="}]}}}

The option --insecure or -k tells curl to bypass the TLS certificate check. This option allows you to connect to a HTTPS server that does not have a valid certificate. Only use this option if certificate verification is not possible and you trust the server.

Protegrity Integration

Prerequisites

Preparation

Setting up a script language alias

SESSION/SCOPE users

Methods

Protect/Unprotect

Example:

Reprotect

Special Methods

Identity

Whoami

Version

Supported parameters

Additional parameters

Deployment examples

Example

PRODUCT

RESOURCES