Data science UDF examples
This section contains tutorials and examples that demonstrate how to use UDFs in Exasol for data science. You do not need in-depth knowledge of data science or of the R and Python programming languages to follow the tutorials.
The tutorials are designed to work with our free public demo system, which contains pre-loaded data sets, but you can also use your own data sets in an existing Exasol deployment. The public demo system is a static shared system hosted in our own cloud and does not provide the full functionality of Exasol. To get access to the free public demo system, sign up here.
User defined functions (UDFs)
User defined functions (UDFs) allow you to program your own analysis, processing, or generation functions and to execute them in parallel within an Exasol cluster. There are different types of UDFs for different input and output specifications:
| OUTPUT \ INPUT | SCALAR: Single input | SET: Multiple rows as input |
| --- | --- | --- |
| RETURNS: Single output | Function: Executed in parallel | Aggregation function: Not executed in parallel except for GROUP BY |
| EMITS: Multiple output rows | Generator function / Map reduce / ETL UDFs: Executed in parallel | Analytical function: Not executed in parallel except for GROUP BY |
To learn more about how to use UDFs, see User defined functions (UDFs).
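To make the difference between the input and output types more concrete, the following is a minimal local sketch of two UDF bodies in Python: one SCALAR input with RETURNS output, and one SET input with EMITS output. It assumes the usual pattern in which the database passes a context object to the script, exposes input columns as attributes, and offers `next()` and `emit()` for row iteration and multi-row output. The `FakeContext` class, the function names, and the column names `price`, `quantity`, and `amount` are hypothetical and exist only so the sketch can run outside the database. Inside Exasol, each body would be the entry point of its own script definition.

```python
class FakeContext:
    """Hypothetical stand-in for the context object passed to a UDF."""

    def __init__(self, rows):
        self._rows = rows    # list of dicts, one dict per input row
        self._pos = 0        # index of the current row
        self.emitted = []    # rows collected through emit()

    def __getattr__(self, name):
        # Only reached for names not found normally, i.e. input column names.
        if name.startswith("_"):
            raise AttributeError(name)
        return self._rows[self._pos][name]

    def next(self):
        # Advance to the next input row; return False when none is left.
        if self._pos + 1 < len(self._rows):
            self._pos += 1
            return True
        return False

    def emit(self, *values):
        # Record one output row.
        self.emitted.append(values)


def scalar_returns(ctx):
    # SCALAR input, RETURNS output: one input row in, one value out.
    return ctx.price * ctx.quantity


def set_emits(ctx):
    # SET input, EMITS output: walk all rows of the set and emit
    # one output row per input row with a running total.
    total = 0
    while True:
        total += ctx.amount
        ctx.emit(ctx.amount, total)
        if not ctx.next():
            break


if __name__ == "__main__":
    print(scalar_returns(FakeContext([{"price": 2.5, "quantity": 4}])))

    ctx = FakeContext([{"amount": 1}, {"amount": 2}, {"amount": 3}])
    set_emits(ctx)
    print(ctx.emitted)
```

The scalar variant only ever sees a single row, which is why it can be distributed freely across the cluster. The set variant has to iterate over all rows it is given, which matches the table's note that SET functions are parallelized only across GROUP BY groups.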
UDF tutorials
The articles in this section provide tutorials for different use cases:
- Python classification tutorial
  Learn how UDFs can be used in a machine learning or data science context using Python. In this tutorial you will learn how to test the accuracy of a model, either from a SQL client or directly from your familiar Python environment such as Jupyter Notebooks.
- Learn how to display JSON data stored in a column as a table using a simple UDF.
- Learn how to use Exasol to add geocodes to your data.