Train and Predict with R and UDFs

This section describes how to train models and make predictions with R using Exasol user-defined functions (UDFs).

In this section you will learn how to use R and Exasol, separately and in combination, to run a machine learning algorithm such as random forest on your data. The tutorials use the Exasol R package together with Exasol user-defined functions (UDFs) and sample data from the public Boston Housing data set.

UDFs provide a flexible interface for integrating the Java, Lua, Python and R languages in an Exasol native environment. By using UDFs you can develop your own analysis, processing, or generation functions, and then execute them in parallel inside Exasol.

The Exasol R package enables you to use UDFs through the exa.createScript() function, which deploys R code dynamically from any R environment into an Exasol database.

The examples in this section use RStudio, but you can use any development environment for R.

For more information about how to use UDFs, see UDF Scripts.

For more information about how to install the Exasol R package and connect R with Exasol, see Exasol R Package.

About the tutorials

In the tutorials you will first train a model on data from the Boston Housing data set. The trained model is then used to make predictions on separate (unseen) test data from this data set. Since the response variable that you want to predict is continuous (the median value of housing), this is a regression example.
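
As a rough illustration of this train-then-predict workflow, the following sketch fits a random forest regression model in plain local R and evaluates it on held-out rows. It is not taken from the tutorials: it assumes the copy of the Boston Housing data that ships with the MASS package, the randomForest package (which must be installed separately), and an arbitrary 80/20 split with 100 trees. No Exasol connection is involved.

```r
# Minimal local sketch: random forest regression on the Boston Housing data.
# Assumptions (not from the tutorials): MASS::Boston as the data source,
# the randomForest package, an 80/20 train/test split, and ntree = 100.
library(MASS)          # provides the Boston data frame
library(randomForest)  # provides randomForest() and its predict() method

set.seed(42)  # make the random split and the forest reproducible

# Hold out roughly 20% of the rows as unseen test data
n <- nrow(Boston)
test_idx <- sample(n, size = floor(0.2 * n))
train <- Boston[-test_idx, ]
test  <- Boston[test_idx, ]

# medv (median home value) is the continuous response, so randomForest()
# fits a regression forest; all other columns are used as predictors
model <- randomForest(medv ~ ., data = train, ntree = 100)

# Predict the response for the held-out rows
pred <- predict(model, newdata = test)

# Root mean squared error on the test data as a simple quality measure
rmse <- sqrt(mean((pred - test$medv)^2))
print(rmse)
```

The tutorials differ mainly in where each of these steps runs: locally in R, or inside Exasol through a UDF.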

A typical scenario would be to load data from a local machine into R or RStudio, train the model, and evaluate and make predictions in this environment. In the following tutorials we also describe some alternative scenarios.

Train Locally and Predict in Exasol

In the first tutorial you will train the model in the R environment, then upload the model to Exasol and predict using a UDF.

Train and Predict in Exasol

In this tutorial you will use UDFs to both train and predict in R inside your Exasol environment, using the data from the first tutorial.

Predict Through UDF in SQL

This scenario is similar to the second tutorial, except that you will load the data from the Exasol database into a SQL development tool and run the prediction UDF there.