REGR_FUNCTIONS

Purpose

With the help of the linear regression functions you can determine a least-square regression line.

Syntax

regr_functions::=

Usage notes

  • If either expr1 or expr2 is NULL, then the corresponding number pair is not considered for the computation.
  • Exasol interprets expr2 as independent variable (x value) and expr1 as dependent variable (y value).

Regression functions

REGR_SLOPE

The slope of the regression line

formula - regression slope

REGR_INTERCEPT

The y-intercept of the regression line

formula - regression y intercept

REGR_COUNT

The number of non-NULL number pairs

REGR_R2

The coefficient of determination (goodness of fit)

formula - regression r2

REGR_AVGX

The average of the independent values (x values)

formula - average x

REGR_AVGY

The average of the dependent values (y values)

formula - average y

REGR_SXX

Auxiliary function

formula - auxiliary sxx

REGR_SXY

Auxiliary function

formula - auxiliary sxy

REGR_SYY

Auxiliary function

formula - auxiliary syy

Examples

Query using REGR_COUNT
SELECT 
    id, department, hire_date, 
    REGR_COUNT(starting_salary, current_salary) 
        OVER (PARTITION BY department ORDER BY hire_date) REGR_COUNT 
FROM employee_table ORDER BY department, hire_date;
Result
ID DEPARTMENT HIRE_DATE REGR_COUNT
2005 ACCOUNTS 2013-01-01 1
2003 ACCOUNTS 2015-07-01 2
2002 ACCOUNTS 2017-01-01 4
2004 ACCOUNTS 2017-01-01 4
2001 ACCOUNTS 2018-07-01 5
1003 HR 2014-01-01 1
1002 HR 2016-01-01 3
1004 HR 2016-01-01 3
1001 HR 2018-07-01 4
Query using REGR_INTERCEPT
SELECT 
    id, department, hire_date, 
    REGR_INTERCEPT(starting_salary, current_salary) 
        OVER (PARTITION BY department ORDER BY hire_date) REGR_INTERCEPT 
FROM employee_table ORDER BY department, hire_date;
Result
ID DEPARTMENT HIRE_DATE REGR_INTERCEPT
2005 ACCOUNTS 2013-01-01 NULL
2003 ACCOUNTS 2015-07-01 NULL
2002 ACCOUNTS 2017-01-01 96666.667
2004 ACCOUNTS 2017-01-01 96666.6674
2001 ACCOUNTS 2018-07-01 46000
1003 HR 2014-01-01 NULL
1002 HR 2016-01-01 57500
1004 HR 2016-01-01 57500
1001 HR 2018-07-01 70757.576
Query using REGR_SLOPE
SELECT 
    id, department, hire_date, 
    REGR_SLOPE(starting_salary, current_salary) 
        OVER (PARTITION BY department ORDER BY hire_date) REGR_SLOPE 
FROM employee_table ORDER BY department, hire_date;
Result
ID DEPARTMENT HIRE_DATE REGR_SLOPE
2005 ACCOUNTS 2013-01-01 NULL
2003 ACCOUNTS 2015-07-01 NULL
2002 ACCOUNTS 2017-01-01 -0.66666667
2004 ACCOUNTS 2017-01-01 -0.66666667
2001 ACCOUNTS 2018-07-01 0
1003 HR 2014-01-01 NULL
1002 HR 2016-01-01 -0.25
1004 HR 2016-01-01 -0.25
1001 HR 2018-07-01 -0.41414141