REGR_FUNCTIONS

Purpose

With the help of the linear regression functions you can determine a least-square regression line.

Syntax

regr_functions::=

Usage Notes

  • If either expr1 or expr2 is NULL, then the corresponding number pair is not considered for the computation.
  • Exasol interprets expr2 as independent variable (x value) and expr1 as dependent variable (y value).
  • The regression functions are described in the following table:
  • Function Description Formula
    REGR_SLOPE Slope of the regression line formula - regression slope
    REGR_INTERCEPT y-intercept of the regression line formula - regression y intercept
    REGR_COUNT Number of non-NULL number pairs  
    REGR_R2 Coefficient of determination (goodness of fit) formula - regression r2
    REGR_AVGX Average of the independent values (x values) formula - average x
    REGR_AVGY Average of the dependent values (y values) formula - average y
    REGR_SXX Auxiliary function formula - auxiliary sxx
    REGR_SXY Auxiliary function formula - auxiliary sxy
    REGR_SYY Auxiliary function formula - auxiliary syy

Examples

Analytic Function

SELECT 
    id, department, hire_date, 
    REGR_COUNT(starting_salary, current_salary) 
        OVER (PARTITION BY department ORDER BY hire_date) REGR_COUNT 
FROM employee_table ORDER BY department, hire_date;
Result
ID DEPARTMENT HIRE_DATE REGR_COUNT
2005 ACCOUNTS 2013-01-01 1
2003 ACCOUNTS 2015-07-01 2
2002 ACCOUNTS 2017-01-01 4
2004 ACCOUNTS 2017-01-01 4
2001 ACCOUNTS 2018-07-01 5
1003 HR 2014-01-01 1
1002 HR 2016-01-01 3
1004 HR 2016-01-01 3
1001 HR 2018-07-01 4
SELECT 
    id, department, hire_date, 
    REGR_INTERCEPT(starting_salary, current_salary) 
        OVER (PARTITION BY department ORDER BY hire_date) REGR_INTERCEPT 
FROM employee_table ORDER BY department, hire_date;
Result
ID DEPARTMENT HIRE_DATE REGR_INTERCEPT
2005 ACCOUNTS 2013-01-01 NULL
2003 ACCOUNTS 2015-07-01 NULL
2002 ACCOUNTS 2017-01-01 96666.667
2004 ACCOUNTS 2017-01-01 96666.6674
2001 ACCOUNTS 2018-07-01 46000
1003 HR 2014-01-01 NULL
1002 HR 2016-01-01 57500
1004 HR 2016-01-01 57500
1001 HR 2018-07-01 70757.576
SELECT 
    id, department, hire_date, 
    REGR_SLOPE(starting_salary, current_salary) 
        OVER (PARTITION BY department ORDER BY hire_date) REGR_SLOPE 
FROM employee_table ORDER BY department, hire_date;
Result
ID DEPARTMENT HIRE_DATE REGR_SLOPE
2005 ACCOUNTS 2013-01-01 NULL
2003 ACCOUNTS 2015-07-01 NULL
2002 ACCOUNTS 2017-01-01 -0.66666667
2004 ACCOUNTS 2017-01-01 -0.66666667
2001 ACCOUNTS 2018-07-01 0
1003 HR 2014-01-01 NULL
1002 HR 2016-01-01 -0.25
1004 HR 2016-01-01 -0.25
1001 HR 2018-07-01 -0.41414141