REGR_FUNCTIONS
Purpose
With the help of the linear regression functions you can determine a least-square regression line.
Syntax
regr_functions::=
Usage notes
- If either
expr1
orexpr2
isNULL
, then the corresponding number pair is not considered for the computation. - Exasol interprets
expr2
as independent variable (x value) andexpr1
as dependent variable (y value).
Regression functions
-
REGR_SLOPE
-
The slope of the regression line
-
REGR_INTERCEPT
-
The y-intercept of the regression line
-
REGR_COUNT
-
The number of non-NULL number pairs
-
REGR_R2
-
The coefficient of determination (goodness of fit)
-
REGR_AVGX
-
The average of the independent values (x values)
-
REGR_AVGY
-
The average of the dependent values (y values)
-
REGR_SXX
-
Auxiliary function
-
REGR_SXY
-
Auxiliary function
-
REGR_SYY
-
Auxiliary function
Examples
Query using REGR_COUNT
SELECT
id, department, hire_date,
REGR_COUNT(starting_salary, current_salary)
OVER (PARTITION BY department ORDER BY hire_date) REGR_COUNT
FROM employee_table ORDER BY department, hire_date;
Result
ID | DEPARTMENT | HIRE_DATE | REGR_COUNT |
2005 | ACCOUNTS | 2013-01-01 | 1 |
2003 | ACCOUNTS | 2015-07-01 | 2 |
2002 | ACCOUNTS | 2017-01-01 | 4 |
2004 | ACCOUNTS | 2017-01-01 | 4 |
2001 | ACCOUNTS | 2018-07-01 | 5 |
1003 | HR | 2014-01-01 | 1 |
1002 | HR | 2016-01-01 | 3 |
1004 | HR | 2016-01-01 | 3 |
1001 | HR | 2018-07-01 | 4 |
Query using REGR_INTERCEPT
SELECT
id, department, hire_date,
REGR_INTERCEPT(starting_salary, current_salary)
OVER (PARTITION BY department ORDER BY hire_date) REGR_INTERCEPT
FROM employee_table ORDER BY department, hire_date;
Result
ID | DEPARTMENT | HIRE_DATE | REGR_INTERCEPT |
2005 | ACCOUNTS | 2013-01-01 | NULL |
2003 | ACCOUNTS | 2015-07-01 | NULL |
2002 | ACCOUNTS | 2017-01-01 | 96666.667 |
2004 | ACCOUNTS | 2017-01-01 | 96666.6674 |
2001 | ACCOUNTS | 2018-07-01 | 46000 |
1003 | HR | 2014-01-01 | NULL |
1002 | HR | 2016-01-01 | 57500 |
1004 | HR | 2016-01-01 | 57500 |
1001 | HR | 2018-07-01 | 70757.576 |
Query using REGR_SLOPE
SELECT
id, department, hire_date,
REGR_SLOPE(starting_salary, current_salary)
OVER (PARTITION BY department ORDER BY hire_date) REGR_SLOPE
FROM employee_table ORDER BY department, hire_date;
Result
ID | DEPARTMENT | HIRE_DATE | REGR_SLOPE |
2005 | ACCOUNTS | 2013-01-01 | NULL |
2003 | ACCOUNTS | 2015-07-01 | NULL |
2002 | ACCOUNTS | 2017-01-01 | -0.66666667 |
2004 | ACCOUNTS | 2017-01-01 | -0.66666667 |
2001 | ACCOUNTS | 2018-07-01 | 0 |
1003 | HR | 2014-01-01 | NULL |
1002 | HR | 2016-01-01 | -0.25 |
1004 | HR | 2016-01-01 | -0.25 |
1001 | HR | 2018-07-01 | -0.41414141 |