regression: Simple Linear Regression (by Hand)

Exercise template for computing the prediction from a simple linear regression by hand, based on randomly generated marginal means/variances and correlation.

Name:
regression
Type:
Preview:

For 56 firms the number of employees \(X\) and the amount of expenses for continuing education \(Y\) (in EUR) were recorded. The statistical summary of the data set is given by:

            Variable \(X\)   Variable \(Y\)
Mean        46               220
Variance    140              1827

The correlation between \(X\) and \(Y\) is equal to 0.61.

Estimate the expected amount of money spent for continuing education by a firm with 44 employees using least squares regression.

First, the regression line \(y_i = \beta_0 + \beta_1 x_i + \varepsilon_i\) is determined. The regression coefficients are given by: \[\begin{eqnarray*} && \hat \beta_1 = r \cdot \frac{s_y}{s_x} = 0.61 \cdot \sqrt{\frac{1827}{140}} = 2.20361, \\ && \hat \beta_0 = \bar y - \hat \beta_1 \cdot \bar x = 220 - 2.20361 \cdot 46 = 118.63386. \end{eqnarray*}\]

The estimated amount of money spent by a firm with 44 employees is then given by: \[\begin{eqnarray*} \hat y = 118.63386 + 2.20361 \cdot 44 = 215.593. \end{eqnarray*}\]
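The slope formula used above can be traced back to the usual least-squares estimate. As a short sketch, write \(s_{xy}\) for the sample covariance (a symbol introduced here for illustration, not used in the exercise itself) and \(s_x\), \(s_y\) for the standard deviations, i.e., the square roots of the reported variances: \[\begin{eqnarray*} \hat \beta_1 &=& \frac{s_{xy}}{s_x^2}, \qquad r = \frac{s_{xy}}{s_x \cdot s_y} \quad \Rightarrow \quad \hat \beta_1 = \frac{r \cdot s_x \cdot s_y}{s_x^2} = r \cdot \frac{s_y}{s_x}. \end{eqnarray*}\]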

For 46 firms the number of employees \(X\) and the amount of expenses for continuing education \(Y\) (in EUR) were recorded. The statistical summary of the data set is given by:

            Variable \(X\)   Variable \(Y\)
Mean        44               236
Variance    78               1644

The correlation between \(X\) and \(Y\) is equal to 0.76.

Estimate the expected amount of money spent for continuing education by a firm with 40 employees using least squares regression.

First, the regression line \(y_i = \beta_0 + \beta_1 x_i + \varepsilon_i\) is determined. The regression coefficients are given by: \[\begin{eqnarray*} && \hat \beta_1 = r \cdot \frac{s_y}{s_x} = 0.76 \cdot \sqrt{\frac{1644}{78}} = 3.48913, \\ && \hat \beta_0 = \bar y - \hat \beta_1 \cdot \bar x = 236 - 3.48913 \cdot 44 = 82.47826. \end{eqnarray*}\]

The estimated amount of money spent by a firm with 40 employees is then given by: \[\begin{eqnarray*} \hat y = 82.47826 + 3.48913 \cdot 40 = 222.043. \end{eqnarray*}\]

For 65 firms the number of employees \(X\) and the amount of expenses for continuing education \(Y\) (in EUR) were recorded. The statistical summary of the data set is given by:

            Variable \(X\)   Variable \(Y\)
Mean        44               259
Variance    93               2529

The correlation between \(X\) and \(Y\) is equal to 0.83.

Estimate the expected amount of money spent for continuing education by a firm with 38 employees using least squares regression.

First, the regression line \(y_i = \beta_0 + \beta_1 x_i + \varepsilon_i\) is determined. The regression coefficients are given by: \[\begin{eqnarray*} && \hat \beta_1 = r \cdot \frac{s_y}{s_x} = 0.83 \cdot \sqrt{\frac{2529}{93}} = 4.32824, \\ && \hat \beta_0 = \bar y - \hat \beta_1 \cdot \bar x = 259 - 4.32824 \cdot 44 = 68.55757. \end{eqnarray*}\]

The estimated amount of money spent by a firm with 38 employees is then given by: \[\begin{eqnarray*} \hat y = 68.55757 + 4.32824 \cdot 38 = 233.031. \end{eqnarray*}\]
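The by-hand computation in the preview versions above can be cross-checked with a few lines of R. The following is a minimal sketch, not code from the exercise template: the helper name `predict_from_summary` and its argument names are hypothetical, and it only uses the summary statistics that the exercise presents.

```r
# Hypothetical helper (not part of the exams package): least-squares
# prediction from marginal means, variances, and the correlation.
predict_from_summary <- function(mean_x, mean_y, var_x, var_y, r, x_new) {
  beta1 <- r * sqrt(var_y / var_x)   # slope: r * s_y / s_x
  beta0 <- mean_y - beta1 * mean_x   # intercept: line passes through the means
  c(beta0 = beta0, beta1 = beta1, y_hat = beta0 + beta1 * x_new)
}

# The three preview versions shown above.
v1 <- predict_from_summary(46, 220, 140, 1827, 0.61, 44)
v2 <- predict_from_summary(44, 236,  78, 1644, 0.76, 40)
v3 <- predict_from_summary(44, 259,  93, 2529, 0.83, 38)

# One row per version: intercept, slope, and point prediction.
round(rbind(v1, v2, v3), 3)
```

Because the helper carries the unrounded slope through to the intercept and the prediction, the last printed digit can differ slightly from a calculation that rounds intermediate results.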

Description:
Computing coefficients and a point prediction from a simple linear regression (by hand). Internally, a full bivariate data set is simulated, but only the marginal means and variances and the correlation coefficient are presented in the exercise.
Solution feedback:
Yes
Randomization:
Random numbers
Mathematical notation:
Yes
Verbatim R input/output:
No
Images:
No
Other supplements:
No
Raw: (1 random version)
PDF:
regression-Rmd-pdf
regression-Rnw-pdf
HTML:
regression-Rmd-html
regression-Rnw-html

(Note that the HTML output contains mathematical equations in MathML, rendered by MathJax using ‘mathjax = TRUE’. Alternatively, ‘converter = “pandoc-mathjax”’ can be used so that LaTeX equations are rendered by MathJax directly.)

Demo code:

library("exams")

set.seed(403)
exams2html("regression.Rmd", mathjax = TRUE)
set.seed(403)
exams2pdf("regression.Rmd")

set.seed(403)
exams2html("regression.Rnw", mathjax = TRUE)
set.seed(403)
exams2pdf("regression.Rnw")
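
The description notes that a full bivariate data set is simulated internally while only summary statistics are shown. The sketch below illustrates why that is sufficient: on any simulated data set, the coefficients computed from the means, standard deviations, and correlation agree with a least-squares fit on the full data. The data-generating step here is an assumption made for illustration (sample size, distributions, and parameter values are made up) and may well differ from what the template actually does.

```r
# Illustrative simulation only; the template's actual data-generating
# code is not shown on this page and may differ.
set.seed(1)
n <- 50
x <- round(rnorm(n, mean = 45, sd = 10))   # hypothetical employee counts
y <- 100 + 3 * x + rnorm(n, sd = 25)       # hypothetical expenses in EUR

# By-hand coefficients from the summary statistics.
beta1 <- cor(x, y) * sd(y) / sd(x)
beta0 <- mean(y) - beta1 * mean(x)

# Coefficients from a least-squares fit on the full data.
fit <- lm(y ~ x)

# The two routes should agree up to floating-point error.
all.equal(unname(coef(fit)), c(beta0, beta1))

# The same holds for a point prediction, here at x = 44.
y_hat_hand <- beta0 + beta1 * 44
y_hat_lm <- unname(predict(fit, newdata = data.frame(x = 44)))
all.equal(y_hat_hand, y_hat_lm)
```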