The
least squares regression line, written as \(y = ax + b\), is the unique line of best fit that models the linear relationship between \(x\) and \(y\). It is calculated by minimizing the sum of the squares of the
residuals.
A
residual is the vertical distance between an observed data point \((x_i, y_i)\) and the predicted point on the regression line \((x_i, \hat{y}_i)\).$$ \text{Residual} = \text{observed } y - \text{predicted } y = y_i - \hat{y}_i $$A key property is that the least squares regression line always passes through the point of means, \((\bar{x}, \bar{y})\).