Sum of Squared Errors (SSE).
- Definition:
- SSE is a statistical metric used to evaluate the goodness of fit of a regression model. It quantifies the discrepancy between the predicted values (from the model) and the actual observed values in a dataset.
- Specifically, SSE measures the sum of the squared differences between each observed value and its corresponding predicted value.
- Mathematical Formulation:
- Suppose we have a dataset with N data points. Let’s denote the actual observations as { x_i } and the estimated or forecasted values as { x̂_i }.
- The SSE is calculated as follows: [ SSE = \sum_{i=1}^{N} (x_i - \hat{x}_i)^2 ]
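This formula can be sketched directly in Python; the actual and predicted values below are made up for illustration:

```python
# Illustrative, made-up data: actual observations x_i and
# model predictions x_hat_i.
actual = [3.0, 5.0, 7.5, 9.0]
predicted = [2.8, 5.4, 7.0, 9.5]

# SSE = sum over i of (x_i - x_hat_i)^2
sse = sum((x - x_hat) ** 2 for x, x_hat in zip(actual, predicted))
print(sse)  # ~0.70
```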
- Step-by-Step Explanation:
- Create a three-column table:
- Column 1: Actual measurements (values of your observations).
- Column 2: Error measurements (difference between each measurement and the mean).
- Column 3: Squares of the errors (squared deviation from the mean).
- Calculate the mean of the full dataset; in this worked example the mean itself serves as the predicted value x̂_i (the simplest possible model): [ \text{Mean} = \frac{\sum_{i=1}^{N} x_i}{N} ]
- Compute the error for each measurement: [ \text{Error}_i = x_i - \text{Mean} ]
- Square the errors: [ \text{Squared Error}_i = (\text{Error}_i)^2 ]
- Sum up the squared errors to obtain the SSE.
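The steps above can be sketched in Python. The measurements are made up, and the mean is used as the predicted value, matching the worked example:

```python
# Made-up measurements for illustration.
data = [2.0, 4.0, 6.0, 8.0]

# Step 1: mean of the full dataset.
mean = sum(data) / len(data)

# Steps 2-3: error of each measurement from the mean, then its square.
errors = [x - mean for x in data]
squared_errors = [e ** 2 for e in errors]

# Step 4: sum the squared errors to obtain the SSE.
sse = sum(squared_errors)

# The three-column table described above.
print(f"{'Actual':>8} {'Error':>8} {'Sq. Err':>8}")
for x, e, s in zip(data, errors, squared_errors):
    print(f"{x:8.2f} {e:8.2f} {s:8.2f}")
print("SSE =", sse)  # SSE = 20.0
```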
- Interpretation:
- A lower SSE indicates a better fit of the model to the data. In other words, a smaller SSE means that the predicted values are closer to the actual observations.
- SSE is commonly used in linear regression, where the goal is to find the line through the data points that minimizes the SSE (ordinary least squares).
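As a minimal sketch of that use, the closed-form ordinary-least-squares fit below (with made-up data) picks the slope and intercept that minimize the SSE, and then reports the fitted line's SSE from its predictions:

```python
# Made-up (x, y) data points for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 8.0, 9.9]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Ordinary least squares: the slope and intercept that minimize the SSE.
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x

# SSE of the fitted line: squared gaps between observed and predicted y.
predicted = [intercept + slope * x for x in xs]
sse = sum((y - y_hat) ** 2 for y, y_hat in zip(ys, predicted))
print(f"slope={slope:.3f}, intercept={intercept:.3f}, SSE={sse:.4f}")
```

A small SSE here means the line passes close to all five points, which is exactly the interpretation given above.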