A brief review of the ANOVA table using a simple (two-variable) regression. Highlights for FRM candidates:
The key is to see that, for each observed Y, the regression breaks the difference between the observed Y and the average Y into two pieces: a residual (difference between the observed Y and the predicted/fitted Y) and the regression (difference between the predicted/fitted Y and the average Y). Add these two pieces together and you get back the difference between the observed Y and the average Y. The average Y is a flat line, so these two pieces characterize the "sources of variation" in the regression. Squared and summed over all observations, they tell us how much variation is due to the regression line itself (explained sum of squares, ESS) and how much is due to the residual (residual sum of squares, RSS).
Understanding this ought to lead to an intuitive grasp of the R^2 (coefficient of determination): R^2 = ESS/(ESS + RSS) = ESS/TSS, where TSS is the total sum of squares (observed Y minus average Y, squared and summed).
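To make the decomposition concrete, here is a minimal sketch in plain Python. The five (x, y) pairs are hypothetical numbers chosen for illustration, not the lotto data from this example, and the variable names are my own.

```python
# Hypothetical data for illustration only (not the data behind this post).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Ordinary least squares for a single explanatory variable.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar
fitted = [intercept + slope * xi for xi in x]

# The sums of squares behind the ANOVA table.
ess = sum((fi - y_bar) ** 2 for fi in fitted)            # fitted Y vs. average Y
rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))   # observed Y vs. fitted Y
tss = sum((yi - y_bar) ** 2 for yi in y)                 # observed Y vs. average Y

r_squared = ess / tss

print(f"ESS={ess:.4f}  RSS={rss:.4f}  TSS={tss:.4f}  R^2={r_squared:.4f}")
```

With an intercept in the regression, ESS + RSS should reproduce TSS up to floating-point rounding, which is why R^2 can be written either as ESS/(ESS + RSS) or as ESS/TSS.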
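The F ratio is the mean ESS divided by the mean RSS: F = [ESS/d.f.]/[RSS/d.f.]. For d.f., recall that the d.f. for TSS is n - 1 and the d.f. for RSS is n - number of variables (so, in the simple two-variable case, d.f. RSS = n - 2). The d.f. for ESS is the difference between the two: number of variables - 1 (so, in the simple two-variable case, d.f. ESS = 1).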
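As a rough sketch, the F ratio can be computed from the two sums of squares and their degrees of freedom. The function name `f_ratio` and the input values are hypothetical; `n_vars` counts both variables of the simple regression (Y and X), so the d.f. for ESS is `n_vars - 1` and the d.f. for RSS is `n_obs - n_vars`.

```python
def f_ratio(ess, rss, n_obs, n_vars=2):
    """F = (ESS / d.f. of ESS) / (RSS / d.f. of RSS)."""
    df_ess = n_vars - 1        # 1 in the two-variable case
    df_rss = n_obs - n_vars    # n - 2 in the two-variable case
    if df_ess <= 0 or df_rss <= 0:
        raise ValueError("need n_vars >= 2 and n_obs > n_vars")
    mean_ess = ess / df_ess
    mean_rss = rss / df_rss
    return mean_ess / mean_rss


# Hypothetical inputs: ESS = 3.6, RSS = 2.4 from 5 observations.
f_value = f_ratio(ess=3.6, rss=2.4, n_obs=5)
print(f"F = {f_value:.4f}")
```

Note that the two d.f. add up to n - 1, the d.f. for TSS, mirroring the way ESS and RSS add up to TSS.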
The F ratio allows for a test of the (joint) hypothesis that the explanatory/independent variable(s) are significant; i.e., "Does disposable income, in fact, have an impact on lotto spend?"
The Significance of F is the p value. It can be manually computed in Excel with = FDIST(F, numerator d.f., denominator d.f.). In this example, the F of 52.7 corresponds to a p value (F significance) of 0.009%. How to interpret this? We can say, "We reject the null with (1-p) confidence." Or, in this case, "We reject the idea that disposable income has no impact on lotto spend with 99.991% (1-0.009%) confidence."
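For readers without Excel, the same tail probability can be approximated in plain Python. This is a sketch under two stated assumptions. First, it only covers the simple two-variable case, where F with (1, n - 2) d.f. equals the square of a t statistic with n - 2 d.f., so the p value is the two-tailed t probability. Second, the sample size is not given above, so n = 10 is a hypothetical value used purely for illustration. The function names are my own.

```python
import math


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)


def f_pvalue_one_regressor(f_stat, df_resid, steps=20000):
    """Upper-tail p value of F(1, df_resid), computed as P(|T| > sqrt(F)).

    Uses composite Simpson's rule on the t density; steps must be even.
    """
    t0 = math.sqrt(f_stat)
    h = t0 / steps
    total = t_pdf(0.0, df_resid) + t_pdf(t0, df_resid)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(i * h, df_resid)
    central_half = total * h / 3   # P(0 < T < sqrt(F))
    return 1.0 - 2.0 * central_half


n_obs = 10  # ASSUMPTION: hypothetical sample size, not stated in the text
p_value = f_pvalue_one_regressor(52.7, df_resid=n_obs - 2)
print(f"p value = {p_value:.4%}")
```

Under that assumed sample size the result is of the same order as the 0.009% quoted above; a different n would give a different p value for the same F.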