00:01
OK, so we want to fit a linear regression model to this data. A linear regression model is of the form y hat equals a plus bx, where y hat is the predicted percentage purity and x is the pollution count.
00:20
Then we find that the regression equation is y hat equals 96.45, that's the intercept, minus 2.90 times x, the pollution count.
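As a quick illustration of how the fitted equation is used, here is a minimal Python sketch. The function name `predict_purity` and the example pollution count of 1.0 are assumptions made for illustration only; the intercept and slope are the ones quoted above.

```python
# Coefficients quoted in the walkthrough: y hat = 96.45 - 2.90 * x
INTERCEPT = 96.45
SLOPE = -2.90


def predict_purity(pollution_count: float) -> float:
    """Return the predicted percentage purity for a given pollution count."""
    return INTERCEPT + SLOPE * pollution_count


# Hypothetical input value, chosen only to show the call.
example_x = 1.0
print(f"Predicted purity at x = {example_x}: {predict_purity(example_x):.2f}")
```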
00:39
To test for the significance of the regression, we do an ANOVA test.
00:47
So you can put all this data into a tool like Excel and ask it to do a linear regression for you, and it's going to give you the following output.
00:57
So our degrees of freedom for the regression is just the number of predictor variables, of which there is one. For the residual, it's the number of data points, which is 15, minus 1, minus the degrees of freedom of the regression, so that's 13.
01:17
The sums of squares for the regression and the residual are 16.49 and 2.38 respectively. The mean square, which is the sum of squares divided by the degrees of freedom, is 16.49 for the regression and 0.183 for the residual. The F test statistic, which is the ratio of the mean square for the regression to the mean square for the residual, is given by (sorry, I should really have put these table lines in earlier) 90.13.
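The arithmetic behind the ANOVA table can be sketched in a few lines of Python. This is only a check of the figures quoted above, not the spreadsheet's own computation, and the variable names are illustrative. Because the sums of squares here are rounded to two decimals, the ratio comes out at about 90.07 rather than exactly the 90.13 reported from the unrounded data.

```python
# Figures quoted in the walkthrough (rounded as stated there).
n_points = 15
n_predictors = 1
ss_regression = 16.49
ss_residual = 2.38

# Degrees of freedom: regression = number of predictors,
# residual = n - 1 - regression degrees of freedom.
df_regression = n_predictors
df_residual = n_points - 1 - df_regression

# Mean square = sum of squares / degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# F statistic = regression mean square / residual mean square.
f_statistic = ms_regression / ms_residual

print(f"df: regression = {df_regression}, residual = {df_residual}")
print(f"MS: regression = {ms_regression:.2f}, residual = {ms_residual:.3f}")
print(f"F = {f_statistic:.2f}")
```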
02:02
And so the p-value for this F value of 90.13, where the F distribution we're looking at has degrees of freedom 1 and 13, is 0.000000328. That is far less than any significance level alpha we would use, and so we'd say the regression is significant...
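The p-value can be checked without a statistics package. The sketch below is one possible approach, not the method used by the spreadsheet: it relies on the fact that an F variable with 1 and v degrees of freedom is the square of a t variable with v degrees of freedom, and integrates the t density numerically with Simpson's rule. The function names, the integration cutoff of 200, and the step count are assumptions chosen for this example; the cutoff is only adequate when the tail beyond it is negligible, as it is here.

```python
import math


def t_pdf(t: float, dof: int) -> float:
    """Density of Student's t distribution with `dof` degrees of freedom."""
    coef = math.gamma((dof + 1) / 2) / (
        math.sqrt(dof * math.pi) * math.gamma(dof / 2)
    )
    return coef * (1 + t * t / dof) ** (-(dof + 1) / 2)


def f_upper_tail_1_dof(f_value: float, dof_residual: int,
                       upper: float = 200.0, steps: int = 200_000) -> float:
    """Approximate P(F > f_value) for F with (1, dof_residual) degrees of freedom.

    Uses P(F > f) = 2 * P(T > sqrt(f)) and Simpson's rule on [sqrt(f), upper].
    `steps` must be even; the tail beyond `upper` is ignored.
    """
    lower = math.sqrt(f_value)
    h = (upper - lower) / steps
    total = t_pdf(lower, dof_residual) + t_pdf(upper, dof_residual)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(lower + i * h, dof_residual)
    return 2 * total * h / 3


p_value = f_upper_tail_1_dof(90.13, 13)
alpha = 0.05  # a commonly used significance level, assumed here
print(f"p-value = {p_value:.3e}")
print("significant" if p_value < alpha else "not significant")
```

The result should be on the order of 10^-7, in line with the value quoted above, so the conclusion holds for any conventional choice of alpha.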