00:02
All right.
00:03
So we're going to use some data to make a least squares regression line.
00:09
And so the data consists of years, going in increments of five years.
00:14
And we are looking to make a model that predicts the average size of a farm, y, from the number of farms, x.
00:25
And so here's our number-of-farms variable, and total acreage over here.
00:31
And we want to make our function y hat.
00:35
And the notation I'm going to use is a plus bx.
00:42
You might see it in some texts as beta naught for the intercept term and beta 1 for the slope term.
00:51
Either way, it's the same thing.
00:53
It's just different notation.
00:54
It's good to get used to seeing the different notation.
00:57
But I'm going to use this one here.
00:58
And so the way we do this, to get our b value, we get the correlation coefficient and we multiply it by the sample standard deviation of the y variable, divided by the sample standard deviation of the x variable.
01:12
And then a is calculated as y bar minus b times x bar.
01:19
And the y bar and x bar are the means of each variable.
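As a rough sketch of those two formulas in Python: the function name `least_squares_line` and the numbers passed to it are made up for illustration and are not the farm data from the video.

```python
def least_squares_line(r, s_x, s_y, x_bar, y_bar):
    """Return (a, b) for y_hat = a + b*x, given summary statistics.

    r      : correlation coefficient between x and y
    s_x    : sample standard deviation of x
    s_y    : sample standard deviation of y
    x_bar  : mean of x
    y_bar  : mean of y
    """
    b = r * s_y / s_x        # slope: r times (s_y over s_x)
    a = y_bar - b * x_bar    # intercept: y bar minus b times x bar
    return a, b


# Hypothetical summary values, only to show how the function is called.
a, b = least_squares_line(r=-0.5, s_x=2.0, s_y=4.0, x_bar=3.0, y_bar=10.0)
print(f"y_hat = {a} + ({b})x")
```

Note that the sign of the slope always matches the sign of the correlation coefficient, since both standard deviations are positive.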
01:24
And something you should always do when you're making a model is to look at the data.
01:29
I've already made the line, but even without it you can see that the points follow a very linear pattern, based on the data that we have here. You could make a case that there might be a curve here, but for that we would need more data, or a little further analysis. Anyway, let's go ahead and go through this, because even so, on this range of number of farms, from our lowest value of about two up to, say, five and a half, it looks very linear.
02:15
So a linear model would be sufficient based on observation.
02:20
Okay, so let's go ahead and do the calculations here.
02:22
So we need the correlation coefficient, and the means and standard deviations as well.
02:28
So I use my spreadsheet to do this.
02:31
So here's the actual work, the spreadsheet calculations.
02:35
And so to get the mean, I use the AVERAGE function: you write AVERAGE, put in your data, and out pops the mean.
02:49
Same thing with the standard deviation, but you use STDEV.S: you put in your data and then out pops the standard deviation.
02:57
Make sure you use the S version, because it is the sample standard deviation.
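To see why the sample version matters, here is a small Python sketch using the standard library's `statistics` module. The data list is made up purely for illustration. `statistics.stdev` divides by n minus 1, which is the same convention as the sample function STDEV.S, while `statistics.pstdev` divides by n, the population convention.

```python
import statistics

# Made-up numbers, not the farm data from the video.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

sample_sd = statistics.stdev(data)       # sample SD: divides by n - 1
population_sd = statistics.pstdev(data)  # population SD: divides by n

print("sample SD:    ", sample_sd)
print("population SD:", population_sd)
```

The sample standard deviation always comes out a little larger than the population one on the same data, and the gap is most noticeable for small data sets like the one in this example.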
03:01
And then the correlation coefficient, we get negative 0.980.
03:06
Or negative 0.98.
03:08
And for that, I use the function CORREL.
03:12
And for that, you put in your x data, followed by the y data, and then out pops that number, which is really nice.
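If you are not working in a spreadsheet, the same three quantities can be sketched in Python. The `x` and `y` lists below are invented placeholders, not the farm data, and the helper name `correlation` is hypothetical. The correlation is computed directly from its definition so the sketch does not depend on a particular Python version.

```python
import statistics

# Placeholder paired data, invented for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [9.0, 8.0, 6.0, 5.0, 2.0]


def correlation(xs, ys):
    """Sample correlation coefficient of two equal-length lists."""
    n = len(xs)
    x_bar = statistics.mean(xs)
    y_bar = statistics.mean(ys)
    s_x = statistics.stdev(xs)   # sample standard deviations
    s_y = statistics.stdev(ys)
    # Sum of products of deviations from the means.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(xs, ys))
    return s_xy / ((n - 1) * s_x * s_y)


x_bar, y_bar = statistics.mean(x), statistics.mean(y)   # like AVERAGE
s_x, s_y = statistics.stdev(x), statistics.stdev(y)     # like STDEV.S
r = correlation(x, y)                                   # like CORREL

print("means:", x_bar, y_bar)
print("sample SDs:", s_x, s_y)
print("r:", r)
```

Swapping the order of the two lists does not change r, since the correlation coefficient is symmetric in x and y.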
03:19
And from that, we can make our equation...