00:01
We are given the following sample data points (x, y), and we want to answer six questions, a through f, based on this data.
00:09
In part a, we want to produce a scatter plot of the data points.
00:13
This scatter plot is already included on the whiteboard and is found on the left, with the data points x, y, marked with black crosses or x's.
00:22
Next, in part b, we want to compute the sums below, as well as the Pearson correlation coefficient r.
00:29
I've already included the values of the sums.
00:31
These are found simply by applying the formulas themselves.
00:34
So sum x is the sum of all x values, sum y is the sum of all y values, and so on.
00:38
The correlation coefficient r is given by the formula I'm listing now.
00:43
Notice it takes as input the sample size n, and the different sums we just computed.
00:48
Plugging in n equals 4 and the sums gives r equals negative 0.9876.
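The computation just described can be sketched in a few lines of Python. This is a minimal sketch, not the whiteboard's own working: it assumes the sums in question are sum x, sum y, sum xy, sum x squared, and sum y squared, and it uses the standard computational form of the Pearson formula. The function name `pearson_r` and the four data points are hypothetical stand-ins, since the actual sample values are not read out here.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from the raw sums."""
    n = len(xs)                                   # sample size
    sum_x = sum(xs)                               # sum of all x values
    sum_y = sum(ys)                               # sum of all y values
    sum_xy = sum(x * y for x, y in zip(xs, ys))   # sum of the products x*y
    sum_x2 = sum(x * x for x in xs)               # sum of squared x values
    sum_y2 = sum(y * y for y in ys)               # sum of squared y values

    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator

# Hypothetical data (NOT the data from the whiteboard), n = 4.
xs = [1, 2, 3, 4]
ys = [7, 5, 4, 1]

r = pearson_r(xs, ys)
print(round(r, 4))
```

A value of r close to negative 1, like the one found in part b, indicates a strong negative linear relationship between x and y.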
00:54
Next, in part c, we want to find the best fit line for this data.
00:58
To do so, we find x bar, y bar, a (the intercept), and b (the slope).
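As a rough illustration of these four quantities, here is a minimal Python sketch. It assumes the usual least-squares formulas (slope b from the raw sums, intercept a equal to y bar minus b times x bar) and a line of the form y = a + b x; the whiteboard may arrange the formulas differently. The function name `best_fit_line` and the data are hypothetical, not the sample from the whiteboard.

```python
def best_fit_line(xs, ys):
    """Least-squares line y = a + b*x; returns (a, b, x_bar, y_bar)."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)

    x_bar = sum_x / n                 # mean of x: sum of the data divided by n
    y_bar = sum_y / n                 # mean of y: sum of the data divided by n

    # Slope from the raw sums (standard least-squares formula).
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # Intercept chosen so the line passes through (x_bar, y_bar).
    a = y_bar - b * x_bar
    return a, b, x_bar, y_bar

# Hypothetical data (NOT the data from the whiteboard), n = 4.
xs = [1, 2, 3, 4]
ys = [7, 5, 4, 1]

a, b, x_bar, y_bar = best_fit_line(xs, ys)
print(f"x_bar={x_bar}, y_bar={y_bar}, a={a:.4f}, b={b:.4f}")
```

Because the intercept is defined through the means, the fitted line always passes through the point (x bar, y bar), which is a quick sanity check on any hand calculation.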
01:03
The means for our sample data, x and y, are first calculated as follows.
01:08
Sum of the data divided by n for both...