00:01
All right, so we've got some data here: price in euros, which is x, and bookings, which is y.
00:08
And this is at a hotel.
00:09
And the hotel manager wants to establish the influence of price on bookings.
00:14
So what we're going to do is make a regression equation of y on x.
00:18
So it looks like this.
00:20
Y is equal to a plus bx, where y is really y hat, the predicted number of bookings.
00:26
B is going to be the correlation coefficient r times the standard deviation of y divided by the standard deviation of x.
00:32
So there's some multiplication here, r times that ratio. And then a, the intercept term, is equal to y bar, the mean of the ys, minus b, which we'll find, times the mean of the xs. Okay, let's go and do this. I used a spreadsheet to find the averages and standard deviations, using the average, standard deviation, and correlation coefficient functions (here, CORREL). These are the cell references for the dataset, the places where you would find the data in your spreadsheet.
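The spreadsheet step can be sketched in Python. This is only an illustration under assumptions: the function names `pearson_r` and `fit_line` are made up here, and the toy numbers are not the hotel's data, which isn't reproduced in the walkthrough.

```python
from statistics import mean, stdev

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def fit_line(xs, ys):
    """Return (a, b) for y hat = a + b*x, using b = r * sd_y / sd_x."""
    r = pearson_r(xs, ys)
    b = r * stdev(ys) / stdev(xs)   # slope
    a = mean(ys) - b * mean(xs)     # intercept
    return a, b

# Toy data for illustration only (NOT the hotel dataset): y is exactly 2*x.
toy_x = [1, 2, 3, 4]
toy_y = [2, 4, 6, 8]
a, b = fit_line(toy_x, toy_y)
print(a, b)
```

`stdev` is the sample standard deviation. Because the slope uses the ratio of the two standard deviations, using population standard deviations for both would give the same b.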
01:07
And so it's a substitution exercise.
01:09
So at this point, b is the correlation, negative 0.17, times the standard deviation of y, 3.16, divided by the standard deviation of x, 2.19, right here.
01:21
And I'm rounding when I'm writing it now, but when I did the calculations, I used the full precision.
01:26
And then a is the mean of the ys, which is 41, minus the b value, which we'll find, times the mean of the xs, which is 13.
01:34
And this is what we get.
01:38
B is negative 0.25.
01:41
So let's rewrite our equation: y hat is equal to 44.25 minus 0.25 x.
01:46
44.25 is the a value, the intercept term, and our b value, negative one quarter, multiplies x.
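The substitution can be checked with a short Python sketch. The function names are hypothetical, and the inputs are the rounded values read out above, so the unrounded slope here (about negative 0.245) is only an approximation of the full-precision spreadsheet value. The intercept is computed from the rounded slope so that it matches the 44.25 quoted.

```python
def slope_from_summary(r, sd_y, sd_x):
    """Slope b = r * (sd_y / sd_x)."""
    return r * sd_y / sd_x

def intercept_from_summary(mean_y, mean_x, b):
    """Intercept a = mean_y - b * mean_x."""
    return mean_y - b * mean_x

# Rounded summary statistics as read out in the walkthrough.
r, sd_y, sd_x = -0.17, 3.16, 2.19
mean_y, mean_x = 41, 13

b = slope_from_summary(r, sd_y, sd_x)   # roughly -0.245 from rounded inputs
b_rounded = round(b, 2)                 # -0.25, the value quoted
a = intercept_from_summary(mean_y, mean_x, b_rounded)
print(b_rounded, a)                     # -0.25 44.25
```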
01:53
And we're going to make a prediction.
01:55
The likely demand for bookings when the price is 30.
01:58
So we put 30 in for x here and we get 36.75.
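As a quick check of that arithmetic, here is a minimal Python sketch. The function name `predict_bookings` is an assumption for illustration. The coefficients are the rounded ones from the equation above.

```python
def predict_bookings(price, a=44.25, b=-0.25):
    """Predicted bookings from the fitted line y hat = a + b * price."""
    return a + b * price

prediction = predict_bookings(30)
print(prediction)  # 36.75
```

A fitted line like this is most trustworthy for prices near the range of the original data. Whether 30 falls inside that range isn't stated in the walkthrough.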
02:04
That's 36.75 bookings, so you could probably say 37, or maybe 36 to 37 bookings, when the price is 30. All right, now we're going to talk about primary and secondary data. So this is question one, and this is question two. So we have primary versus secondary data. Primary data is data collected directly by the researcher.
02:43
So this is directly from the researcher.
02:59
Right, so directly from the researcher.
03:01
And then, you know, the researcher would know exactly what type of data they're getting and what they're looking for.
03:07
And that's why this is useful here.
03:12
And this is things like survey data, interviews, experiments, and observations where the researcher is recording behaviors or events.
03:22
So the advantage of this is that it's tailored to meet the specific needs of the researcher, of the experiment.
03:29
They have greater control of accuracy and quality.
03:34
It can be time-consuming, and it can be costly as well.
03:41
Then secondary data is data collected from someone else.
03:46
So collected from somewhere else, from/by someone else.
03:54
All right, so there's that.
04:02
And this could be government reports, census data, or economic reports, like when the Federal Open Market Committee releases its data on jobs reports in the United States.
04:13
Historical records are another example of this.
04:16
It's useful because it's inexpensive and relatively easy to access, but the disadvantage is that it may not fit the research exactly as you need it.
04:26
So that's primary and secondary data.
04:29
Another question in this exercise was asking about null and alternative hypotheses, null versus alternative hypotheses.
04:53
And you might see this as H0, H1.
04:57
And so what this is, is you're testing a parameter or some sort of assumption.
05:01
So the null hypothesis would be that some parameter, we'll call it rho, is equal to zero, against the alternative that rho is not equal to zero.
05:09
So the null hypothesis is kind of what you believe the world to be, or what the underlying assumption is, given the current assumptions right here.
05:23
Whereas the alternative, this is what you're testing for.
05:25
Is the parameter significantly different from zero?
05:26
It's significantly different from what is believed to be the norm.
05:30
So that's our null and alternative hypotheses.
05:33
And related to those are type 1 and type 2 errors.
05:45
And so I like to think about this in a 2x2 table.
05:48
So here's your test result right here.
05:57
And this is where you have H0 is true.
06:06
H0 is false.
06:12
H0 is true would mean you should retain H0.
06:15
Where H0 is false, you should reject.
06:17
So that's another thing to notice.
06:18
So this is retain H0, reject H0.
06:31
So the 2 by 2 table I'm talking about is this.
06:40
And this is reality here or the truth.
06:45
And this is H0 is true.
06:50
H0 is false.
06:59
And so here's your test result.
07:01
So let's say you believe H0 is true.
07:06
And your result says, hey, retain your H0.
07:09
Retain it.
07:10
And H0 is true.
07:10
That means we've done the right thing.
07:12
This is the correct decision.
07:15
This is correct.
07:17
But let's say you reject the null hypothesis, but H0 is in fact true.
07:23
That means you've made a type 1 error.
07:27
You've rejected it when you shouldn't.
07:29
It's like, whoops, we shouldn't have done that.
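The 2 by 2 table can be sketched as a small Python function. The name `classify_decision` is hypothetical. The type 2 cell (retaining H0 when it is actually false) is the standard definition and hasn't been reached yet in the walkthrough at this point.

```python
def classify_decision(h0_is_true: bool, reject_h0: bool) -> str:
    """Map (reality, test result) to one cell of the 2 by 2 table."""
    if h0_is_true and reject_h0:
        return "type 1 error"      # rejected H0 when it was true
    if not h0_is_true and not reject_h0:
        return "type 2 error"      # retained H0 when it was false
    return "correct decision"      # retained a true H0 or rejected a false H0

# Print all four cells of the table.
for h0_is_true in (True, False):
    for reject_h0 in (False, True):
        reality = "H0 true" if h0_is_true else "H0 false"
        result = "reject H0" if reject_h0 else "retain H0"
        print(f"{reality:8} | {result:9} | {classify_decision(h0_is_true, reject_h0)}")
```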
07:31
Or, excuse me, that's a type 1...