00:01
Hello viewers.
00:02
In this problem we have to explain the idea behind Simpson's paradox.
00:06
Now, Simpson's paradox in statistics is an effect that occurs when the marginal association between two categorical variables is qualitatively different from the partial association between the same two variables after controlling for one or more other variables.
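To make the definition concrete, here is a minimal Python sketch of the reversal. The counts are illustrative only and are not part of this problem's table; the group names `A` and `B` and the strata `small` and `large` are assumptions chosen for the example. Within each stratum (the partial association) group A has the higher success rate, yet after pooling the strata (the marginal association) group B comes out ahead.

```python
# Illustrative (successes, trials) counts; NOT taken from this problem.
counts = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}

def rate(successes, trials):
    """Success rate as a plain proportion."""
    return successes / trials

# Partial association: rates within each stratum.
partial = {
    group: {stratum: rate(*cell) for stratum, cell in strata.items()}
    for group, strata in counts.items()
}

# Marginal association: rates after pooling the strata.
marginal = {
    group: rate(
        sum(cell[0] for cell in strata.values()),
        sum(cell[1] for cell in strata.values()),
    )
    for group, strata in counts.items()
}

a_wins_each_stratum = all(
    partial["A"][s] > partial["B"][s] for s in ("small", "large")
)
b_wins_overall = marginal["B"] > marginal["A"]

print(a_wins_each_stratum, b_wins_overall)  # prints: True True
```

The direction of the association flips once the stratifying variable is ignored, which is exactly the qualitative difference the definition describes.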
00:22
Now in part a of the problem, we have to construct a frequency marginal distribution.
00:27
A marginal distribution of a variable is defined as the frequency distribution of either the row variable or the column variable of the contingency table.
00:38
It is useful for removing the influence of either the row or the column variable in the contingency table.
00:45
Now, we have to add 35, 25 and 20 to obtain the marginal distribution of y1.
00:53
And similarly, we have to add 35 and 65 to obtain
00:57
the corresponding marginal distribution of x1. Adding the corresponding values gives us the frequency marginal distribution table. In this table, the addition of 35, 25 and 20 gives us 80, the addition of 35 and 65 gives us 100, and the grand total of 80 and 220 gives us 300. So this is part a of the problem. Now in part b, we have to construct a relative frequency marginal distribution. The relative frequency is defined as the number of times a value occurs divided by the total number of observations in the dataset. It can be written as RF, the relative frequency, which is the frequency divided by the total frequency.
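The part a totals can be checked with a short Python sketch. It uses only the counts quoted in the transcript; the remaining cells of the contingency table are not stated, so they are left out. The variable names are assumptions made for this example.

```python
# Only the counts quoted in the transcript are used here;
# the other cells of the contingency table are not stated.
y1_row = [35, 25, 20]   # cell counts in row y1
x1_column = [35, 65]    # cell counts in column x1

y1_total = sum(y1_row)        # marginal total for y1
x1_total = sum(x1_column)     # marginal total for x1

# The second row total (220) is quoted directly in the transcript.
row_totals = {"y1": y1_total, "y2": 220}
grand_total = sum(row_totals.values())

print(y1_total, x1_total, grand_total)  # prints: 80 100 300
```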
01:56
The relative frequency marginal distribution for the row variable is obtained by dividing the row total for each value by the contingency table total.
02:09
That is, for y1 it will be 80 divided by 300.
02:17
Hence, using the same rule, we can construct the table
02:22
for the relative frequency marginal distribution.
02:26
Hence, 80 divided by 300 gives us 0.267, while 220 divided by 300 gives us 0.733, and so on and so forth...
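As a quick check of the part b arithmetic, the following Python sketch divides each row total by the grand total. The labels `y1` and `y2` are assumed names for the two rows, and only the two row totals quoted above are used.

```python
# Row totals quoted in the transcript; "y2" is an assumed label for the second row.
row_totals = {"y1": 80, "y2": 220}
grand_total = sum(row_totals.values())  # 300

# Relative frequency = frequency / total frequency
relative = {name: total / grand_total for name, total in row_totals.items()}

for name, rf in relative.items():
    print(f"{name}: {rf:.3f}")
# prints:
# y1: 0.267
# y2: 0.733
```

The relative frequencies sum to 1, which is a handy sanity check on any relative frequency marginal distribution.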