Question


Name: the law of large numbers states that as the size of a sample drawn from a random variable increases the mean of more samples gets closer and closer to the true population mean we are going t 14663
Uploaded: 2022-02-04T02:35:46-08:00
Duration: 1 min 16 s
Channel: Ameer Said

The Law of Large Numbers states that as the size of a sample drawn from a random variable increases, the sample mean gets closer and closer to the true population mean. We are going to simulate this using the steps below and then answer some questions:

First simulation:

# Generate a simulated population (you could also use real data here)
population <- runif(10000, min=1950, max=2020)

# Set the largest sample size to try
max_sample_size <- 1000

# This vector will hold the mean calculated for each sample size
mean_vec <- rep(0, max_sample_size)

# Illustrate the law of large numbers by calculating the mean of one sample
# of each size from n=1 to n=max_sample_size
for (n in 1:max_sample_size) {
    # Draw a random sample of size n (without replacement) from the population
    values <- sample(population, n)
    # Calculate the sample mean and store it in mean_vec
    mean_vec[n] <- mean(values)
}

# Finally, plot the sample mean vs. the sample size
plot(seq(1, max_sample_size), mean_vec, xlab="Sample size", ylab="Sample mean")

# Draw a red horizontal line at the population mean (mu)
abline(h=mean(population), col="red")

# Draw the same plot with ggplot2
lln_mean <- data.frame(Sample_Size=seq(1, max_sample_size), mean_vec)
library(ggplot2)
sp_lln_mean <- ggplot(data=lln_mean, aes(x=Sample_Size, y=mean_vec)) +
    geom_point(shape=1) +
    geom_hline(yintercept=mean(population), color="red", size=1) +
    labs(title="Simulated Uniform Distribution Means for Different Sample Sizes", x="Sample size", y="Sample mean")
print(sp_lln_mean)

(a) Why, at the maximum value of the sample size, does the sample mean still not exactly equal the population mean? (Hint: you can increase the max sample size and see how the graph reacts.)
(b) Try changing the distribution from a uniform to a binomial (rbinom) with size=3 and prob=0.30 and rerunning the code. Does the Law of Large Numbers appear to depend on the distribution of the random variable?
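For part (b), the change amounts to swapping the line that generates the population. The following is a minimal sketch of that variant, not the textbook's own solution; the seed and the dashed blue reference line at size * prob = 0.9 are additions made here for illustration.

```r
# Sketch for part (b): same simulation, binomial population instead of uniform.
set.seed(42)  # assumed seed, only so the run is reproducible

population <- rbinom(10000, size = 3, prob = 0.30)
max_sample_size <- 1000
mean_vec <- numeric(max_sample_size)

for (n in seq_len(max_sample_size)) {
  # Draw a sample of size n (without replacement) and record its mean
  values <- sample(population, n)
  mean_vec[n] <- mean(values)
}

plot(seq_len(max_sample_size), mean_vec,
     xlab = "Sample size", ylab = "Sample mean")
abline(h = mean(population), col = "red")     # mean of the simulated population
abline(h = 3 * 0.30, col = "blue", lty = 2)   # theoretical binomial mean, size * prob
```

If the plot settles around the red line in the same way as the uniform case, that suggests the convergence does not rely on the population being uniform.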

5. Second simulation (sample medians):

# Generate a simulated population (you could also use real data here)
population <- runif(10000, min=1950, max=2020)

# Set the largest sample size to try
max_sample_size <- 1000

# This vector will hold the median calculated for each sample size
median_vec <- rep(0, max_sample_size)

# Calculate the median of one sample of each size from n=1 to n=max_sample_size
for (n in 1:max_sample_size) {
    # Draw a random sample of size n (without replacement) from the population
    values <- sample(population, n)
    # Calculate the sample median and store it in median_vec
    median_vec[n] <- median(values)
}

# Finally, plot the sample median vs. the sample size
plot(seq(1, max_sample_size), median_vec, xlab="Sample size", ylab="Sample median")

# Draw a red horizontal line at the population median
abline(h=median(population), col="red")

# Draw the same plot with ggplot2
lln_median <- data.frame(Sample_Size=seq(1, max_sample_size), median_vec)
library(ggplot2)
sp_lln_median <- ggplot(data=lln_median, aes(x=Sample_Size, y=median_vec)) +
    geom_point(shape=1) +
    geom_hline(yintercept=median(population), color="red", size=1) +
    labs(title="Simulated Uniform Distribution Medians for Different Sample Sizes", x="Sample size", y="Sample median")
print(sp_lln_median)

(a) In your textbook, the law of large numbers is applied only to means. Based on the plot you just produced, do you think it also applies to medians?
(b) Change the statistic in the code to the standard deviation. Do you think the law of large numbers applies to standard deviations as well? Why?
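One way to make the change asked for in part (b) is sketched below. It is an illustration under stated assumptions rather than the textbook's solution: the name sd_vec and the seed are choices made here, and the loop starts at n=2 because R's sd() returns NA for a single value.

```r
# Sketch for part (b): track the sample standard deviation instead of the median.
set.seed(42)  # assumed seed, only so the run is reproducible

population <- runif(10000, min = 1950, max = 2020)
max_sample_size <- 1000

# sd_vec is a name chosen for this sketch; entry 1 stays NA on purpose
sd_vec <- rep(NA_real_, max_sample_size)

for (n in 2:max_sample_size) {   # start at 2: sd() of one value is NA
  values <- sample(population, n)
  sd_vec[n] <- sd(values)
}

plot(seq_len(max_sample_size), sd_vec,
     xlab = "Sample size", ylab = "Sample standard deviation")
abline(h = sd(population), col = "red")   # standard deviation of the population
```

For a uniform distribution on [1950, 2020] the theoretical standard deviation is 70 / sqrt(12), roughly 20.2, so the red line should sit near that value.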

Added by Brooke J.

Elementary Statistics: A Step by Step Approach

Allan G. Bluman 9th Edition


Solved by Expert Ameer Said

02/04/2022

Step 1

Even with a large sample size, there will still be some variation in the sample means due to the randomness of sampling.
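The remaining variation can be measured directly by drawing many samples of the same size and looking at how much their means spread. The sketch below is illustrative only: spread_of_mean is a hypothetical helper, and the seed and the choice of 500 repetitions are assumptions. The comparison value uses the standard formula for the standard error of a mean under sampling without replacement, which is how sample() draws by default.

```r
# Sketch: how much do sample means of a fixed size n vary from draw to draw?
set.seed(42)  # assumed seed, only so the run is reproducible

population <- runif(10000, min = 1950, max = 2020)
N <- length(population)

# Hypothetical helper: empirical sd of the sample mean over `reps` draws of size n
spread_of_mean <- function(n, reps = 500) {
  sd(replicate(reps, mean(sample(population, n))))
}

for (n in c(10, 100, 1000)) {
  # Standard error of the mean when sampling without replacement from N values
  theory <- sd(population) / sqrt(n) * sqrt(1 - n / N)
  cat(sprintf("n = %4d  simulated = %.3f  theory = %.3f\n",
              n, spread_of_mean(n), theory))
}

# Drawing all N values without replacement reproduces the population mean
# (up to floating-point rounding)
all.equal(mean(sample(population, N)), mean(population))
```

The spread shrinks roughly like 1 / sqrt(n), so it is small but still positive at n=1000. It vanishes only when the sample is the whole population.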

