
Introductory Statistics with Randomization and Simulation

David Diez

Chapter 4

Inference for numerical data

Chapter Questions

Problem 1

Identify the critical t. An independent random sample is selected from an approximately normal population with unknown standard deviation. Find the degrees of freedom and the critical $t$ value $(t^{*})$ for the given sample size and confidence level.
(a) $n=6, \mathrm{CL}=90 \%$
(b) $n=21, \mathrm{CL}=98 \%$
(c) $n=29, \mathrm{CL}=95 \%$
(d) $n=12, \mathrm{CL}=99 \%$

Prabhakar Kumar
Numerade Educator

Problem 2

Working backwards, Part I. A $90 \%$ confidence interval for a population mean is $(65, 77)$. The population distribution is approximately normal and the population standard deviation is unknown. This confidence interval is based on a simple random sample of 25 observations. Calculate the sample mean, the margin of error, and the sample standard deviation.
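
The relationships needed to work backwards from a $t$-based interval can be sketched in Python. This is a minimal sketch rather than the textbook's solution: the function name `work_backwards` is hypothetical, and the critical value $t^{*}_{24} \approx 1.711$ is an assumption read from a standard $t$-table for 90% confidence.

```python
import math

def work_backwards(lower, upper, n, t_star):
    """Recover the sample mean, margin of error, and sample SD from a t-interval.

    Assumes the interval has the form  mean +/- t_star * s / sqrt(n).
    """
    mean = (lower + upper) / 2          # the interval is centered on the sample mean
    margin = (upper - lower) / 2        # half the interval width
    s = margin * math.sqrt(n) / t_star  # solve margin = t_star * s / sqrt(n) for s
    return mean, margin, s

# Assumed table value: t* for df = 25 - 1 = 24 at 90% confidence.
T_STAR_DF24_90 = 1.711

mean, margin, s = work_backwards(65, 77, n=25, t_star=T_STAR_DF24_90)
print(mean, margin, round(s, 2))
```

The same function applies to any interval of this form once the appropriate $t^{*}$ for the degrees of freedom and confidence level is supplied.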

Jameson Kuper
Numerade Educator

Problem 3

Working backwards, Part II. A $95 \%$ confidence interval for a population mean, $\mu$, is given as $(18.985, 21.015)$. This confidence interval is based on a simple random sample of 36 observations. Calculate the sample mean and standard deviation. Assume that all conditions necessary for inference are satisfied. Use the $t$ distribution in any calculations.

Harsh Gadhiya
Numerade Educator

Problem 4

Find the p-value. An independent random sample is selected from an approximately normal population with an unknown standard deviation. Find the p-value for the given set of hypotheses and $T$ test statistic. Also determine if the null hypothesis would be rejected at $\alpha=0.05$.
(a) $H_{A}: \mu>\mu_{0}, n=11, T=1.91$
(b) $H_{A}: \mu<\mu_{0}, n=17, T=-3.45$
(c) $H_{A}: \mu \neq \mu_{0}, n=7, T=0.83$
(d) $H_{A}: \mu>\mu_{0}, n=28, T=2.13$

Willis James
Numerade Educator

Problem 5

Sleep habits of New Yorkers. New York is known as "the city that never sleeps". A random sample of 25 New Yorkers were asked how much sleep they get per night. Statistical summaries of these data are shown below. Do these data provide strong evidence that New Yorkers sleep more or less than 8 hours a night on average?
\begin{tabular}{rrrrr}
\hline $\mathrm{n}$ & $\bar{x}$ & $\mathrm{~s}$ & $\min$ & $\max$ \\
\hline 25 & 7.73 & 0.77 & 6.17 & 9.78 \\
\hline
\end{tabular}
(a) Write the hypotheses in symbols and in words.
(b) Check conditions, then calculate the test statistic, $T$, and the associated degrees of freedom.
(c) Find and interpret the p-value in this context. Drawing a picture may be helpful.
(d) What is the conclusion of the hypothesis test?
(e) If you were to construct a $95 \%$ confidence interval that corresponded to this hypothesis test, would you expect 8 hours to be in the interval?
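
For part (b), the test statistic and degrees of freedom follow directly from the summary table. A minimal Python sketch, assuming the usual one-sample statistic $T = (\bar{x} - \mu_0)/(s/\sqrt{n})$ with $n-1$ degrees of freedom; the function name `one_sample_t` is hypothetical and not taken from the textbook.

```python
import math

def one_sample_t(x_bar, mu0, s, n):
    """Return (T, df) for a one-sample t-test of H0: mu = mu0."""
    se = s / math.sqrt(n)            # standard error of the sample mean
    return (x_bar - mu0) / se, n - 1

# Summary statistics from the table above; the null value is 8 hours.
T, df = one_sample_t(x_bar=7.73, mu0=8, s=0.77, n=25)
print(round(T, 2), df)
```

The p-value in part (c) would then be the two-sided tail area beyond $|T|$ under a $t$ distribution with `df` degrees of freedom, which this sketch leaves to a $t$-table or statistical software.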

Sheryl Ezze
Numerade Educator

Problem 6

Fuel efficiency of Prius. Fueleconomy.gov, the official US government source for fuel economy information, allows users to share gas mileage information on their vehicles. The histogram below shows the distribution of gas mileage in miles per gallon (MPG) from 14 users who drive a 2012 Toyota Prius. The sample mean is $53.3 \mathrm{MPG}$ and the standard deviation is $5.2 \mathrm{MPG}$. Note that these data are user estimates and, since the source data cannot be verified, the accuracy of these estimates is not guaranteed.
(a) We would like to use these data to evaluate the average gas mileage of all 2012 Prius drivers. Do you think this is reasonable? Why or why not?
(b) The EPA claims that a 2012 Prius gets 50 MPG (city and highway mileage combined). Do these data provide strong evidence against this estimate for drivers who participate on fueleconomy.gov? Note any assumptions you must make as you proceed with the test.
(c) Calculate a $95 \%$ confidence interval for the average gas mileage of a 2012 Prius by drivers who participate on fueleconomy.gov.

James Kiss
Numerade Educator

Problem 7

Find the mean. You are given the following hypotheses:
$$
\begin{array}{l}
H_{0}: \mu=60 \\
H_{A}: \mu<60
\end{array}
$$
We know that the sample standard deviation is 8 and the sample size is 20. For what sample mean would the p-value be equal to $0.05$? Assume that all conditions necessary for inference are satisfied.
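
One way to set this up: for a lower-tailed test, a p-value of 0.05 corresponds to a test statistic equal to the negative of the one-sided 5% critical value, so the sample mean can be solved for directly. The sketch below assumes a table value of $t^{*}_{19} \approx 1.729$; the function name `mean_for_lower_tail` is hypothetical.

```python
import math

def mean_for_lower_tail(mu0, s, n, t_crit):
    """Sample mean whose T statistic equals -t_crit in a lower-tailed test.

    Solves (x_bar - mu0) / (s / sqrt(n)) = -t_crit for x_bar.
    """
    se = s / math.sqrt(n)      # standard error of the sample mean
    return mu0 - t_crit * se

# Assumed table value: one-sided 5% critical value for df = 20 - 1 = 19.
T_CRIT_DF19 = 1.729

x_bar = mean_for_lower_tail(mu0=60, s=8, n=20, t_crit=T_CRIT_DF19)
print(round(x_bar, 2))
```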

Eleanor Archer
Numerade Educator

Problem 8

$t^{*}$ vs. $z^{*}$. For a given confidence level, $t_{d f}^{*}$ is larger than $z^{*}$. Explain how $t_{d f}^{*}$ being slightly larger than $z^{*}$ affects the width of the confidence interval.

Nick Johnson
Numerade Educator

Problem 9

Climate change, Part I. Is there strong evidence of climate change? Let's consider a small scale example, comparing how temperatures have changed in the US from 1968 to 2008. The daily high temperature reading on January 1 was collected in 1968 and 2008 for 51 randomly selected locations in the continental US. Then the difference between the two readings (temperature in 2008 - temperature in 1968) was calculated for each of the 51 different locations. The average of these 51 values was 1.1 degrees with a standard deviation of 4.9 degrees.
(a) Is there a relationship between the observations collected in 1968 and 2008? Or are the observations in the two groups independent? Explain.
(b) Write hypotheses for this research in symbols and in words.
(c) Check the conditions required to complete this test.
(d) Calculate the test statistic and find the p-value.
(e) What do you conclude? Interpret your conclusion in context.
(f) What type of error might we have made? Explain in context what the error means.
(g) Based on the results of this hypothesis test, would you expect a confidence interval for the average difference between the temperature measurements from 1968 and 2008 to include 0? Explain your reasoning.

James Kiss
Numerade Educator

Problem 10

High School and Beyond, Part I. The National Center of Education Statistics conducted a survey of high school seniors, collecting test data on reading, writing, and several other subjects. Here we examine a simple random sample of 200 students from this survey. Side-by-side box plots of reading and writing scores as well as a histogram of the differences in scores are shown below.
(a) Is there a clear difference in the average reading and writing scores?
(b) Are the reading and writing scores of each student independent of each other?
(c) Create hypotheses appropriate for the following research question: is there an evident difference in the average scores of students in the reading and writing exam?
(d) Check the conditions required to complete this test.
(e) The average observed difference in scores is $\bar{x}_{\text {read-write }}=-0.545$, and the standard deviation of the differences is 8.887 points. Do these data provide convincing evidence of a difference between the average scores on the two exams?
(f) What type of error might we have made? Explain what the error means in the context of the application.
(g) Based on the results of this hypothesis test, would you expect a confidence interval for the average difference between the reading and writing scores to include 0? Explain your reasoning.

Harsh Gadhiya
Numerade Educator

Problem 11

Climate change, Part II. We considered the differences between the temperature readings on January 1 of 1968 and 2008 at 51 locations in the continental US in Exercise 4.9. The mean and standard deviation of the reported differences are 1.1 degrees and 4.9 degrees.
(a) Calculate a $95 \%$ confidence interval for the average difference between the temperature measurements from 1968 and 2008.
(b) Interpret this interval in context.
(c) Does the confidence interval provide convincing evidence that the temperature was different in 2008 than in 1968 in the continental US? Explain.

Sheryl Ezze
Numerade Educator

Problem 12

High School and Beyond, Part II. We considered the differences between the reading and writing scores of a random sample of 200 students who took the High School and Beyond Survey in Exercise 4.10. The mean and standard deviation of the differences are $\bar{x}_{\text {read-write }}=-0.545$ and 8.887 points.
(a) Calculate a $95 \%$ confidence interval for the average difference between the reading and writing scores of all students.
(b) Interpret this interval in context.
(c) Does the confidence interval provide convincing evidence that there is a real difference in the average scores? Explain.

James Kiss
Numerade Educator

Problem 13

Gifted children. Researchers collected a simple random sample of 36 children who had been identified as gifted in a large city. The following histograms show the distributions of the IQ scores of mothers and fathers of these children. Also provided are some sample statistics.
(a) Are the IQs of mothers and the IQs of fathers in this data set related? Explain.
(b) Conduct a hypothesis test to evaluate if the scores are equal on average. Make sure to clearly state your hypotheses, check the relevant conditions, and state your conclusion in the context of the data.

Sheryl Ezze
Numerade Educator

Problem 14

Paired or not? In each of the following scenarios, determine if the data are paired.
(a) We would like to know if Intel's stock and Southwest Airlines' stock have similar rates of return. To find out, we take a random sample of 50 days for Intel's stock and another random sample of 50 days for Southwest's stock.
(b) We randomly sample 50 items from Target stores and note the price for each. Then we visit Walmart and collect the price for each of those same 50 items.
(c) A school board would like to determine whether there is a difference in average SAT scores for students at one high school versus another high school in the district. To check, they take a simple random sample of 100 students from each high school.

Sheryl Ezze
Numerade Educator

Problem 15

Math scores of 13 year olds, Part I. The National Assessment of Educational Progress tested a simple random sample of 1,000 thirteen year old students in both 2004 and 2008 (two separate simple random samples). The average and standard deviation in 2004 were 257 and 39, respectively. In 2008, the average and standard deviation were 260 and 38, respectively. Calculate a $90 \%$ confidence interval for the change in average scores from 2004 to 2008, and interpret this interval in the context of the application. (Reminder: check conditions.)

James Kiss
Numerade Educator

Problem 16

Work hours and education, Part I. The General Social Survey collects data on demographics, education, and work, among many other characteristics of US residents. The histograms below display the distributions of hours worked per week for two education groups: those with and without a college degree. Suppose we want to estimate the average difference between the number of hours worked per week by all Americans with a college degree and those without a college degree. Summary information for each group is shown in the tables.
(a) What is the parameter of interest, and what is the point estimate?
(b) Are conditions satisfied for estimating this difference using a confidence interval?
(c) Create a $95 \%$ confidence interval for the difference in number of hours worked between the two groups, and interpret the interval in context.
(d) Can you think of any real world justification for your results? (Note: There isn't a single correct answer to this question.)

Rashmi Sinha
Numerade Educator

Problem 17

Math scores of 13 year olds, Part II. Exercise 4.15 provides data on the average math scores from tests conducted by the National Assessment of Educational Progress in 2004 and 2008. Two separate simple random samples were taken in each of these years. The average and standard deviation in 2004 were 257 and 39, respectively. In 2008, the average and standard deviation were 260 and 38, respectively.
(a) Do these data provide strong evidence that the average math score for 13 year old students has changed from 2004 to 2008? Use a $10 \%$ significance level.
(b) It is possible that your conclusion in part (a) is incorrect. What type of error is possible for this conclusion? Explain.
(c) Based on your hypothesis test, would you expect a $90 \%$ confidence interval to contain the null value? Explain.

James Kiss
Numerade Educator

Problem 18

Work hours and education, Part II. The General Social Survey described in Exercise 4.16 included random samples from two groups: US residents with a college degree and US residents without a college degree. For the 505 sampled US residents with a college degree, the average number of hours worked each week was 41.8 hours with a standard deviation of 15.1 hours. For those 667 without a degree, the mean was 39.4 hours with a standard deviation of 15.1 hours. Conduct a hypothesis test to check for a difference in the average number of hours worked for the two groups.

Diane Koenig
Numerade Educator

Problem 19

Does the Paleo diet work? The Paleo diet allows only for foods that humans typically consumed over the last 2.5 million years, excluding those agriculture-type foods that arose during the last 10,000 years or so. Researchers randomly divided 500 volunteers into two equal-sized groups. One group spent 6 months on the Paleo diet. The other group received a pamphlet about controlling portion sizes. Randomized treatment assignment was performed, and at the beginning of the study, the average difference in weights between the two groups was about 0. After the study, the Paleo group had lost on average 7 pounds with a standard deviation of 20 pounds while the control group had lost on average 5 pounds with a standard deviation of 12 pounds.
(a) The $95 \%$ confidence interval for the difference between the two population parameters (Paleo - control) is given as $(-0.891, 4.891)$. Interpret this interval in the context of the data.
(b) Based on this confidence interval, do the data provide convincing evidence that the Paleo diet is more effective for weight loss than the pamphlet (control)? Explain your reasoning.
(c) Without explicitly performing the hypothesis test, do you think that if the Paleo group had lost 8 instead of 7 pounds on average, and everything else was the same, the results would then indicate a significant difference between the treatment and control groups? Explain your reasoning.
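
The interval quoted in part (a) can be reproduced from the summary statistics in the problem statement. The sketch below is one plausible reconstruction, not necessarily how the textbook obtained it: it assumes the two groups of 250 are independent and uses the normal critical value $z^{*} \approx 1.96$, which is reasonable at these sample sizes.

```python
import math
from statistics import NormalDist

# Summary statistics from the problem statement: 500 volunteers split evenly.
n_paleo = n_control = 500 // 2
mean_paleo, sd_paleo = 7, 20        # pounds lost, Paleo group
mean_control, sd_control = 5, 12    # pounds lost, control group

# Standard error of the difference between two independent sample means.
se = math.sqrt(sd_paleo**2 / n_paleo + sd_control**2 / n_control)

# Assumption: normal critical value for 95% confidence (large samples).
z_star = NormalDist().inv_cdf(0.975)

diff = mean_paleo - mean_control    # Paleo - control
lower, upper = diff - z_star * se, diff + z_star * se
print(round(lower, 3), round(upper, 3))
```

Under these assumptions the bounds round to the values quoted in part (a). Changing `mean_paleo` to 8 shifts both bounds up by one pound, which is a convenient way to check the reasoning asked for in part (c).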

Joanna Quigley
Numerade Educator

Problem 20

Weight gain during pregnancy. In 2004, the state of North Carolina released to the public a large data set containing information on births recorded in this state. This data set has been of interest to medical researchers who are studying the relationship between habits and practices of expectant mothers and the birth of their children. The following histograms show the distributions of weight gain during pregnancy by 867 younger moms (less than 35 years old) and 133 mature moms (35 years old and over) who have been randomly sampled from this large data set. The average weight gain of younger moms is 30.56 pounds, with a standard deviation of 14.35 pounds, and the average weight gain of mature moms is 28.79 pounds, with a standard deviation of 13.48 pounds. Calculate a $95 \%$ confidence interval for the difference between the average weight gain of younger and mature moms. Also comment on whether or not this interval provides strong evidence that there is a significant difference between the two population means.

Lynn Larson
Numerade Educator

Problem 21

Body fat in women and men. The third National Health and Nutrition Examination Survey collected body fat percentage (BF) data from 13,601 subjects whose ages are 20 to 80. A summary table for these data is given below. Note that BF is given as mean $\pm$ standard error. Construct a $95 \%$ confidence interval for the difference in average body fat percentages between men and women, and explain the meaning of this interval. Tip: the standard error can be calculated as $S E=\sqrt{S E_{M}^{2}+S E_{W}^{2}}$.

James Kiss
Numerade Educator

Problem 22

Child care hours, Part I. The China Health and Nutrition Survey aims to examine the effects of the health, nutrition, and family planning policies and programs implemented by national and local governments. One of the variables collected on the survey is the number of hours parents spend taking care of children in their household under age 6 (feeding, bathing, dressing, holding, or watching them). In 2006, 487 females and 312 males were surveyed for this question. On average, females reported spending 31 hours with a standard deviation of 31 hours, and males reported spending 16 hours with a standard deviation of 21 hours. Calculate a $95 \%$ confidence interval for the difference between the average number of hours Chinese males and females spend taking care of their children under age 6. Also comment on whether this interval suggests a significant difference between the two population parameters. You may assume that conditions for inference are satisfied.

Bryan Meares
Numerade Educator

Problem 23

Cleveland vs. Sacramento. Average income varies from one region of the country to another, and it often reflects both lifestyles and regional living expenses. Suppose a new graduate is considering a job in two locations, Cleveland, OH and Sacramento, CA, and he wants to see whether the average income in one of these cities is higher than the other. He would like to conduct a $t$ test based on two small samples from the 2000 Census, but he first must consider whether the conditions are met to implement the test. Below are histograms for each city. Should he move forward with the $t$ test? Explain your reasoning.

Victor Salazar
Numerade Educator

Problem 24

Oscar winners. The first Oscar awards for best actor and best actress were given out in 1929. The histograms below show the age distribution for all of the best actor and best actress winners from 1929 to 2012. Summary statistics for these distributions are also provided. Is a $t$ test appropriate for evaluating whether the difference in the average ages of best actors and actresses might be due to chance? Explain your reasoning.

Jeff Vermeire
Numerade Educator

Problem 25

Friday the $13^{\text {th }}$, Part I. In the early 1990's, researchers in the UK collected data on traffic flow, number of shoppers, and traffic accident related emergency room admissions on Friday the $13^{\text {th }}$ and the previous Friday, Friday the $6^{\text {th }}$. The histograms below show the distribution of number of cars passing by a specific intersection on Friday the $6^{\text {th }}$ and Friday the $13^{\text {th }}$ for many such date pairs. Also given are some sample statistics, where the difference is the number of cars on the $6^{\text {th }}$ minus the number of cars on the $13^{\text {th }}$.
(a) Are there any underlying structures in these data that should be considered in an analysis? Explain.
(b) What are the hypotheses for evaluating whether the number of people out on Friday the $6^{\text {th }}$ is different than the number out on Friday the $13^{\text {th }}$?
(c) Check conditions to carry out the hypothesis test from part (b).
(d) Calculate the test statistic and the p-value.
(e) What is the conclusion of the hypothesis test?
(f) Interpret the p-value in this context.
(g) What type of error might have been made in the conclusion of your test? Explain.

Rashmi Sinha
Numerade Educator

Problem 26

Diamonds, Part I. Prices of diamonds are determined by what is known as the 4 Cs: cut, clarity, color, and carat weight. The prices of diamonds go up as the carat weight increases, but the increase is not smooth. For example, the difference between the size of a 0.99 carat diamond and a 1 carat diamond is undetectable to the naked human eye, but the price of a 1 carat diamond tends to be much higher than the price of a 0.99 carat diamond. In this question we use two random samples of diamonds, 0.99 carats and 1 carat, each sample of size 23, and compare the average prices of the diamonds. In order to be able to compare equivalent units, we first divide the price for each diamond by 100 times its weight in carats. That is, for a 0.99 carat diamond, we divide the price by 99. For a 1 carat diamond, we divide the price by 100. The distributions and some sample statistics are shown below. Conduct a hypothesis test to evaluate if there is a difference between the average standardized prices of 0.99 and 1 carat diamonds. Make sure to state your hypotheses clearly, check relevant conditions, and interpret your results in context of the data.

Sheryl Ezze
Numerade Educator

Problem 27

Friday the $13^{\text {th }}$, Part II. The Friday the $13^{\text {th }}$ study reported in Exercise 4.25 also provides data on traffic accident related emergency room admissions. The distributions of these counts from Friday the $6^{\text {th }}$ and Friday the $13^{\text {th }}$ are shown below for six such paired dates along with summary statistics. You may assume that conditions for inference are met.
(a) Conduct a hypothesis test to evaluate if there is a difference between the average numbers of traffic accident related emergency room admissions between Friday the $6^{\text {th }}$ and Friday the $13^{\text {th }}$.
(b) Calculate a $95 \%$ confidence interval for the difference between the average numbers of traffic accident related emergency room admissions between Friday the $6^{\text {th }}$ and Friday the $13^{\text {th }}$.
(c) The conclusion of the original study states, "Friday 13th is unlucky for some. The risk of hospital admission as a result of a transport accident may be increased by as much as $52 \%$. Staying at home is recommended." Do you agree with this statement? Explain your reasoning.

Rashmi Sinha
Numerade Educator

Problem 28

Diamonds, Part II. In Exercise 4.26, we discussed diamond prices (standardized by weight) for diamonds with weights 0.99 carats and 1 carat. See the table for summary statistics, and then construct a $95 \%$ confidence interval for the average difference between the standardized prices of 0.99 and 1 carat diamonds. You may assume the conditions for inference are met.

James Kiss
Numerade Educator

Problem 29

Chicken diet and weight, Part I. Chicken farming is a multi-billion dollar industry, and any methods that increase the growth rate of young chicks can reduce consumer costs while increasing company profits, possibly by millions of dollars. An experiment was conducted to measure and compare the effectiveness of various feed supplements on the growth rate of chickens. Newly hatched chicks were randomly allocated into six groups, and each group was given a different feed supplement. Below are some summary statistics from this data set along with box plots showing the distribution of weights by feed type.
(a) Describe the distributions of weights of chickens that were fed linseed and horsebean.
(b) Do these data provide strong evidence that the average weights of chickens that were fed linseed and horsebean are different? Use a $5 \%$ significance level.
(c) What type of error might we have committed? Explain.
(d) Would your conclusion change if we used $\alpha=0.01$?

James Kiss
Numerade Educator

Problem 30

Fuel efficiency of manual and automatic cars, Part I. Each year the US Environmental Protection Agency (EPA) releases fuel economy data on cars manufactured in that year. Below are summary statistics on fuel efficiency (in miles/gallon) from random samples of cars with manual and automatic transmissions manufactured in 2012. Do these data provide strong evidence of a difference between the average fuel efficiency of cars with manual and automatic transmissions in terms of their average city mileage? Assume that conditions for inference are satisfied.

Rashmi Sinha
Numerade Educator

Problem 31

Chicken diet and weight, Part II. Casein is a common weight gain supplement for humans. Does it have an effect on chickens? Using data provided in Exercise 4.29, test the hypothesis that the average weight of chickens that were fed casein is different than the average weight of chickens that were fed soybean. If your hypothesis test yields a statistically significant result, discuss whether or not the higher average weight of chickens can be attributed to the casein diet. Assume that conditions for inference are satisfied.

James Kiss
Numerade Educator

Problem 32

Fuel efficiency of manual and automatic cars, Part II. The table provides summary statistics on highway fuel economy of cars manufactured in 2012 (from Exercise 4.30). Use these statistics to calculate a $98 \%$ confidence interval for the difference between average highway mileage of manual and automatic cars, and interpret this interval in the context of the data.

Rashmi Sinha
Numerade Educator

Problem 33

Gaming and distracted eating, Part I. A group of researchers are interested in the possible effects of distracting stimuli during eating, such as an increase or decrease in the amount of food consumption. To test this hypothesis, they monitored food intake for a group of 44 patients who were randomized into two equal groups. The treatment group ate lunch while playing solitaire, and the control group ate lunch without any added distractions. Patients in the treatment group ate 52.1 grams of biscuits, with a standard deviation of 45.1 grams, and patients in the control group ate 27.1 grams of biscuits, with a standard deviation of 26.4 grams. Do these data provide convincing evidence that the average food intake (measured in amount of biscuits consumed) is different for the patients in the treatment group? Assume that conditions for inference are satisfied.

Jameson Kuper
Numerade Educator

Problem 34

Gaming and distracted eating, Part II. The researchers from Exercise 4.33 also investigated the effects of being distracted by a game on how much people eat. The 22 patients in the treatment group who ate their lunch while playing solitaire were asked to do a serial-order recall of the food lunch items they ate. The average number of items recalled by the patients in this group was 4.9, with a standard deviation of 1.8. The average number of items recalled by the patients in the control group (no distraction) was 6.1, with a standard deviation of 1.8. Do these data provide strong evidence that the average number of food items recalled by the patients in the treatment and control groups are different?

Nick Johnson
Numerade Educator

Problem 35

Prison isolation experiment, Part I. Subjects from Central Prison in Raleigh, NC, volunteered for an experiment involving an "isolation" experience. The goal of the experiment was to find a treatment that reduces subjects' psychopathic deviant T scores. This score measures a person's need for control or their rebellion against control, and it is part of a commonly used mental health test called the Minnesota Multiphasic Personality Inventory (MMPI) test. The experiment had three treatment groups:

James Kiss
Numerade Educator


Problem 36

True or false, Part I. Determine if the following statements are true or false, and explain your reasoning for statements you identify as false.
(a) When comparing means of two samples where $n_{1}=20$ and $n_{2}=40$, we can use the normal model for the difference in means since $n_{2} \geq 30$.
(b) As the degrees of freedom increases, the T distribution approaches normality.
(c) We use a pooled standard error for calculating the standard error of the difference between means when sample sizes of groups are equal to each other.

Tyler Moulton
Numerade Educator

Problem 37

Chicken diet and weight, Part III. In Exercises 4.29 and 4.31 we compared the effects of two types of feed at a time. A better analysis would first consider all feed types at once: casein, horsebean, linseed, meat meal, soybean, and sunflower. The ANOVA output below can be used to test for differences between the average weights of chicks on different diets.

James Kiss
Numerade Educator

Problem 38

Student performance across discussion sections. A professor who teaches a large introductory statistics class (197 students) with eight discussion sections would like to test if student performance differs by discussion section, where each discussion section has a different teaching assistant. The summary table below shows the average final exam score for each discussion section as well as the standard deviation of scores and the number of students in each section.

James Kiss
Numerade Educator

Problem 39

Coffee, depression, and physical activity. Caffeine is the world's most widely used stimulant, with approximately $80 \%$ consumed in the form of coffee. Participants in a study investigating the relationship between coffee consumption and exercise were asked to report the number of hours they spent per week on moderate (e.g., brisk walking) and vigorous (e.g., strenuous sports and jogging) exercise. Based on these data the researchers estimated the total hours of metabolic equivalent tasks (MET) per week, a value always greater than 0. The table below gives summary statistics of MET for women in this study based on the amount of coffee consumed.
(a) Write the hypotheses for evaluating if the average physical activity level varies among the different levels of coffee consumption.
(b) Check conditions and describe any assumptions you must make to proceed with the test.
(c) Below is part of the output associated with this test. Fill in the empty cells.
(d) What is the conclusion of the test?

James Kiss
Numerade Educator

Problem 40

Work hours and education, Part III. In Exercises 4.16 and 4.18 you worked with data from the General Social Survey in order to compare the average number of hours worked per week by US residents with and without a college degree. However, this analysis didn't take advantage of the original data which contained more accurate information on educational attainment (less than high school, high school, junior college, Bachelor's, and graduate school). Using ANOVA, we can consider educational attainment levels for all 1,172 respondents at once instead of re-categorizing them into two groups. Below are the distributions of hours worked by educational attainment and relevant summary statistics that will be helpful in carrying out this analysis.
(a) Write hypotheses for evaluating whether the average number of hours worked varies across the five groups.
(b) Check conditions and describe any assumptions you must make to proceed with the test.
(c) Below is part of the output associated with this test. Fill in the empty cells.
(d) What is the conclusion of the test?

Sheryl Ezze
Numerade Educator
02:29

Problem 41

GPA and major. Undergraduate students taking an introductory statistics course at Duke University conducted a survey about GPA and major. The side-by-side box plots show the distribution of GPA among three groups of majors. Also provided is the ANOVA output.
(a) Write the hypotheses for testing for a difference between average GPA across majors.
(b) What is the conclusion of the hypothesis test?
(c) How many students answered these questions on the survey, i.e. what is the sample size?

Jameson Kuper
Numerade Educator
02:12

Problem 42

Child care hours, Part II. Exercise 4.22 introduces the China Health and Nutrition Survey which, among other things, collects information on the number of hours Chinese parents spend taking care of their children under age $6$. The side-by-side box plots below show the distribution of this variable by educational attainment of the parent. Also provided below is the ANOVA output for comparing average hours across educational attainment categories.
(a) Write the hypotheses for testing for a difference between the average number of hours spent on child care across educational attainment levels.
(b) What is the conclusion of the hypothesis test?

Kari Hasz
Numerade Educator
01:24

Problem 43

True or false, Part II. Determine if the following statements are true or false in ANOVA, and explain your reasoning for statements you identify as false.
(a) As the number of groups increases, the modified significance level for pairwise tests increases as well.
(b) As the total sample size increases, the degrees of freedom for the residuals increases as well.
(c) The constant variance condition can be somewhat relaxed when the sample sizes are relatively consistent across groups.
(d) The independence assumption can be relaxed when the total sample size is large.

Tyler Moulton
Numerade Educator
03:37

Problem 44

True or false, Part III. Determine if the following statements are true or false, and explain your reasoning for statements you identify as false.

If the null hypothesis that the means of four groups are all the same is rejected using ANOVA at a $5 \%$ significance level, then ...
(a) we can then conclude that all the means are different from one another.
(b) the standardized variability between groups is higher than the standardized variability within groups.
(c) the pairwise analysis will identify at least one pair of means that are significantly different.
(d) the appropriate $\alpha$ to be used in pairwise comparisons is $0.05 / 4=0.0125$ since there are four groups.
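Statements about the modified significance level depend on how many pairwise comparisons are made. As a hedged sketch, assuming the Bonferroni correction $\alpha^{*}=\alpha / K$ with $K=k(k-1) / 2$ comparisons among $k$ groups, the Python below computes both quantities. The function name `bonferroni_alpha` and the choice of five groups are illustrative assumptions, not values from the exercise.

```python
from math import comb


def bonferroni_alpha(alpha, k):
    """Return (modified alpha, number of pairwise comparisons) for k groups."""
    n_comparisons = comb(k, 2)          # K = k(k-1)/2 distinct pairs of groups
    return alpha / n_comparisons, n_comparisons


# Illustrative case: five groups at an overall 5% significance level.
alpha_star, n_pairs = bonferroni_alpha(0.05, 5)
print(n_pairs, alpha_star)
```

Because the divisor counts pairs of groups rather than groups, the modified level shrinks quickly as groups are added.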

Rashmi Sinha
Numerade Educator
03:21

Problem 45

Prison isolation experiment, Part II. Exercise 4.35 introduced an experiment that was conducted with the goal of identifying a treatment that reduces subjects' psychopathic deviant $\mathrm{T}$ scores, where this score measures a person's need for control or his rebellion against control. In Exercise 4.35 you evaluated the success of each treatment individually. An alternative analysis involves comparing the success of treatments. The relevant ANOVA output is given below.
(a) What are the hypotheses?
(b) What is the conclusion of the test? Use a $5 \%$ significance level.
(c) If in part (b) you determined that the test is significant, conduct pairwise tests to determine which groups are different from each other. If you did not reject the null hypothesis in part (b), recheck your solution.

James Kiss
Numerade Educator
01:58

Problem 46

Poker winnings. An aspiring poker player recorded her winnings and losses over 50 evenings of play, summarized in the figure on the left. The daily winnings averaged $\$90.08$, but were very volatile, with a standard deviation of $\$703.68$. The poker player would like to better understand how precisely this standard deviation estimates the volatility of her long-term play, so she constructed a bootstrap distribution for the standard deviation, shown on the right.
(a) Describe the distribution.
(b) Determine whether the bootstrap method is suitable for constructing a confidence interval for the standard deviation in this exercise.
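A bootstrap distribution for a standard deviation is built by resampling the observed data with replacement and recomputing the statistic each time. The Python below is a minimal sketch of that procedure with a percentile interval. The data are synthetic draws from a normal distribution, which is an assumption made only so the code runs; the player's actual winnings are not given here and may be far from normal. The names `bootstrap_sd` and `percentile_interval` are hypothetical.

```python
import random
import statistics


def bootstrap_sd(sample, n_resamples=2000, seed=0):
    """Return bootstrap replicates of the sample standard deviation."""
    rng = random.Random(seed)
    n = len(sample)
    replicates = []
    for _ in range(n_resamples):
        resample = rng.choices(sample, k=n)   # draw n values with replacement
        replicates.append(statistics.stdev(resample))
    return replicates


def percentile_interval(values, level=0.95):
    """Return the central `level` share of the sorted bootstrap replicates."""
    ordered = sorted(values)
    lo_idx = int((1 - level) / 2 * len(ordered))
    hi_idx = int((1 + level) / 2 * len(ordered)) - 1
    return ordered[lo_idx], ordered[hi_idx]


# Synthetic stand-in for 50 evenings of winnings (assumed, not the real data).
data_rng = random.Random(1)
winnings = [data_rng.gauss(90, 700) for _ in range(50)]

boot = bootstrap_sd(winnings)
low, high = percentile_interval(boot)
print(round(low, 2), round(high, 2))
```

Whether such an interval is trustworthy depends on the shape of the bootstrap distribution, which is what parts (a) and (b) ask you to assess.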

Kari Hasz
Numerade Educator
01:49

Problem 47

Heights of adults. Researchers studying anthropometry collected body girth measurements and skeletal diameter measurements, as well as age, weight, height and gender, for 507 physically active individuals. The histogram below shows the sample distribution of heights in centimeters. We would like to get $95 \%$ confidence bounds for the standard deviation of the heights in the population. For this exercise, you may assume the sample is a simple random sample from the population of interest.

Hailey Tomashek
Numerade Educator