# The Practice of Statistics for AP*

## Daren S. Starnes, Daniel S. Yates, David S. Moore

## Chapter 10

## Comparing Two Populations or Groups

### Problem 1

**Toyota or Nissan?** Are Toyota or Nissan owners more satisfied with their vehicles? Let’s design a study to find out. We’ll select a random sample of 400 Toyota owners and a separate random sample of 400 Nissan owners. Then we’ll ask each individual in the sample: “Would you say that you are generally satisfied with your (Toyota/Nissan) vehicle?”

(a) Is this a problem about comparing means or comparing proportions? Explain.

(b) What type of study design is being used to produce data?

### Problem 2

**Binge drinking** Who is more likely to binge drink, male or female college students? The Harvard School of Public Health surveys random samples of male and female undergraduates at four-year colleges and universities about whether they have engaged in binge drinking.

(a) Is this a problem about comparing means or comparing proportions? Explain.

(b) What type of study design is being used to produce data?

### Problem 3

**Computer gaming** Do experienced computer game players earn higher scores when they play with someone present to cheer them on or when they play alone? Fifty teenagers who are experienced at playing a particular computer game have volunteered for a study. We randomly assign 25 of them to play the game alone and the other 25 to play the game with a supporter present. Each player’s score is recorded.

(a) Is this a problem about comparing means or comparing proportions? Explain.

(b) What type of study design is being used to produce data?

### Problem 4

**Credit cards and incentives** A bank wants to know which of two incentive plans will most increase the use of its credit cards.
It offers each incentive to a group of current credit card customers, determined at random, and compares the amount charged during the following six months.

(a) Is this a problem about comparing means or comparing proportions? Explain.

(b) What type of study design is being used to produce data?

### Problem 5

**I want red!** A candy maker offers Child and Adult bags of jelly beans with different color mixes. The company claims that the Child mix has 30% red jelly beans while the Adult mix contains 15% red jelly beans. Assume that the candy maker’s claim is true. Suppose we take a random sample of 50 jelly beans from the Child mix and a separate random sample of 100 jelly beans from the Adult mix.

(a) Find the probability that the proportion of red jelly beans in the Child sample is less than or equal to the proportion of red jelly beans in the Adult sample. Show your work.

(b) Suppose that the Child and Adult samples contain an equal proportion of red jelly beans. Based on your result in part (a), would this give you reason to doubt the company’s claim? Explain.

### Problem 6

**Literacy** A researcher reports that 80% of high school graduates but only 40% of high school dropouts would pass a basic literacy test. Assume that the researcher's claim is true. Suppose we give a basic literacy test to a random sample of 60 high school graduates and a separate random sample of 75 high school dropouts.

(a) Find the probability that the proportion of graduates who pass the test is at least 0.20 higher than the proportion of dropouts who pass. Show your work.

(b) Suppose that the difference in the sample proportions (graduate – dropout) who pass the test is exactly 0.20. Based on your result in part (a), would this give you reason to doubt the researcher’s claim? Explain.
### Problem 7

Explain why the conditions for using two-sample z procedures to perform inference about $p_{1}-p_{2}$ are not met in the settings of Exercises 7 through 10.

**Don’t drink the water!** The movie A Civil Action (Touchstone Pictures, 1998) tells the story of a major legal battle that took place in the small town of Woburn, Massachusetts. A town well that supplied water to eastern Woburn residents was contaminated by industrial chemicals. During the period that residents drank water from this well, 16 of the 414 babies born had birth defects. On the west side of Woburn, 3 of the 228 babies born during the same time period had birth defects.

### Problem 8

Explain why the conditions for using two-sample z procedures to perform inference about $p_{1}-p_{2}$ are not met in the settings of Exercises 7 through 10.

**In-line skaters** A study of injuries to in-line skaters used data from the National Electronic Injury Surveillance System, which collects data from a random sample of hospital emergency rooms. The researchers interviewed 161 people who came to emergency rooms with injuries from in-line skating. Wrist injuries (mostly fractures) were the most common.$^{6}$ The interviews found that 53 people were wearing wrist guards and 6 of these had wrist injuries. Of the 108 who did not wear wrist guards, 45 had wrist injuries.

### Problem 9

Explain why the conditions for using two-sample z procedures to perform inference about $p_{1}-p_{2}$ are not met in the settings of Exercises 7 through 10.

**Shrubs and fire** Fire is a serious threat to shrubs in dry climates. Some shrubs can resprout from their roots after their tops are destroyed. One study of resprouting took place in a dry area of Mexico.$^{7}$ The investigators randomly assigned shrubs to treatment and control groups. They clipped the tops of all the shrubs. They then applied a propane torch to the stumps of the treatment group to simulate a fire.
All 12 of the shrubs in the treatment group resprouted. Only 8 of the 12 shrubs in the control group resprouted.

### Problem 10

Explain why the conditions for using two-sample z procedures to perform inference about $p_{1}-p_{2}$ are not met in the settings of Exercises 7 through 10.

**Broken crackers** We don’t like to find broken crackers when we open the package. How can makers reduce breaking? One idea is to microwave the crackers for 30 seconds right after baking them. Breaks start as hairline cracks called “checking.” Assign 65 newly baked crackers to the microwave and another 65 to a control group that is not microwaved. After one day, none of the microwave group and 16 of the control group show checking.$^{8}$

### Problem 11

**Who uses instant messaging?** Do younger people use online instant messaging (IM) more often than older people? A random sample of IM users found that 73 of the 158 people in the sample aged 18 to 27 said they used IM more often than email. In the 28 to 39 age group, 26 of 143 people used IM more often than email.$^{9}$ Construct and interpret a 90% confidence interval for the difference between the proportions of IM users in these age groups who use IM more often than email.

### Problem 12

**Listening to rap** Is rap music more popular among young blacks than among young whites? A sample survey compared 634 randomly chosen blacks aged 15 to 25 with 567 randomly selected whites in the same age group. It found that 368 of the blacks and 130 of the whites listened to rap music every day.$^{10}$ Construct and interpret a 95% confidence interval for the difference between the proportions of black and white young people who listen to rap every day.

### Problem 13

**Young adults living at home** A surprising number of young adults (ages 19 to 25) still live in their parents’ homes.
A random sample by the National Institutes of Health included 2253 men and 2629 women in this age group.$^{11}$ The survey found that 986 of the men and 923 of the women lived with their parents.

(a) Construct and interpret a 99% confidence interval for the difference in population proportions (men minus women).

(b) Does your interval from part (a) give convincing evidence of a difference between the population proportions? Explain.

### Problem 14

**Fear of crime** The elderly fear crime more than younger people, even though they are less likely to be victims of crime. One study recruited separate random samples of 56 black women and 63 black men over the age of 65 from Atlantic City, New Jersey. Of the women, 27 said they “felt vulnerable” to crime; 46 of the men said this.$^{12}$

(a) Construct and interpret a 90% confidence interval for the difference in population proportions (men minus women).

(b) Does your interval from part (a) give convincing evidence of a difference between the population proportions? Explain.

### Problem 15

**Who owns iPods?** As part of the Pew Internet and American Life Project, researchers surveyed a random sample of 800 teens and a separate random sample of 400 young adults. For the teens, 79% said that they own an iPod or MP3 player. For the young adults, this figure was 67%. Is there a significant difference between the population proportions? State appropriate hypotheses for a significance test to answer this question. Define any parameters you use.

### Problem 16

**Steroids in high school** A study by the National Athletic Trainers Association surveyed random samples of 1679 high school freshmen and 1366 high school seniors in Illinois. Results showed that 34 of the freshmen and 24 of the seniors had used anabolic steroids.
Steroids, which are dangerous, are sometimes used to improve athletic performance.$^{13}$ Is there a significant difference between the population proportions? State appropriate hypotheses for a significance test to answer this question. Define any parameters you use.

### Problem 17

**Who owns iPods?** Refer to Exercise 15.

(a) Carry out a significance test at the $\alpha=0.05$ level.

(b) Construct and interpret a 95% confidence interval for the difference between the population proportions. Explain how the confidence interval is consistent with the results of the test in part (a).

### Problem 18

**Steroids in high school** Refer to Exercise 16.

(a) Carry out a significance test at the $\alpha=0.05$ level.

(b) Construct and interpret a 95% confidence interval for the difference between the population proportions. Explain how the confidence interval is consistent with the results of the test in part (a).

### Problem 19

**What’s wrong?** “Would you marry a person from a lower social class than your own?” Researchers asked this question of a random sample of 385 black, never-married students at two historically black colleges in the South. Of the 149 men in the sample, 91 said “Yes.” Among the 236 women, 117 said “Yes.”$^{14}$ Is there reason to think that different proportions of men and women in this student population would be willing to marry beneath their class? Holly carried out the significance test shown below to answer this question. Unfortunately, she made some mistakes along the way. Identify as many mistakes as you can, and tell how to correct each one.

State: I want to perform a test of

$$H_{0} : p_{1}=p_{2}$$

$$H_{a} : p_{1} \neq p_{2}$$

at the 95% confidence level.
Plan: If conditions are met, I’ll do a one-sample $z$ test for comparing two proportions.

- Random: The data came from a random sample of 385 black, never-married students.
- Normal: One student's answer to the question should have no relationship to another student's answer.
- Independent: The counts of successes and failures in the two groups (91, 58, 117, and 119) are all at least 10.

Do: From the data, $\hat{p}_{1}=\frac{91}{149}=0.61$ and $\hat{p}_{2}=\frac{117}{236}=0.46$.

- Test statistic:

$$z=\frac{(0.61-0.46)-0}{\sqrt{\frac{0.61(0.39)}{149}+\frac{0.46(0.54)}{236}}}=2.91$$

- P-value: From Table A, $P(z \geq 2.91)=1-0.9982=0.0018$.

Conclude: The P-value, 0.0018, is less than 0.05, so I'll reject the null hypothesis. This proves that a higher proportion of men than women are willing to marry someone from a social class lower than their own.

### Problem 20

**What’s wrong?** A driving school wants to find out which of its two instructors is more effective at preparing students to pass the state’s driver’s license exam. An incoming class of 100 students is randomly assigned to two groups, each of size 50. One group is taught by Instructor A; the other is taught by Instructor B. At the end of the course, 30 of Instructor A’s students and 22 of Instructor B’s students pass the state exam. Do these results give convincing evidence that Instructor A is more effective? Min Jae carried out the significance test shown below to answer this question. Unfortunately, he made some mistakes along the way. Identify as many mistakes as you can, and tell how to correct each one.

State: I want to perform a test of

$$H_{0} : p_{1}-p_{2}=0$$

$$H_{a} : p_{1}-p_{2}>0$$

where $p_{1}=$ the proportion of Instructor A's students that passed the state exam and $p_{2}=$ the proportion of Instructor B's students that passed the state exam.
Since no significance level was stated, I'll use $\sigma=0.05$.

Plan: If conditions are met, I'll do a two-sample $z$ test for comparing two proportions.

- Random: The data came from two random samples of 50 students.
- Normal: The counts of successes and failures in the two groups (30, 20, 22, and 28) are all at least 10.
- Independent: There are at least 1000 students who take this driving school's class.

Do: From the data, $\hat{p}_{1}=\frac{20}{50}=0.40$ and $\hat{p}_{2}=\frac{30}{50}=0.60$. So the pooled proportion of successes is

$$\hat{p}_{C}=\frac{22+30}{50+50}=0.52$$

- Test statistic:

$$z=\frac{(0.40-0.60)-0}{\sqrt{\frac{0.52(0.48)}{100}+\frac{0.52(0.48)}{100}}}=-2.83$$

Conclude: The P-value, 0.9977, is greater than $\alpha=0.05$, so we fail to reject the null hypothesis. There is not convincing evidence that Instructor A's pass rate is higher than Instructor B's.

### Problem 21

**Did the random assignment work?** A large clinical trial of the effect of diet on breast cancer assigned women at random to either a normal diet or a low-fat diet. To check that the random assignment did produce comparable groups, we can compare the two groups at the start of the study. Ask if there is a family history of breast cancer: 3396 of the 19,541 women in the low-fat group and 4929 of the 29,294 women in the control group said "Yes."$^{15}$ If the random assignment worked well, there should not be a significant difference in the proportions with a family history of breast cancer.

(a) How significant is the observed difference? Carry out an appropriate test to help answer this question.

(b) Describe a Type I and a Type II error in this setting. Which is more serious? Explain.

### Problem 22

**Preventing strokes** Aspirin prevents blood from clotting and so helps prevent strokes.
The Second European Stroke Prevention Study asked whether adding another anticlotting drug, named dipyridamole, would be more effective for patients who had already had a stroke. Here are the data on strokes and deaths during the two years of the study:$^{16}$

$$\begin{array}{lcc} & \text{Number of patients} & \text{Number of strokes} \\ \hline \text{Aspirin alone} & 1649 & 206 \\ \text{Aspirin + dipyridamole} & 1650 & 157 \end{array}$$

The study was a randomized comparative experiment.

(a) Is there a significant difference in the proportion of strokes between these two treatments? Carry out an appropriate test to help answer this question.

(b) Describe a Type I and a Type II error in this setting. Which is more serious? Explain.

### Problem 23

Exercises 23 through 26 involve the following setting. Some women would like to have children but cannot do so for medical reasons. One option for these women is a procedure called in vitro fertilization (IVF), which involves injecting a fertilized egg into the woman’s uterus.

**Prayer and pregnancy** Two hundred women who were about to undergo IVF served as subjects in an experiment. Each subject was randomly assigned to either a treatment group or a control group. Women in the treatment group were intentionally prayed for by several people (called intercessors) who did not know them, a process known as intercessory prayer. The praying continued for three weeks following IVF. The intercessors did not pray for the women in the control group. Here are the results: 44 of the 88 women in the treatment group got pregnant, compared to 21 out of 81 in the control group.$^{17}$ Is the pregnancy rate significantly higher for women who received intercessory prayer?
To find out, researchers perform a test of $H_{0} : p_{1}=p_{2}$ versus $H_{a} : p_{1}>p_{2}$, where $p_{1}$ and $p_{2}$ are the actual pregnancy rates for women like those in the study who do and don't receive intercessory prayer, respectively.

(a) Name the appropriate test and check that the conditions for carrying out this test are met.

(b) The appropriate test from part (a) yields a P-value of 0.0007. Interpret this P-value in context.

(c) What conclusion should researchers draw at the $\alpha=0.05$ significance level? Explain.

(d) The women in the study did not know if they were being prayed for. Explain why this is important.

### Problem 24

Exercises 23 through 26 involve the following setting. Some women would like to have children but cannot do so for medical reasons. One option for these women is a procedure called in vitro fertilization (IVF), which involves injecting a fertilized egg into the woman’s uterus.

**Acupuncture and pregnancy** A study reported in the medical journal Fertility and Sterility sought to determine whether the ancient Chinese art of acupuncture could help infertile women become pregnant.$^{18}$ One hundred sixty healthy women who planned to have IVF were recruited for the study. Half of the subjects (80) were randomly assigned to receive acupuncture 25 minutes before embryo transfer and again 25 minutes after the transfer. The remaining 80 women were assigned to a control group and instructed to lie still for 25 minutes after the embryo transfer. Results are shown in the table below.

$$\begin{array}{lcc} & \text{Acupuncture group} & \text{Control group} \\ \text{Pregnant} & 34 & 21 \\ \text{Not pregnant} & 46 & 59 \\ \text{Total} & 80 & 80 \end{array}$$

Is the pregnancy rate significantly higher for women who received acupuncture?
To find out, researchers perform a test of $H_{0} : p_{1}=p_{2}$ versus $H_{a} : p_{1}>p_{2}$, where $p_{1}$ and $p_{2}$ are the actual pregnancy rates for women like those in the study who do and don't receive acupuncture, respectively.

(a) Name the appropriate test and check that the conditions for carrying out this test are met.

(b) The appropriate test from part (a) yields a P-value of 0.0152. Interpret this P-value in context.

(c) What conclusion should researchers draw at the $\alpha=0.05$ significance level? Explain.

(d) What flaw in the design of the experiment prevents us from drawing a cause-and-effect conclusion? Explain.

### Problem 25

Exercises 23 through 26 involve the following setting. Some women would like to have children but cannot do so for medical reasons. One option for these women is a procedure called in vitro fertilization (IVF), which involves injecting a fertilized egg into the woman’s uterus.

**Prayer and pregnancy** Construct and interpret a 99% confidence interval for $p_{1}-p_{2}$ in Exercise 23. Explain what additional information the confidence interval provides.

### Problem 26

Exercises 23 through 26 involve the following setting. Some women would like to have children but cannot do so for medical reasons. One option for these women is a procedure called in vitro fertilization (IVF), which involves injecting a fertilized egg into the woman’s uterus.

**Acupuncture and pregnancy** Construct and interpret a 95% confidence interval for $p_{1}-p_{2}$ in Exercise 24. Explain what additional information the confidence interval provides.

### Problem 27

**Children make choices** Many new products introduced into the market are targeted toward children. The choice behavior of children with regard to new products is of particular interest to companies that design marketing strategies for these products.
As part of one study, randomly selected children in different age groups were compared on their ability to sort new products into the correct product category (milk or juice).$^{19}$ Here are some of the data:

$$\begin{array}{lcc} \text{Age group} & N & \text{Number who sorted correctly} \\ \text{4- to 5-year-olds} & 50 & 10 \\ \text{6- to 7-year-olds} & 53 & 28 \end{array}$$

Are these two age groups equally skilled at sorting? Use information from the Minitab output below to support your answer.

### Problem 28

**Police radar and speeding** Do drivers reduce excessive speed when they encounter police radar? Researchers studied the behavior of a sample of drivers on a rural interstate highway in Maryland where the speed limit was 55 miles per hour. They measured speed with an electronic device hidden in the pavement and, to eliminate large trucks, considered only vehicles less than 20 feet long. During some time periods (determined at random), police radar was set up at the measurement location. Here are some of the data:$^{20}$

$$\begin{array}{lrr} & \text{Number of vehicles} & \text{Number over 65 mph} \\ \text{No radar} & 12{,}931 & 5{,}690 \\ \text{Radar} & 3{,}285 & 1{,}051 \end{array}$$

(a) The researchers chose a rural highway so that cars would be separated rather than in clusters, because some cars might slow when they see other cars slowing. Explain why this is important.

(b) Does the proportion of speeding drivers differ significantly when radar is being used and when it isn’t? Use information from the Minitab computer output below to support your answer.

### Problem 29

Multiple choice: Select the best answer for Exercises 29 to 32.

A sample survey interviews SRSs of 500 female college students and 550 male college students. Each student is asked whether he or she worked for pay last summer.
In all, 410 of the women and 484 of the men say “Yes.” Exercises 29 to 31 are based on this survey.

Take $p_{M}$ and $p_{F}$ to be the proportions of all college males and females who worked last summer. We conjectured before seeing the data that men are more likely to work. The hypotheses to be tested are

(a) $H_{0} : p_{M}-p_{F}=0$ versus $H_{a} : p_{M}-p_{F} \neq 0$

(b) $H_{0} : p_{M}-p_{F}=0$ versus $H_{a} : p_{M}-p_{F}>0$

(c) $H_{0} : p_{M}-p_{F}=0$ versus $H_{a} : p_{M}-p_{F}<0$

(d) $H_{0} : p_{M}-p_{F}>0$ versus $H_{a} : p_{M}-p_{F}=0$

(e) $H_{0} : p_{M}-p_{F} \neq 0$ versus $H_{a} : p_{M}-p_{F}=0$

### Problem 30

Multiple choice: Select the best answer for Exercises 29 to 32.

A sample survey interviews SRSs of 500 female college students and 550 male college students. Each student is asked whether he or she worked for pay last summer. In all, 410 of the women and 484 of the men say “Yes.” Exercises 29 to 31 are based on this survey.

The pooled sample proportion who worked last summer is about

(a) $\hat{p}_{C}=1.70$

(b) $\hat{p}_{C}=0.89$

(c) $\hat{p}_{C}=0.88$

(d) $\hat{p}_{C}=0.85$

(e) $\hat{p}_{C}=0.82$

### Problem 31

Multiple choice: Select the best answer for Exercises 29 to 32.

A sample survey interviews SRSs of 500 female college students and 550 male college students. Each student is asked whether he or she worked for pay last summer. In all, 410 of the women and 484 of the men say “Yes.” Exercises 29 to 31 are based on this survey.

The 95% confidence interval for the difference $p_{M}-p_{F}$ in the proportions of college men and women who worked last summer is about

(a) $0.06 \pm 0.00095$

(b) $0.06 \pm 0.043$

(c) $0.06 \pm 0.036$

(d) $-0.06 \pm 0.043$

(e) $-0.06 \pm 0.036$

### Problem 32

Multiple choice: Select the best answer for Exercises 29 to 32.

A sample survey interviews SRSs of 500 female college students and 550 male college students.
Each student is asked whether he or she worked for pay last summer. In all, 410 of the women and 484 of the men say “Yes.” Exercises 29 to 31 are based on this survey.

In an experiment to learn whether Substance M can help restore memory, the brains of 20 rats were treated to damage their memories. The rats were trained to run a maze. After a day, 10 rats (determined at random) were given M and 7 of them succeeded in the maze. Only 2 of the 10 control rats were successful. The two-sample z test for “no difference” against “a significantly higher proportion of the M group succeeds”

(a) gives $z=2.25, P<0.02$

(b) gives $z=2.60, P<0.005$

(c) gives $z=2.25, P<0.04$ but not $<0.02$

(d) should not be used because the Random condition is violated.

(e) should not be used because the Normal condition is violated.

### Problem 33

Exercises 33 and 34 refer to the following setting. Thirty randomly selected seniors at Council High School were asked to report the age (in years) and mileage of their main vehicles. Here is a scatterplot of the data:

We used Minitab to perform a least-squares regression analysis for these data. Part of the computer output from this regression is shown below.

$$\begin{array}{lrrrr} \text{Predictor} & \text{Coef} & \text{Stdev} & \text{t-ratio} & \text{P} \\ \text{Constant} & -13832 & 8773 & -1.58 & 0.126 \\ \text{Age} & 14954 & 1546 & 9.67 & 0.000 \end{array}$$

$$s=22723 \qquad \text{R-sq}=77.0\% \qquad \text{R-sq(adj)}=76.1\%$$

**Drive my car (3.2)**

(a) What is the equation of the least-squares regression line? Be sure to define any symbols you use.

(b) Interpret the slope of the least-squares line in the context of this problem.

(c) One student reported that her 10-year-old car had 110,000 miles on it. Find the residual for this data value. Show your work.

### Problem 34

Exercises 33 and 34 refer to the following setting. Thirty randomly selected seniors at Council High School were asked to report the age (in years) and mileage of their main vehicles.
Here is a scatterplot of the data:

We used Minitab to perform a least-squares regression analysis for these data. Part of the computer output from this regression is shown below.

$$\begin{array}{lrrrr} \text{Predictor} & \text{Coef} & \text{Stdev} & \text{t-ratio} & \text{P} \\ \text{Constant} & -13832 & 8773 & -1.58 & 0.126 \\ \text{Age} & 14954 & 1546 & 9.67 & 0.000 \end{array}$$

$$s=22723 \qquad \text{R-sq}=77.0\% \qquad \text{R-sq(adj)}=76.1\%$$

**Drive my car (3.2, 4.3)**
(a) Explain what the value of $r^{2}$ tells you about how well the least-squares line fits the data.
(b) The mean age of the students’ cars in the sample was $\bar{x}=8$ years. Find the mean mileage of the cars in the sample. Show your work.
(c) Interpret the value of s in the context of this setting.
(d) Would it be reasonable to use the least-squares line to predict a car’s mileage from its age for a Council High School teacher? Justify your answer.
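
The arithmetic behind Exercises 33(c) and 34(b) can be sketched in a few lines of Python. This is only an illustrative check, not a worked solution from the textbook: the coefficients are copied from the Minitab output above, and the names `predicted_mileage` and `residual` are hypothetical helpers introduced here. It relies on the standard fact that a least-squares line passes through the point $(\bar{x}, \bar{y})$.

```python
# Coefficients copied from the Minitab output in Exercises 33 and 34.
INTERCEPT = -13832  # "Constant" row
SLOPE = 14954       # "Age" row: predicted miles per additional year of age


def predicted_mileage(age_years):
    """Hypothetical helper: y-hat = a + b * x from the least-squares line."""
    return INTERCEPT + SLOPE * age_years


def residual(age_years, observed_mileage):
    """Hypothetical helper: residual = observed y minus predicted y."""
    return observed_mileage - predicted_mileage(age_years)


# The least-squares line passes through (x-bar, y-bar), so plugging in
# the mean age of 8 years gives the mean mileage of the sample.
mean_mileage = predicted_mileage(8)

# Exercise 33(c): a 10-year-old car reported with 110,000 miles.
example_residual = residual(10, 110_000)

print(mean_mileage)      # 105800
print(example_residual)  # -25708
```

A negative residual here means the reported mileage is below what the line predicts for a car of that age.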