The Practice of Statistics for AP*

Daren S. Starnes, Daniel S. Yates, David S. Moore

Chapter 11

Inference for Distributions of Categorical Data

Problem 1

Aw, nuts! A company claims that each batch of its deluxe mixed nuts contains 52% cashews, 27% almonds, 13% macadamia nuts, and 8% brazil nuts. To test this claim, a quality control inspector takes a random sample of 150 nuts from the latest batch. The one-way table below displays the sample data.
(a) State appropriate hypotheses for performing a test of the company’s claim.
(b) Calculate the expected counts for each type of nut. Show your work.

Amany E.
Numerade Educator

Problem 2

Roulette Casinos are required to verify that their games operate as advertised. American roulette wheels have 38 slots—18 red, 18 black, and 2 green. In one casino, managers record data from a random sample of 200 spins of one of their American roulette wheels. The one-way table below displays the results.
$\begin{array}{llll}{\text { Color: }} & {\text { Red }} & {\text { Black }} & {\text { Green }} \\ \hline \text { Count: } & {85} & {99} & {16} \\ \hline\end{array}$
(a) State appropriate hypotheses for testing whether these data give convincing evidence that the distribution of outcomes on this wheel is not what it should be.
(b) Calculate the expected counts for each color. Show your work.

Ahmad R.
Numerade Educator
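
As a rough check on part (b), the expected counts can be computed directly from the slot counts. The sketch below is only illustrative: the variable names are not from the text, and it assumes that under the null hypothesis each of the 38 slots is equally likely on every spin.

```python
from fractions import Fraction

# Slot counts on an American roulette wheel, as given in the problem.
slots = {"Red": 18, "Black": 18, "Green": 2}
n_spins = 200

total_slots = sum(slots.values())  # 38

# Under H0 each slot is equally likely, so P(color) = slots / 38
# and the expected count is n_spins * P(color).
expected = {color: n_spins * Fraction(k, total_slots) for color, k in slots.items()}

for color, e in expected.items():
    print(f"{color}: 200 * {slots[color]}/38 = {float(e):.2f}")
```

The expected counts come out to about 94.74 for red, 94.74 for black, and 10.53 for green, and they sum to 200.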

Problem 3

Aw, nuts! Calculate the chi-square statistic for the data in Exercise 1. Show your work.

Anna M.
Numerade Educator

Problem 4

Roulette Calculate the chi-square statistic for the data in Exercise 2. Show your work.

Bryan M.
Numerade Educator

Problem 5

Aw, nuts! Refer to Exercises 1 and 3.
(a) Confirm that the expected counts are large enough to use a chi-square distribution. Which distribution (specify the degrees of freedom) should you use?
(b) Sketch a graph like Figure 11.4 (page 683) that shows the P-value.
(c) Use Table C to find the P-value. Then use your calculator’s $\chi^2$cdf command.
(d) What conclusion would you draw about the company’s claimed distribution for its deluxe mixed nuts? Justify your answer.

Bryan M.
Numerade Educator

Problem 6

Roulette Refer to Exercises 2 and 4.
(a) Confirm that the expected counts are large enough to use a chi-square distribution. Which distribution (specify the degrees of freedom) should you use?
(b) Sketch a graph like Figure 11.4 (page 683) that shows the P-value.
(c) Use Table C to find the P-value. Then use your calculator’s $\chi^2$cdf command.
(d) What conclusion would you draw about whether the roulette wheel is operating correctly? Justify your answer.

Bryan M.
Numerade Educator
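
A minimal sketch of the calculation behind Exercises 4 and 6, assuming the counts in the table of Exercise 2. It uses Python's standard library in place of a calculator, and relies on the fact that for 2 degrees of freedom the chi-square upper-tail area has the closed form $e^{-x/2}$. The variable names are illustrative.

```python
import math

observed = [85, 99, 16]          # Red, Black, Green counts from the table
slots = [18, 18, 2]              # slots of each color on the wheel
n = sum(observed)                # 200 spins

expected = [n * s / sum(slots) for s in slots]

# All expected counts must be at least 5 for the chi-square approximation.
assert min(expected) >= 5

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1           # 3 categories - 1 = 2

# For df = 2 the chi-square upper-tail area is exactly exp(-x / 2).
p_value = math.exp(-chi2 / 2)

print(f"chi-square = {chi2:.3f}, df = {df}, P-value = {p_value:.4f}")
```

With these counts the statistic is close to 4.04 and the P-value is about 0.13.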

Problem 7

Birds in the trees Researchers studied the behavior of birds that were searching for seeds and insects in an Oregon forest. In this forest, 54% of the trees were Douglas firs, 40% were ponderosa pines, and 6% were other types of trees. At a randomly selected time during the day, the researchers observed 156 red-breasted nuthatches: 70 were seen in Douglas firs, 79 in ponderosa pines, and 7 in other types of trees. Do these data suggest that nuthatches prefer particular types of trees when they’re searching for seeds and insects? Carry out a chi-square goodness-of-fit test to help answer this question.

Bryan M.
Numerade Educator
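
The arithmetic of this goodness-of-fit test can be sketched as follows. This is one possible way to organize the computation, not a full solution: stating hypotheses, checking conditions, and writing the conclusion in context are still required. The per-category components are printed because they show which tree types contribute most to the statistic.

```python
import math

labels = ["Douglas fir", "Ponderosa pine", "Other"]
claimed = [0.54, 0.40, 0.06]     # proportion of each tree type in the forest
observed = [70, 79, 7]           # nuthatches seen in each tree type
n = sum(observed)                # 156

expected = [n * p for p in claimed]
components = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi2 = sum(components)
p_value = math.exp(-chi2 / 2)    # exact upper-tail area for df = 2

for name, o, e, c in zip(labels, observed, expected, components):
    print(f"{name:15s} observed={o:3d} expected={e:6.2f} component={c:.3f}")
print(f"chi-square = {chi2:.3f}, P-value = {p_value:.4f}")
```

The statistic is about 7.42 with a P-value near 0.0245, and the ponderosa pine category contributes the largest component.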

Problem 8

Seagulls by the seashore Do seagulls show a preference for where they land? To answer this question, biologists conducted a study in an enclosed outdoor space with a piece of shore whose area was made up of 56% sand, 29% mud, and 15% rocks. The biologists chose 200 seagulls at random. Each seagull was released into the outdoor space on its own and observed until it landed somewhere on the piece of shore. In all, 128 seagulls landed on the sand, 61 landed in the mud, and 11 landed on the rocks. Carry out a chi-square goodness-of-fit test. What do you conclude?

Anna M.
Numerade Educator

Problem 9

No chi-square A school’s principal wants to know if students spend about the same amount of time on homework each night of the week. She asks a random sample of 50 students to keep track of their homework time for a week. The following table displays the average amount of time (in minutes) students reported per night:
Explain carefully why it would not be appropriate to perform a chi-square goodness-of-fit test using these data.

Shane S.
Numerade Educator

Problem 10

No chi-square The principal in Exercise 9 also asked the random sample of students to record whether they did all of the homework that was assigned on each of the five school days that week. Here are the data:
Explain carefully why it would not be appropriate to perform a chi-square goodness-of-fit test using these data.

Michael S.
Numerade Educator

Problem 11

Benford’s law Faked numbers in tax returns, invoices, or expense account claims often display patterns that aren’t present in legitimate records. Some patterns are obvious and easily avoided by a clever crook. Others are more subtle. It is a striking fact that the first digits of numbers in legitimate records often follow a model known as Benford’s law. Call the first digit of a randomly chosen record X for short. Benford’s law gives this probability model for X (note that a first digit can’t be 0):
(a) Are these data inconsistent with Benford’s law? Carry out an appropriate test at the $\alpha = 0.05$ level to support your answer. If you find a significant result, perform a follow-up analysis.
(b) Describe a Type I error and a Type II error in this setting, and give a possible consequence of each. Which do you think is more serious?

Deniz K.
Numerade Educator

Problem 12

Housing According to the Census Bureau, the distribution by ethnic background of the New York City population in a recent year was
Hispanic: 28%, Black: 24%, White: 35%, Asian: 12%, Others: 1%
The manager of a large housing complex in the city wonders whether the distribution by race of the complex’s residents is consistent with the population distribution. To find out, she records data from a random sample of 800 residents. The table below displays the sample data.

Are these data significantly different from the city’s distribution by race? Carry out an appropriate test at the $\alpha = 0.05$ level to support your answer. If you find a significant result, perform a follow-up analysis.

Anna M.
Numerade Educator

Problem 13

Skittles Statistics teacher Jason Molesky contacted Mars, Inc., to ask about the color distribution for Skittles candies. Here is an excerpt from the response he received: “The original flavor blend for the SKITTLES BITE SIZE CANDIES is lemon, lime, orange, strawberry and grape. They were chosen as a result of consumer preference tests we conducted. The flavor blend is 20 percent of each flavor.”
(a) State appropriate hypotheses for a significance test of the company’s claim.
(b) Find the expected counts for a bag of Skittles with 60 candies.
(c) How large a $\chi^2$ statistic would you need to get in order to have significant evidence against the company’s claim at the $\alpha = 0.05$ level? At the $\alpha = 0.01$ level?
(d) Create a set of observed counts for a bag with 60 candies that gives a P-value between 0.01 and 0.05. Show the calculation of your chi-square statistic.

Andre W.
Numerade Educator
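
Parts (b) through (d) can be explored numerically. The sketch below assumes 5 flavors (so 4 degrees of freedom) and uses the closed form $e^{-x/2}(1 + x/2)$ for the chi-square upper-tail area with 4 degrees of freedom; the critical values are found by bisection instead of Table C. The function names and the sample bag are hypothetical and only show one of many possible answers to part (d).

```python
import math

def chi2_sf_df4(x):
    """Upper-tail area of the chi-square distribution with 4 degrees of freedom."""
    return math.exp(-x / 2) * (1 + x / 2)

def critical_value(alpha, lo=0.0, hi=100.0):
    """Find x with chi2_sf_df4(x) = alpha by bisection (the tail area decreases in x)."""
    for _ in range(100):
        mid = (lo + hi) / 2
        if chi2_sf_df4(mid) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

expected = [60 / 5] * 5                     # 12 candies of each flavor

# One hypothetical bag, invented for illustration only.
observed = [20, 14, 12, 8, 6]
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

print(f"critical value at 0.05: {critical_value(0.05):.2f}")
print(f"critical value at 0.01: {critical_value(0.01):.2f}")
print(f"hypothetical bag: chi-square = {chi2:.2f}, P-value = {chi2_sf_df4(chi2):.4f}")
```

The critical values are about 9.49 and 13.28, so any set of counts summing to 60 whose statistic falls between them gives a P-value between 0.01 and 0.05.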

Problem 14

Is your random number generator working? Use your calculator’s RandInt function to generate 200 digits from 0 to 9 and store them in a list.
(a) State appropriate hypotheses for a chi-square goodness-of-fit test to determine whether your calculator’s random number generator gives each digit an equal chance to be generated.
(b) Carry out the test. Report your observed counts, expected counts, chi-square statistic, P-value, and your conclusion.

Bryan M.
Numerade Educator
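
The same experiment can be sketched in software. This version substitutes Python's `random.randint` for the calculator's RandInt, so it tests Python's generator, not the calculator's. The seed is arbitrary and only makes the run repeatable. The decision uses the upper 5% critical value for 9 degrees of freedom (about 16.92) as a stand-in for a P-value.

```python
import random
from collections import Counter

random.seed(11)  # arbitrary seed so the run is repeatable (an assumption of this sketch)

# Stand-in for the calculator's RandInt(0, 9, 200).
digits = [random.randint(0, 9) for _ in range(200)]

counts = Counter(digits)
observed = [counts[d] for d in range(10)]   # Counter returns 0 for missing digits
expected = [len(digits) / 10] * 10          # 20 per digit under H0

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = 10 - 1

# Upper 5% critical value of the chi-square distribution with 9 degrees of freedom.
CRITICAL_05_DF9 = 16.92

print("observed:", observed)
print(f"chi-square = {chi2:.2f}, df = {df}")
print("reject H0 at the 5% level" if chi2 > CRITICAL_05_DF9 else "fail to reject H0 at the 5% level")
```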

Problem 15

What’s your sign? The University of Chicago’s General Social Survey (GSS) is the nation’s most important social science sample survey. For reasons known only to social scientists, the GSS regularly asks a random sample of people their astrological sign. Here are the counts of responses from a recent GSS:
If births are spread uniformly across the year, we expect all 12 signs to be equally likely. Are these data inconsistent with that belief? Carry out an appropriate test to support your answer. If you find a significant result, perform a follow-up analysis.

Anna M.
Numerade Educator

Problem 16

Munching Froot Loops Kellogg’s Froot Loops cereal comes in six fruit flavors: orange, lemon, cherry, raspberry, blueberry, and lime. Charise poured out her morning bowl of cereal and methodically counted the number of cereal pieces of each flavor. Here are her data:
Test the null hypothesis that the population of Froot Loops produced by Kellogg’s contains an equal proportion of each flavor. If you find a significant result, perform a follow-up analysis.

Anna M.
Numerade Educator

Problem 17

Mendel and the peas Gregor Mendel (1822–1884), an Austrian monk, is considered the father of genetics. Mendel studied the inheritance of various traits in pea plants. One such trait is whether the pea is smooth or wrinkled. Mendel predicted a ratio of 3 smooth peas for every 1 wrinkled pea. In one experiment, he observed 423 smooth and 133 wrinkled peas. The data were produced in such a way that the Random and Independent conditions are met. Carry out a chi-square goodness-of-fit test based on Mendel’s prediction. What do you conclude?

Anna M.
Numerade Educator
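
A short sketch of the computation, assuming the 3:1 ratio translates into claimed proportions of 0.75 and 0.25. With two categories there is 1 degree of freedom, and the chi-square upper-tail area can be written with the complementary error function as $\operatorname{erfc}(\sqrt{x/2})$, which the standard library provides.

```python
import math

observed = [423, 133]                # smooth, wrinkled
ratio = [3, 1]                       # Mendel's predicted 3:1 ratio
n = sum(observed)                    # 556 peas

expected = [n * r / sum(ratio) for r in ratio]   # 417 and 139
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# For df = 1 the upper-tail area is erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))

print(f"expected = {expected}, chi-square = {chi2:.4f}, P-value = {p_value:.4f}")
```

The statistic is roughly 0.345 with a P-value near 0.56, so the observed counts sit close to what the 3:1 model predicts.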

Problem 18

You say tomato The paper “Linkage Studies of the Tomato” (Transactions of the Canadian Institute, 1931) reported the following data on phenotypes resulting from crossing tall cut-leaf tomatoes with dwarf potato-leaf tomatoes. We wish to investigate whether the following frequencies are consistent with genetic laws, which state that the phenotypes should occur in the ratio 9:3:3:1.

Anna M.
Numerade Educator

Problem 19

An appropriate null hypothesis to test whether the trees in the forest are randomly distributed is
(a) $H_0: \mu = 25$, where $\mu$ = the mean number of trees in each quadrant.
(b) $H_0: p = 0.25$, where $p$ = the proportion of all trees in the forest that are in Quadrant 1.
(c) $H_0: n_1 = n_2 = n_3 = n_4 = 25$, where $n_i$ is the number of trees from the sample in Quadrant $i$.
(d) $H_0: p_1 = p_2 = p_3 = p_4 = 0.25$, where $p_i$ is the actual proportion of trees in the forest that are in Quadrant $i$.
(e) $H_0: \hat{p}_1 = \hat{p}_2 = \hat{p}_3 = \hat{p}_4 = 0.25$, where $\hat{p}_i$ is the proportion of trees in the sample that are in Quadrant $i$.

Anna M.
Numerade Educator

Problem 20

The chi-square statistic is
(a) $\frac{(18-25)^{2}}{25}+\frac{(22-25)^{2}}{25}+\frac{(39-25)^{2}}{25}+\frac{(21-25)^{2}}{25}$
(b) $\frac{(25-18)^{2}}{18}+\frac{(25-22)^{2}}{22}+\frac{(25-39)^{2}}{39}+\frac{(25-21)^{2}}{21}$
(c) $\frac{(18-25)}{25}+\frac{(22-25)}{25}+\frac{(39-25)}{25}+\frac{(21-25)}{25}$
(d) $\frac{(18-25)^{2}}{100}+\frac{(22-25)^{2}}{100}+\frac{(39-25)^{2}}{100}+\frac{(21-25)^{2}}{100}$
(e) $\frac{(0.18-0.25)^{2}}{0.25}+\frac{(0.22-0.25)^{2}}{0.25}+\frac{(0.39-0.25)^{2}}{0.25}+\frac{(0.21-0.25)^{2}}{0.25}$

Anna M.
Numerade Educator

Problem 21

The P-value for a chi-square goodness-of-fit test is 0.0129. The correct conclusion is
(a) reject $H_0$ at $\alpha = 0.05$; there is strong evidence that the trees are randomly distributed.
(b) reject $H_0$ at $\alpha = 0.05$; there is not strong evidence that the trees are randomly distributed.
(c) reject $H_0$ at $\alpha = 0.05$; there is strong evidence that the trees are not randomly distributed.
(d) fail to reject $H_0$ at $\alpha = 0.05$; there is not strong evidence that the trees are randomly distributed.
(e) fail to reject $H_0$ at $\alpha = 0.05$; there is strong evidence that the trees are randomly distributed.

Anna M.
Numerade Educator
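
The numbers in Problems 19 through 21 can be tied together with a short calculation. The data table for the tree study is not reproduced on this page, so the sketch below assumes the quadrant counts 18, 22, 39, and 21 that appear in the answer choices of Problem 20, with an expected count of 25 per quadrant. For 3 degrees of freedom the chi-square upper-tail area has the closed form $\operatorname{erfc}(\sqrt{x/2}) + \sqrt{2x/\pi}\,e^{-x/2}$.

```python
import math

# Counts per quadrant as they appear in the answer choices of Problem 20;
# the original data table is not reproduced here, so treat them as an assumption.
observed = [18, 22, 39, 21]
n = sum(observed)                    # 100 trees
expected = [n * 0.25] * 4            # 25 per quadrant under H0

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Upper-tail area for df = 3: erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2).
p_value = math.erfc(math.sqrt(chi2 / 2)) + math.sqrt(2 * chi2 / math.pi) * math.exp(-chi2 / 2)

print(f"chi-square = {chi2:.1f}, P-value = {p_value:.4f}")
```

Under these assumed counts the statistic is 10.8 and the P-value is about 0.0129, which agrees with the P-value quoted in Problem 21.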

Problem 22

Your teacher prepares a large container full of colored beads. She claims that 1/8 of the beads are red, 1/4 are blue, and the remainder are yellow. Your class will take a simple random sample of beads from the container to test the teacher’s claim. The smallest number of beads you can take so that the conditions for performing inference are met is
(a) 15
(b) 16
(c) 30
(d) 40
(e) 80

Allaa R.
Numerade Educator
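
One way to reason about this question is to search for the smallest sample size at which every expected count reaches 5, the usual large-counts condition for chi-square procedures. The sketch below does that search with exact fractions; the variable names are illustrative.

```python
from fractions import Fraction

# Claimed proportions: 1/8 red, 1/4 blue, and the remainder yellow.
red, blue = Fraction(1, 8), Fraction(1, 4)
proportions = [red, blue, 1 - red - blue]

# Smallest n for which every expected count n * p is at least 5.
n = 1
while min(n * p for p in proportions) < 5:
    n += 1

print(n, [float(n * p) for p in proportions])
```

The rarest color drives the answer: the search stops at the first n where n/8 reaches 5.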

Problem 23

Reading and grades (1.3) Write a few sentences comparing the distributions of English grades for light and heavy readers.

Anna M.
Numerade Educator

Problem 24

Reading and grades (10.2) Summary statistics for the two groups from Minitab are provided below.
$\begin{array}{lllll}{} & {N} & {\text { Mean }} & {\text { StDev }} & {\text { SE Mean }} \\ \hline {\text { Heavy }} & {47} & {3.640} & {0.324} & {0.047} \\ {\text { Light }} & {32} & {3.356} & {0.380} & {0.067}\end{array}$
(a) Explain why it is acceptable to use two-sample t procedures in this setting.
(b) Construct and interpret a 95% confidence interval for the difference in the mean English grade for light and heavy readers.
(c) Does the interval in part (b) provide convincing evidence that reading more causes an increase in students’ English grades? Justify your answer.

Bryan M.
Numerade Educator
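
The interval in part (b) can be sketched from the summary statistics alone. This version reads the four numeric columns as sample size, mean, standard deviation, and standard error, takes the difference as heavy minus light, and uses a conservative critical value looked up by hand. These are assumptions of the sketch; technology that computes the degrees of freedom exactly will give a slightly narrower interval.

```python
import math

# Summary statistics from the Minitab output: n, mean, standard deviation.
heavy = {"n": 47, "mean": 3.640, "sd": 0.324}
light = {"n": 32, "mean": 3.356, "sd": 0.380}

diff = heavy["mean"] - light["mean"]
se = math.sqrt(heavy["sd"] ** 2 / heavy["n"] + light["sd"] ** 2 / light["n"])

# Assumption: conservative df = min(47, 32) - 1 = 31, rounded down to the
# df = 30 row of a standard t table, which gives t* = 2.042 for 95% confidence.
T_STAR = 2.042

margin = T_STAR * se
print(f"difference = {diff:.3f}, SE = {se:.4f}")
print(f"95% CI: ({diff - margin:.3f}, {diff + margin:.3f})")
```

With the conservative critical value the interval runs from roughly 0.116 to 0.452 grade points.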

Problem 25

Reading and grades (3.2) The Fathom scatterplot below shows the number of books read and the English grade for all 79 students in the study. A least-squares regression line has been added to the graph.
(a) Interpret the meaning of the y intercept in context.
(b) The student who reported reading 17 books for pleasure had an English GPA of 2.85. Find this student’s residual. Show your work.
(c) How strong is the relationship between English grades and number of books read? Give appropriate evidence to support your answer.

Bryan M.
Numerade Educator

Problem 26

Yahtzee (5.3, 6.3) In the game of Yahtzee, 5 six-sided dice are rolled simultaneously. To get a Yahtzee, the player must get the same number on all 5 dice.
(a) Luis says that the probability of getting a Yahtzee in one roll of the dice is $\left(\frac{1}{6}\right)^{5}$. Explain why Luis is wrong.
(b) Nassir decides to keep rolling all 5 dice until he gets a Yahtzee. He is surprised when he still hasn’t gotten a Yahtzee after 25 rolls. Should he be? Calculate an appropriate probability to support your answer.

Bryan M.
Numerade Educator
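
Both parts reduce to short probability calculations, sketched here with exact fractions. The key assumptions are that the dice are fair and that rolls are independent; the variable names are illustrative.

```python
from fractions import Fraction

# Any of the 6 faces can be the repeated number, so P(Yahtzee) = 6 * (1/6)^5.
p_yahtzee = 6 * Fraction(1, 6) ** 5          # 1/1296

# Probability that 25 independent rolls all fail to produce a Yahtzee.
p_none_in_25 = (1 - p_yahtzee) ** 25

print(p_yahtzee, float(p_none_in_25))
```

The chance of a Yahtzee on one roll is 1/1296, and the chance of seeing none in 25 rolls is about 0.98.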