The Practice of Statistics for AP

Daren S. Starnes, Daniel S. Yates, David S. Moore

Chapter 12

More about Regression - all with Video Answers


Section 1

Inference for Linear Regression

04:59

Problem 1

Oil and residuals Exercise 59 on page 193 (Chapter 3) examined data on the depth of small defects in the Trans-Alaska Oil Pipeline. Researchers compared the results of measurements on 100 defects made in the field with measurements of the same defects made in the laboratory.$^{5}$ The figure below shows a residual plot for the least-squares regression line based on these data. Are the conditions for performing inference about the slope $\beta$ of the population regression line met? Justify your answer.

Colin Fenster
Numerade Educator
02:46

Problem 2

SAT Math scores In Chapter 3, we examined data on the percent of high school graduates in each state who took the SAT and the state's mean SAT Math score in a recent year. The figure below shows a residual plot for the least-squares regression line based on these data. Are the conditions for performing inference about the slope $\beta$ of the population regression line met? Justify your answer.

Colin Fenster
Numerade Educator
05:03

Problem 3

Prey attracts predators Here is one way in which nature regulates the size of animal populations: high population density attracts predators, which remove a higher proportion of the population than when the density of the prey is low. One study looked at kelp perch and their common predator, the kelp bass. The researcher set up four large circular pens on sandy ocean bottoms off the coast of southern California. He chose young perch at random from a large group and placed 10, 20, 40, and 60 perch in the four pens. Then he dropped the nets protecting the pens, allowing bass to swarm in, and counted the perch left after two hours. Here are data on the proportions of perch eaten in four repetitions of this setup:$^{6}$
The explanatory variable is the number of perch (the prey) in a confined area. The response variable is the proportion of perch killed by bass (the predator) in two hours when the bass are allowed access to the perch. A scatterplot of the data shows a linear relationship.
We used Minitab software to carry out a least-squares regression analysis for these data. A residual plot and a histogram of the residuals are shown below. Check whether the conditions for performing inference about the regression model are met.

Colin Fenster
Numerade Educator
04:42

Problem 4

Beer and BAC How well does the number of beers a person drinks predict his or her blood alcohol content (BAC)? Sixteen volunteers with an initial BAC of 0 drank a randomly assigned number of cans of beer. Thirty minutes later, a police officer measured their BAC. Least-squares regression was performed on the data. A residual plot and a histogram of the residuals are shown below. Check whether the conditions for performing inference about the regression model are met.

Colin Fenster
Numerade Educator
05:40

Problem 5

Prey attracts predators Refer to Exercise 3. Computer output from the least-squares regression analysis on the perch data is shown below.
The model for regression inference has three parameters: $\alpha$, $\beta$, and $\sigma$. Explain what each parameter represents in context. Then provide an estimate for each.

Colin Fenster
Numerade Educator
05:20

Problem 6

Beer and BAC Refer to Exercise 4. Computer output from the least-squares regression analysis on the beer and blood alcohol data is shown below. The model for regression inference has three parameters: $\alpha$, $\beta$, and $\sigma$. Explain what each parameter represents in context. Then provide an estimate for each.

Colin Fenster
Numerade Educator
09:34

Problem 7

Prey attracts predators Refer to Exercise 5.
(a) Interpret the value of $SE_b$ in context.
(b) Find the critical value for a 90% confidence interval for the slope of the true regression line. Then calculate the confidence interval. Show your work.
(c) Interpret the interval from part (b) in context.
(d) Explain the meaning of “90% confident” in context.

Colin Fenster
Numerade Educator
08:22

Problem 8

Beer and BAC Refer to Exercise 6.
(a) Interpret the value of $SE_b$ in context.
(b) Find the critical value for a 99% confidence interval for the slope of the true regression line. Then calculate the confidence interval. Show your work.
(c) Interpret the interval from part (b) in context.
(d) Explain the meaning of “99% confident” in context.

Colin Fenster
Numerade Educator
05:05

Problem 9

Beavers and beetles Do beavers benefit beetles? Researchers laid out 23 circular plots, each four meters in diameter, at random in an area where beavers were cutting down cottonwood trees. In
each plot, they counted the number of stumps from trees cut by beavers and the number of clusters of beetle larvae. Ecologists think that the new sprouts from stumps are more tender than other cottonwood growth, so that beetles prefer them. If so, more stumps should produce more beetle larvae.$^{7}$ Minitab output for a regression analysis on these data is shown below. Construct and interpret a 99% confidence interval for the slope of the population regression line. Assume that the conditions for performing inference are met.
Regression Analysis: Beetle larvae versus Stumps
$$
\begin{array}{lrrrr}
\text{Predictor} & \text{Coef} & \text{SE Coef} & \text{T} & \text{P} \\
\text{Constant} & -1.286 & 2.853 & -0.45 & 0.657 \\
\text{Stumps} & 11.894 & 1.136 & 10.47 & 0.000
\end{array}
$$
$$
S = 6.41939 \qquad \text{R-Sq} = 83.98\% \qquad \text{R-Sq(adj)} = 83.18\%
$$
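
As a rough sketch of the arithmetic behind a $t$ interval for the slope, the Python snippet below computes $b \pm t^{*} SE_b$ from the Stumps row of the Minitab output above. The critical value 2.831 (99% confidence, df = 23 - 2 = 21) is read from a standard $t$ table and hard-coded as an assumption, and the helper name `slope_interval` is purely illustrative.

```python
def slope_interval(b, se_b, t_star):
    """Return (lower, upper) for the interval b ± t* · SE_b."""
    margin = t_star * se_b
    return b - margin, b + margin

# Values read from the "Stumps" row of the Minitab output.
b = 11.894        # sample slope
se_b = 1.136      # standard error of the slope
n = 23            # number of circular plots
df = n - 2        # degrees of freedom for regression inference

# Assumption: t* for df = 21 and 99% confidence, taken from a t table.
t_star = 2.831

lower, upper = slope_interval(b, se_b, t_star)
print(f"df = {df}, 99% CI for the slope: ({lower:.3f}, {upper:.3f})")
```

Under these assumptions the interval runs from roughly 8.68 to 15.11 additional clusters of larvae per extra stump; the interpretation in context is still left to the reader.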

Colin Fenster
Numerade Educator
04:11

Problem 10

Ideal proportions The students in Mr. Shenk’s class measured the arm spans and heights (in
inches) of a random sample of 18 students from their large high school. Some computer output
from a least-squares regression analysis on these data is shown below. Construct and interpret a 90%
confidence interval for the slope of the population regression line. Assume that the conditions for performing inference are met.

Colin Fenster
Numerade Educator
03:25

Problem 11

Beavers and beetles Refer to Exercise 9.
(a) How many clusters of beetle larvae would you predict in a circular plot with 5 tree stumps cut by
beavers? Show your work.
(b) About how far off do you expect the prediction in part (a) to be from the actual number of clusters of
beetle larvae? Justify your answer.
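
One way to organize the calculation, assuming the coefficients and $s$ printed in the Exercise 9 Minitab output, is sketched below. The function name `predict_clusters` is hypothetical, and using $s$ as the typical prediction error is the usual textbook reading rather than something the output states directly.

```python
# Coefficients and s read from the Minitab output in Exercise 9.
intercept = -1.286   # estimated y intercept (clusters of beetle larvae)
slope = 11.894       # estimated change in clusters per additional stump
s = 6.41939          # standard deviation of the residuals

def predict_clusters(stumps):
    """Predicted clusters of beetle larvae for a given number of stumps."""
    return intercept + slope * stumps

predicted = predict_clusters(5)
print(f"Predicted clusters for 5 stumps: {predicted:.3f}")
print(f"Typical prediction error (s): about {s:.2f} clusters")
```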

Colin Fenster
Numerade Educator
03:22

Problem 12

Ideal proportions Refer to Exercise 10.
(a) What height would you predict for a student with an arm span of 76 inches? Show your work.
(b) About how far off do you expect the prediction in part (a) to be from the student’s actual height? Justify your answer.

Colin Fenster
Numerade Educator
08:49

Problem 13

Weeds among the corn Lamb’s-quarter is a common weed that interferes with the growth of corn. An agriculture researcher planted corn at the same rate in 16 small plots of ground and then weeded the plots by hand to allow a fixed number of lamb’s-quarter plants to grow in each meter of corn row. The decision of how many of these plants to leave in each plot was made at random. No other weeds were allowed to grow. Here are the yields of corn (bushels per acre) in each of the plots:$^{8}$
(a) A scatterplot of the data with the least-squares line added is shown below. Describe what this graph tells you about the relationship between these two variables. Minitab output from a linear regression on these data is shown below.
(b) What is the equation of the least-squares regression line for predicting corn yield from the number of lamb’s-quarter plants per meter? Define any variables you use.
(c) Interpret the slope and $y$ intercept of the regression line in context.
(d) Do these data provide convincing evidence that more weeds reduce corn yield? Carry out an appropriate test at the $\alpha=0.05$ level to help answer this question.

Colin Fenster
Numerade Educator
07:40

Problem 14

Time at the table Does how long young children remain at the lunch table help predict how much they eat? Here are data on a random sample of 20 toddlers observed over several months.$^{9}$ “Time” is the average number of minutes a child spent at the table when lunch was served. “Calories” is the average number of calories the child consumed during lunch, calculated from careful observation of what the child ate each day.
(a) A scatterplot of the data with the least-squares line added is shown below. Describe what this graph tells you about the relationship between these two variables. Minitab output from a linear regression on these data is shown below.
(b) What is the equation of the least-squares regression line for predicting calories consumed
from time at the table? Define any variables you use.
(c) Interpret the slope of the regression line in context. Does it make sense to interpret the $y$ intercept in this case? Why or why not?
(d) Do these data provide convincing evidence of a negative linear relationship between time at the table and calories consumed in the population of toddlers? Carry out an appropriate test at the $\alpha=0.01$ level to help answer this question.

Colin Fenster
Numerade Educator
07:51

Problem 15

Weeds among the corn Refer to Exercise 13.
(a) Construct and interpret a 90% confidence interval for the slope of the true regression line. Explain how your results are consistent with the significance test in Exercise 13.
(b) Interpret each of the following in context:
(i) $s$
(ii) $r^{2}$
(iii) The standard error of the slope

Colin Fenster
Numerade Educator
06:31

Problem 16

Time at the table Refer to Exercise 14.
(a) Construct and interpret a 98% confidence interval for the slope of the population regression line. Explain how your results are consistent with the significance test in Exercise 14.
(b) Interpret each of the following in context:
(i) $s$
(ii) $r^{2}$
(iii) The standard error of the slope

Colin Fenster
Numerade Educator
04:58

Problem 17

Paired tires Exercise 69 in Chapter 8 (page 519) compared two methods for estimating tire wear. The first method used the amount of weight lost by a tire. The second method used the amount of wear in the grooves of the tire. A random sample of 16 tires was obtained. Both methods were used to estimate the total distance traveled by each tire. The scatterplot below displays the two estimates (in thousands of miles) for each tire.$^{10}$
Computer output from a least-squares regression analysis of these data is shown below. Assume that the conditions for regression inference are met.
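$$
\begin{array}{lrrrr}
\text{Predictor} & \text{Coef} & \text{SE Coef} & \text{T} & \text{P} \\
\text{Constant} & 1.351 & 2.105 & 0.64 & 0.531 \\
\text{Weight} & 0.79021 & 0.07104 & 11.12 & 0.000
\end{array}
$$
$$
S = 2.62078 \qquad \text{R-Sq} = 89.8\% \qquad \text{R-Sq(adj)} = 89.1\%
$$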

(a) Verify that the 99% confidence interval for the slope of the population regression line is (0.5785, 1.001).
(b) Researchers want to test whether there is a difference in the two methods of estimating tire wear. Explain why an appropriate pair of hypotheses for this test is $H_{0} : \beta=1$ versus $H_{a} : \beta \neq 1$.
(c) What conclusion would you draw for this significance test based on your interval in part (a)? Justify your answer.
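
The check in part (a) and the interval-based reasoning in part (c) can be sketched in a few lines of Python. The slope and its standard error come from the Weight row of the computer output; the critical value 2.977 (99% confidence, df = 16 - 2 = 14) is an assumption taken from a standard $t$ table, and the variable names are illustrative.

```python
# Values read from the "Weight" row of the computer output.
b = 0.79021       # sample slope
se_b = 0.07104    # standard error of the slope
n = 16            # number of tires
df = n - 2        # degrees of freedom for regression inference

# Assumption: t* for df = 14 and 99% confidence, taken from a t table.
t_star = 2.977

margin = t_star * se_b
lower, upper = b - margin, b + margin

# A two-sided test of H0: beta = 1 at the 1% level rejects H0
# only if the 99% interval does not contain 1.
contains_one = lower <= 1 <= upper

print(f"99% CI for the slope: ({lower:.4f}, {upper:.4f})")
print(f"Interval contains 1: {contains_one}")
```

With this table value the endpoints agree with the interval stated in part (a) to about three decimal places; small differences come from rounding in $t^{*}$.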

Colin Fenster
Numerade Educator
04:31

Problem 18

Stats teachers' cars A random sample of AP Statistics teachers was asked to report the age (in years) and mileage of their primary vehicles. A scatterplot of the data is shown below. Computer output from a least-squares regression analysis of these data is shown below. Assume that the conditions for regression inference are met.
(a) Verify that the 95% confidence interval for the slope of the population regression line is (9016.4, 14,244.8).
(b) A national automotive group claims that the typical driver puts 15,000 miles per year on his or her main vehicle. We want to test whether AP Statistics teachers are typical drivers. Explain why an appropriate pair of hypotheses for this test is $H_{0} : \beta=15{,}000$ versus $H_{a} : \beta \neq 15{,}000$.
(c) What conclusion would you draw for this significance test based on your interval in part (a)? Justify your answer.

Colin Fenster
Numerade Educator
04:13

Problem 19

Is wine good for your heart? A researcher from the University of California, San Diego, collected data
on average per capita wine consumption and heart disease death rate in a random sample of 19 countries for which data were available. The following table displays the data.$^{11}$
(a) Is there statistically significant evidence of a negative linear relationship between wine consumption and heart disease deaths in the population of countries? Carry out an appropriate significance test at the $\alpha=0.05$ level.
(b) Calculate and interpret a 95% confidence interval for the slope $\beta$ of the population regression line.

Bryan Meares
Numerade Educator
04:41

Problem 20

The professor swims Here are data on the time (in minutes) Professor Moore takes to swim 2000 yards and his pulse rate (beats per minute) after swimming on a random sample of 23 days:
(a) Is there statistically significant evidence of a negative linear relationship between Professor Moore’s swim time and his pulse rate in the population of days on which he swims 2000 yards? Carry out an appropriate significance test at the $\alpha=0.05$ level.
(b) Calculate and interpret a 95% confidence interval for the slope $\beta$ of the population regression line.

Bryan Meares
Numerade Educator
01:53

Problem 21

Multiple choice: Select the best answer for Exercises 21 to 26.
The equation of the least-squares regression line for predicting selling price from appraised value is
(a) $\widehat{\text { price }}=79.49+0.1126$ (appraised value)
(b) $\widehat{\text { price }}=0.1126+1.0466$ (appraised value).
(c) $\widehat{\text { price }}=127.27+1.0466$ (appraised value).
(d) $\widehat{\text { price }}=1.0466+127.27$ (appraised value).
(e) $\widehat{\text { price }}=1.0466+69.7299$ (appraised value).

Colin Fenster
Numerade Educator
01:15

Problem 22

Multiple choice: Select the best answer for Exercises 21 to 26.
What is the correlation between selling price and appraised value?
(a) 0.1126
(b) 0.861
(c) -0.861
(d) 0.928
(e) -0.928

Colin Fenster
Numerade Educator
02:06

Problem 23

Multiple choice: Select the best answer for Exercises 21 to 26.
The slope $\beta$ of the population regression line describes
(a) the exact increase in the selling price of an individual unit when its appraised value increases by $1000.
(b) the average increase in the appraised value in a population of units when selling price increases by $1000.
(c) the average increase in selling price in a population of units when appraised value increases by $1000.
(d) the average selling price in a population of units when a unit’s appraised value is 0.
(e) the average increase in appraised value in a sample of 16 units when selling price increases by $1000.

Colin Fenster
Numerade Educator
00:56

Problem 24

Multiple choice: Select the best answer for Exercises 21 to 26.
Is there significant evidence that selling price increases as appraised value increases? To answer
this question, test the hypotheses
(a) $H_{0} : \beta=0$ versus $H_{a} : \beta > 0$
(b) $H_{0} : \beta=0$ versus $H_{a} : \beta < 0$
(c) $H_{0} : \beta=0$ versus $H_{a} : \beta \neq 0$
(d) $H_{0} : \beta>0$ versus $H_{a} : \beta=0$
(e) $H_{0} : \beta=1$ versus $H_{a} : \beta > 1$

Colin Fenster
Numerade Educator
00:53

Problem 25

Multiple choice: Select the best answer for Exercises 21 to 26.
Confidence intervals and tests for these data use the t distribution with degrees of freedom
(a) 9.29.
(b) 14.
(c) 15.
(d) 16.
(e) 30.

Colin Fenster
Numerade Educator
02:29

Problem 26

Multiple choice: Select the best answer for Exercises 21 to 26.
A 95% confidence interval for the population slope $\beta$ is
(a) $1.0466 \pm 149.5706$
(b) $1.0466 \pm 0.2415$
(c) $1.0466 \pm 0.2387$
(d) $1.0466 \pm 0.1983$
(e) $1.0466 \pm 0.1126$

Colin Fenster
Numerade Educator
02:46

Problem 27

Exercises 27 to 30 refer to the following setting. Does the color in which words are printed affect your ability to read them? Do the words themselves affect your ability to name the color in which they are printed? Mr. Starnes designed a study to investigate these questions using the 16 students in his AP Statistics class as subjects. Each student performed two tasks in a random order while a partner timed: (1) read 32 words aloud as quickly as possible, and (2) say the color in which each of 32 words is printed as quickly as possible.
Color words (4.2) Let’s review the design of the study.
(a) Explain why this was an experiment and not an observational study.
(b) Did Mr. Starnes use a completely randomized design or a randomized block design? Why do you think he chose this experimental design?
(c) Explain the purpose of the random assignment in the context of the study.
The data from Mr. Starnes’s experiment are shown below. For each subject, the time to perform the two tasks is given to the nearest second.

Colin Fenster
Numerade Educator
02:10

Problem 28

Exercises 27 to 30 refer to the following setting. Does the color in which words are printed affect your ability to read them? Do the words themselves affect your ability to name the color in which they are printed? Mr. Starnes designed a study to investigate these questions using the 16 students in his AP Statistics class as subjects. Each student performed two tasks in a random order while a partner timed: (1) read 32 words aloud as quickly as possible, and (2) say the color in which each of 32 words is printed as quickly as possible.

Color words (1.3) Do the data provide evidence of a difference in the average time required to perform the two tasks? Include an appropriate graph and numerical summaries in your answer.

Bryan Meares
Numerade Educator
02:09

Problem 29

Exercises 27 to 30 refer to the following setting. Does the color in which words are printed affect your ability to read them? Do the words themselves affect your ability to name the color in which they are printed? Mr. Starnes designed a study to investigate these questions using the 16 students in his AP Statistics class as subjects. Each student performed two tasks in a random order while a partner timed: (1) read 32 words aloud as quickly as possible, and (2) say the color in which each of 32 words is printed as quickly as possible.

Color words (9.3) Explain why it is not safe to use paired t procedures to do inference about the difference in the mean time to complete the two tasks.

Bryan Meares
Numerade Educator
04:11

Problem 30

Exercises 27 to 30 refer to the following setting. Does the color in which words are printed affect your ability to read them? Do the words themselves affect your ability to name the color in which they are printed? Mr. Starnes designed a study to investigate these questions using the 16 students in his AP Statistics class as subjects. Each student performed two tasks in a random order while a partner timed: (1) read 32 words aloud as quickly as possible, and (2) say the color in which each of 32 words is printed as quickly as possible.
Color words (3.2, 12.1) Can we use a student’s word task time to predict his or her color task time?
(a) Make an appropriate scatterplot to help answer this question. Describe what you see.
(b) Use your calculator to find the equation of the least-squares regression line. Define any symbols you use.
(c) What is the residual for the student who completed the word task in 9 seconds? Show your work.
(d) Assume that the conditions for performing inference about the slope of the true regression line are met. The P-value for a test of $H_{0} : \beta=0$ versus $H_{a} : \beta>0$ is 0.0215. Explain what this value means in context.

Bryan Meares
Numerade Educator
07:07

Problem 31

Exercises 31 and 32 refer to the following setting. Yellowstone National Park surveyed a random sample of 1526 winter visitors to the park. They asked each person whether they owned, rented, or had never used a snowmobile. Respondents were also asked whether they belonged to an environmental organization (like the Sierra Club). The two-way table summarizes the survey responses.
Snowmobiles (5.2, 5.3)
(a) If we choose a survey respondent at random, what’s the probability that this individual
(i) is a snowmobile owner?
(ii) belongs to an environmental organization or owns a snowmobile?
(iii) has never used a snowmobile given that the person belongs to an environmental organization?
(b) Are the events “is a snowmobile owner” and “belongs to an environmental organization” independent for the members of the sample? Justify your answer.
(c) If we choose two survey respondents at random, what’s the probability that
(i) both are snowmobile owners?
(ii) at least one of the two belongs to an environmental organization?

Colin Fenster
Numerade Educator
02:50

Problem 32

Exercises 31 and 32 refer to the following setting. Yellowstone National Park surveyed a random sample of 1526 winter visitors to the park. They asked each person whether they owned, rented, or had never used a snowmobile. Respondents were also asked whether they belonged to an environmental organization (like the Sierra Club). The two-way table summarizes the survey responses.

Snowmobiles (11.2) Do these data provide convincing evidence of an association between environmental club membership and snowmobile use for the population of visitors to Yellowstone National Park? Carry out an appropriate test at the 5% significance level.

Bryan Meares
Numerade Educator