
The Practice of Statistics for AP

Daren S. Starnes, Daniel S. Yates, David S. Moore

Chapter 12

More about Regression



Section 1

Inference for Linear Regression


Problem 1

Oil and residuals Exercise 53 on page 194 (Chapter 3) examined data on the depth of small defects in the Trans-Alaska Oil Pipeline. Researchers compared the results of measurements on 100 defects made in the field with measurements of the same defects made in the laboratory. ${ }^{6}$ The figure below shows a residual plot for the least-squares regression line based on these data. Explain why the conditions for performing inference about the slope $\beta$ of the population regression line are not met.

Lainey Roebuck
Numerade Educator

Problem 2

SAT Math scores In Chapter 3, we examined data on the percent of high school graduates in each state who took the SAT and the state's mean SAT Math score in a recent year. The figure below shows a residual plot for the least-squares regression line based on these data. Explain why the conditions for performing inference about the slope $\beta$ of the population regression line are not met.

Lainey Roebuck
Numerade Educator

Problem 3

Beer and BAC How well does the number of beers a person drinks predict his or her blood alcohol content (BAC)? Sixteen volunteers aged 21 or older with an initial BAC of 0 took part in a study to find out. Each volunteer drank a randomly assigned number of cans of beer. Thirty minutes later, a police officer measured their BAC. Least-squares regression was performed on the data. A residual plot and a histogram of the residuals are shown below. Check whether the conditions for performing inference about the regression model are met.

Lainey Roebuck
Numerade Educator

Problem 4

Prey attracts predators Here is one way in which nature regulates the size of animal populations: high population density attracts predators, which remove a higher proportion of the population than when the density of the prey is low. One study looked at kelp perch and their common predator, the kelp bass. The researcher set up four large circular pens on sandy ocean bottoms off the coast of southern California. He chose young perch at random from a large group and placed 10, 20, 40, and 60 perch in the four pens. Then he dropped the nets protecting the pens, allowing bass to swarm in, and counted the perch left after two hours. Here are data on the proportions of perch eaten in four repetitions of this setup: ${ }^{7}$
$$
\begin{array}{cllll}
\hline \text { Number of Perch } & \text { Proportion Killed } & & & \\
\hline 10 & 0.0 & 0.1 & 0.3 & 0.3 \\
20 & 0.2 & 0.3 & 0.3 & 0.6 \\
40 & 0.075 & 0.3 & 0.6 & 0.725 \\
60 & 0.517 & 0.55 & 0.7 & 0.817 \\
\hline
\end{array}
$$
The explanatory variable is the number of perch (the prey) in a confined area. The response variable is the proportion of perch killed by bass (the predator) in two hours when the bass are allowed access to the perch. A scatterplot of the data shows a linear relationship. We used Minitab software to carry out a least-squares regression analysis for these data. A residual plot and a histogram of the residuals are shown below. Check whether the conditions for performing inference about the regression model are met.

Lainey Roebuck
Numerade Educator

Problem 5

Beer and BAC Refer to Exercise 3. Computer output from the least-squares regression analysis on the beer and blood alcohol data is shown below.
Dependent variable is: BAC
No selector
R squared $=80.0 \%$ $\quad$ R squared (adjusted) $=78.6 \%$
$s=0.0204$ with $16-2=14$ degrees of freedom
$$
\begin{array}{lrrrr}
\hline \text { Variable } & \text { Coefficient } & \text { s.e. of Coeff } & \text { t-ratio } & \text { prob } \\
\hline \text { Constant } & -0.012701 & 0.0126 & -1.00 & 0.3320 \\
\text { Beers } & 0.017964 & 0.0024 & 7.84 & \leq 0.0001 \\
\hline
\end{array}
$$
The model for regression inference has three parameters: $\alpha$, $\beta$, and $\sigma$. Explain what each parameter represents in context. Then provide an estimate for each.

Lainey Roebuck
Numerade Educator

Problem 6

Prey attracts predators Refer to Exercise 4. Computer output from the least-squares regression analysis on the perch data is shown below.
$$
\begin{array}{lllll}
\text { Predictor } & \text { Coef } & \text { Stdev. } & \text { t-ratio } & \text { p } \\
\text { Constant } & 0.12049 & 0.09269 & 1.30 & 0.215 \\
\text { Perch } & 0.008569 & 0.002456 & 3.49 & 0.004 \\
\mathrm{~S}=0.1886 & \mathrm{R}-\mathrm{Sq}=46.5 \% & \mathrm{R}-\mathrm{Sq}(\mathrm{adj}) & =42.7 \%
\end{array}
$$
The model for regression inference has three parameters: $\alpha$, $\beta$, and $\sigma$. Explain what each parameter represents in context. Then provide an estimate for each.

Lainey Roebuck
Numerade Educator

Problem 7

Beer and BAC Refer to Exercise 5.
(a) Give the standard error of the slope, $\mathrm{SE}_{b}$. Interpret this value in context.
(b) Find the critical value for a $99 \%$ confidence interval for the slope of the true regression line. Then calculate the confidence interval. Show your work.
(c) Interpret the interval from part (b) in context.
(d) Explain the meaning of "99\% confident" in context.

Lainey Roebuck
Numerade Educator

Problem 8

Prey attracts predators Refer to Exercise 6.
(a) Give the standard error of the slope, $\mathrm{SE}_{b}$. Interpret this value in context.
(b) Find the critical value for a $90 \%$ confidence interval for the slope of the true regression line. Then calculate the confidence interval. Show your work.
(c) Interpret the interval from part (b) in context.
(d) Explain the meaning of "90\% confident" in context.

Lainey Roebuck
Numerade Educator

Problem 9

Beavers and beetles Do beavers benefit beetles? Researchers laid out 23 circular plots, each 4 meters in diameter, at random in an area where beavers were cutting down cottonwood trees. In each plot, they counted the number of stumps from trees cut by beavers and the number of clusters of beetle larvae. Ecologists think that the new sprouts from stumps are more tender than other cottonwood growth, so that beetles prefer them. If so, more stumps should produce more beetle larvae. ${ }^{8}$ Minitab output for a regression analysis on these data is shown below. Construct and interpret a $99 \%$ confidence interval for the slope of the population regression line. Assume that the conditions for performing inference are met.
$$
\begin{aligned}
&\text { Regression Analysis: Beetle larvae versus Stumps }\\
&\begin{array}{lllll}
\text { Predictor } & \text { Coef } & \text { SE Coef } & \text { T } & \text { P } \\
\text { Constant } & -1.286 & 2.853 & -0.45 & 0.657 \\
\text { Stumps } & 11.894 & 1.136 & 10.47 & 0.000
\end{array}\\
&\begin{array}{lll}
S=6.41939 & R-S q=83.9 \% & R-S q(a d j)=83.1 \%
\end{array}
\end{aligned}
$$

Jon Southam
Numerade Educator

Problem 10

Ideal proportions The students in Mr. Shenk's class measured the arm spans and heights (in inches) of a random sample of 18 students from their large high school. Some computer output from a least-squares regression analysis on these data is shown below. Construct and interpret a $90 \%$ confidence interval for the slope of the population regression line. Assume that the conditions for performing inference are met.
$\begin{array}{lllrl}\text { Predictor } & \text { Coef } & \text { Stdev } & \text { t-ratio } & \text { p } \\ \text { Constant } & 11.547 & 5.600 & 2.06 & 0.056 \\ \text { Armspan } & 0.84042 & 0.08091 & 10.39 & 0.000 \\ \mathrm{~S}=1.613 & \mathrm{R}-\mathrm{Sq}=87.1 \% & \mathrm{R}-\mathrm{Sq}(\mathrm{adj}) & =86.3 \%\end{array}$

Sheryl Ezze
Numerade Educator

Problem 11

Beavers and beetles Refer to Exercise 9.
(a) How many clusters of beetle larvae would you predict in a circular plot with 5 tree stumps cut by beavers? Show your work.
(b) About how far off do you expect the prediction in part (a) to be from the actual number of clusters of beetle larvae? Justify your answer.

Colin Fenster
Numerade Educator

Problem 12

Ideal proportions Refer to Exercise 10.
(a) What height would you predict for a student with an arm span of 76 inches? Show your work.
(b) About how far off do you expect the prediction in part (a) to be from the student's actual height? Justify your answer.

Lainey Roebuck
Numerade Educator

Problem 13

Weeds among the corn Lamb's-quarter is a common weed that interferes with the growth of corn. An agriculture researcher planted corn at the same rate in 16 small plots of ground and then weeded the plots by hand to allow a fixed number of lamb's-quarter plants to grow in each meter of corn row. The decision of how many of these plants to leave in each plot was made at random. No other weeds were allowed to grow. Here are the yields of corn (bushels per acre) in each of the plots: Some computer output from a least-squares regression analysis on these data is shown below.
$$
\begin{array}{lllll}
\text { Predictor } & \text { Coef } & \text { SE Coef } & \text { T } & \text { P } \\
\text { Constant } & 166.483 & 2.725 & 61.11 & 0.000 \\
\begin{array}{l}
\text { Weeds per } \\
\text { meter }
\end{array} & -1.0987 & 0.5712 & -1.92 & 0.075 \\
\mathrm{~S}=7.97665 & \mathrm{R}-\mathrm{Sq}=20.9 \% & \mathrm{R}-\mathrm{Sq}(\mathrm{adj}) & =15.3 \%
\end{array}
$$
(a) What is the equation of the least-squares regression line for predicting corn yield from the number of lamb's-quarter plants per meter? Interpret the slope and $y$ intercept of the regression line in context.
(b) Explain what the value of $s$ means in this setting.
(c) Do these data provide convincing evidence at the $\alpha=0.05$ level that more weeds reduce corn yield? Assume that the conditions for performing inference are met.

Lainey Roebuck
Numerade Educator

Problem 14

Time at the table Does how long young children remain at the lunch table help predict how much they eat? Here are data on a random sample of 20 toddlers observed over several months. ${ }^{10}$ "Time" is the average number of minutes a child spent at the table when lunch was served. "Calories" is the average number of calories the child consumed during lunch, calculated from careful observation of what the child ate each day. Some computer output from a least-squares regression analysis on these data is shown below.
$$
\begin{array}{lllll}
\text { Predictor } & \text { Coef } & \text { SE Coef } & \text { T } & \text { P } \\
\text { Constant } & 560.65 & 29.37 & 19.09 & 0.000 \\
\text { Time } & -3.0771 & 0.8498 & -3.62 & 0.002 \\
S=23.3980 & R-S q=42.1 \% & R-S q(a d j)=38.9 \%
\end{array}
$$
(a) What is the equation of the least-squares regression line for predicting calories consumed from time at the table? Interpret the slope of the regression line in context. Does it make sense to interpret the $y$ intercept in this case? Why or why not?
(b) Explain what the value of $s$ means in this setting.
(c) Do these data provide convincing evidence at the $\alpha=0.01$ level of a linear relationship between time at the table and calories consumed in the population of toddlers? Assume that the conditions for performing inference are met.

Lainey Roebuck
Numerade Educator

Problem 15

Is wine good for your heart? A researcher from the University of California, San Diego, collected data on average per capita wine consumption and heart disease death rate in a random sample of 19 countries for which data were available. The following table displays the data. ${ }^{11}$
$$
\begin{array}{cccc}
\begin{array}{c}
\text { Alcohol from } \\
\text { wine } \\
\text { (liters/year) }
\end{array} & \begin{array}{c}
\text { Heart disease } \\
\text { death rate } \\
\text { (per 100,000) }
\end{array} & \begin{array}{c}
\text { Alcohol from } \\
\text { wine } \\
\text { (liters/year) }
\end{array} & \begin{array}{c}
\text { Heart disease } \\
\text { death rate } \\
\text { (per 100,000) }
\end{array} \\
2.5 & 211 & 7.9 & 107 \\
3.9 & 167 & 1.8 & 167 \\
2.9 & 131 & 1.9 & 266 \\
2.4 & 191 & 0.8 & 227 \\
2.9 & 220 & 6.5 & 86 \\
0.8 & 297 & 1.6 & 207 \\
9.1 & 71 & 5.8 & 115 \\
2.7 & 172 & 1.3 & 285 \\
0.8 & 211 & 1.2 & 199 \\
0.7 & 300 & &
\end{array}
$$
Is there statistically significant evidence of a negative linear relationship between wine consumption and heart disease deaths in the population of countries? Carry out an appropriate significance test at the $\alpha=0.05$ level.
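One way to set up the computation for this test is sketched below in plain Python. It fits the least-squares line to the 19 (wine, death rate) pairs from the table, then forms the test statistic $t = b / \mathrm{SE}_{b}$ for $H_{0}: \beta=0$ versus $H_{a}: \beta<0$. The variable names are illustrative, and the critical value 1.740 is an assumption read from a standard $t$ table (one-sided, $\alpha=0.05$, df = 17). The sketch does not check the conditions for inference, which still need to be examined separately.

```python
from math import sqrt

# Data transcribed from the table (left column pair first, then right column pair).
wine = [2.5, 3.9, 2.9, 2.4, 2.9, 0.8, 9.1, 2.7, 0.8, 0.7,
        7.9, 1.8, 1.9, 0.8, 6.5, 1.6, 5.8, 1.3, 1.2]
deaths = [211, 167, 131, 191, 220, 297, 71, 172, 211, 300,
          107, 167, 266, 227, 86, 207, 115, 285, 199]

n = len(wine)
x_bar = sum(wine) / n
y_bar = sum(deaths) / n

# Sums of squares and cross-products about the means.
sxx = sum((x - x_bar) ** 2 for x in wine)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(wine, deaths))

b = sxy / sxx            # sample slope
a = y_bar - b * x_bar    # sample intercept

# Standard deviation of the residuals, with n - 2 degrees of freedom.
residuals = [y - (a + b * x) for x, y in zip(wine, deaths)]
s = sqrt(sum(r ** 2 for r in residuals) / (n - 2))

se_b = s / sqrt(sxx)     # standard error of the slope
t_stat = b / se_b        # test statistic for H0: beta = 0

# Assumption: one-sided critical value for alpha = 0.05, df = 17, from a t table.
t_star = 1.740
reject_h0 = t_stat < -t_star

print(f"slope = {b:.3f}, SE_b = {se_b:.3f}, t = {t_stat:.2f}, reject H0: {reject_h0}")
```

Comparing the statistic with a table value keeps the sketch free of third-party libraries; a statistics package could report the $P$-value directly instead.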

Lainey Roebuck
Numerade Educator

Problem 16

The professor swims Here are data on the time (in minutes) Professor Moore takes to swim 2000 yards and his pulse rate (beats per minute) after swimming on a random sample of 23 days:
$$
\begin{array}{lrrrrrr}
\hline \text { Time: } & 34.12 & 35.72 & 34.72 & 34.05 & 34.13 & 35.72 \\
\text { Pulse: } & 152 & 124 & 140 & 152 & 146 & 128 \\
\text { Time: } & 36.17 & 35.57 & 35.37 & 35.57 & 35.43 & 36.05 \\
\text { Pulse: } & 136 & 144 & 148 & 144 & 136 & 124 \\
\text { Time: } & 34.85 & 34.70 & 34.75 & 33.93 & 34.60 & 34.00 \\
\text { Pulse: } & 148 & 144 & 140 & 156 & 136 & 148 \\
\text { Time: } & 34.35 & 35.62 & 35.68 & 35.28 & 35.97 & \\
\text { Pulse: } & 148 & 132 & 124 & 132 & 139 & \\
\hline
\end{array}
$$
Is there statistically significant evidence of a negative linear relationship between Professor Moore's swim time and his pulse rate in the population of days on which he swims 2000 yards? Carry out an appropriate significance test at the $\alpha=0.05$ level.

Lainey Roebuck
Numerade Educator

Problem 17

Stats teachers' cars A random sample of AP® Statistics teachers was asked to report the age (in years) and mileage of their primary vehicles. A scatterplot of the data is shown at top right.
Computer output from a least-squares regression analysis of these data is shown below $(\mathrm{df}=19)$. Assume that the conditions for regression inference are met.
$$
\begin{array}{lllll}
\text { Variable } & \text { Coef } & \text { SE Coef } & \text { t-ratio } & \text { prob } \\
\text { Constant } & 7288.54 & 6591 & 1.11 & 0.2826 \\
\text { Car age } & 11630.6 & 1249 & & <0.0001 \\
\mathrm{~S}=19280 & \mathrm{R}-\mathrm{Sq}=82.0 \% & \mathrm{R}-\mathrm{Sq}(\mathrm{adj}) & =81.1 \%
\end{array}
$$
(a) Verify that the $95 \%$ confidence interval for the slope of the population regression line is $(9016.4,\ 14{,}244.8)$.
(b) A national automotive group claims that the typical driver puts 15,000 miles per year on his or her main vehicle. We want to test whether AP® Statistics teachers are typical drivers. Explain why an appropriate pair of hypotheses for this test is $H_{0}: \beta=15,000$ versus $H_{a}: \beta \neq 15,000$.
(c) Compute the test statistic and $P$-value for the test in part (b). What conclusion would you draw at the $\alpha=0.05$ significance level?
(d) Does the confidence interval in part (a) lead to the same conclusion as the test in part (c)? Explain.
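The arithmetic behind parts (a) and (c) can be sketched in a few lines of Python. The slope, standard error, and degrees of freedom come from the computer output in this exercise. The critical value 2.093 is an assumption taken from a standard $t$ table (95% confidence, df = 19), and the variable names are illustrative only.

```python
# Values read from the computer output in this exercise.
b = 11630.6      # sample slope (miles per year of car age)
se_b = 1249.0    # standard error of the slope
df = 19          # degrees of freedom stated in the exercise

# Assumption: 95% critical value for df = 19, read from a t table.
t_star = 2.093

# Confidence interval: b +/- t* x SE_b.
margin = t_star * se_b
ci = (b - margin, b + margin)
print(f"95% CI for the slope: ({ci[0]:.1f}, {ci[1]:.1f})")

# Test statistic for H0: beta = 15000 versus Ha: beta != 15000.
beta_0 = 15000.0
t_stat = (b - beta_0) / se_b
print(f"t = {t_stat:.2f} with df = {df}")
```

The interval agrees with the one quoted in part (a). The two-sided $P$-value for the statistic would still need to be looked up in a table or computed with software.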

Lainey Roebuck
Numerade Educator

Problem 18

Paired tires Exercise 71 in Chapter 8 (page 529) compared two methods for estimating tire wear. The first method used the amount of weight lost by a tire. The second method used the amount of wear in the grooves of the tire. A random sample of 16 tires was obtained. Both methods were used to estimate the total distance traveled by each tire. The following scatterplot displays the two estimates (in thousands of miles) for each tire. ${ }^{12}$ Computer output from a least-squares regression analysis of these data is shown below. Assume that the conditions for regression inference are met.
$$
\begin{array}{lllrl}
\text { Predictor } & \text { Coef } & \text { SE Coef } & \text { T } & \text { P } \\
\text { Constant } & 1.351 & 2.105 & 0.64 & 0.531 \\
\text { Weight } & 0.79021 & 0.07104 & 11.12 & 0.000 \\
\mathrm{~S}=2.62078 & \mathrm{R}-\mathrm{Sq}=89.8 \% & \mathrm{R}-\mathrm{Sq}(\mathrm{adj}) & =89.1 \%
\end{array}
$$
(a) Verify that the $99 \%$ confidence interval for the slope of the population regression line is $(0.5787,\ 1.0017)$.
(b) Researchers want to test whether there is a difference in the two methods of estimating tire wear. Explain why the researchers might think that an appropriate pair of hypotheses for this test is $H_{0}: \beta=1$ versus $H_{a}: \beta \neq 1$.
(c) Compute the test statistic and $P$-value for the test in part (b). What conclusion would you draw at the $\alpha=0.01$ significance level?
(d) Does the confidence interval in part (a) lead to the same conclusion as the test in part (c)? Explain.

Lainey Roebuck
Numerade Educator

Problem 19

Multiple choice: Select the best answer for Exercises 19 to 24, which are based on the following information. To determine property taxes, Florida reappraises real estate every year, and the county appraiser's Web site lists the current "fair market value" of each piece of property. Property usually sells for somewhat more than the appraised market value. We collected data on the appraised market values $x$ and actual selling prices $y$ (in thousands of dollars) of a random sample of 16 condominium units in Florida. We checked that the conditions for inference about the slope of the population regression line are met. Here is part of the Minitab output from a least-squares regression analysis using these data. ${ }^{13}$
$$
\begin{array}{lllll}
\text { Predictor } & \text { Coef } & \text { SE Coef } & \text { T } & \text { P } \\
\text { Constant } & 127.27 & 79.49 & 1.60 & 0.132 \\
\text { Appraisal } & 1.0466 & 0.1126 & 9.29 & 0.000 \\
\mathrm{~S}=69.7299 & \mathrm{R}-\mathrm{Sq}=86.1 \% & \mathrm{R}-\mathrm{Sq}(\mathrm{adj}) & =85.1 \%
\end{array}
$$
The equation of the least-squares regression line for predicting selling price from appraised value is
(a) price $=79.49+0.1126$ (appraised value).
(b) price $=0.1126+1.0466$ (appraised value).
(c) price $=127.27+1.0466$ (appraised value).
(d) price $=1.0466+127.27$ (appraised value).
(e) price $=1.0466+69.7299$ (appraised value).

Ana Carolina Da Cruz
Numerade Educator

Problem 20

Multiple choice: Select the best answer for Exercises 19 to 24, which are based on the following information. To determine property taxes, Florida reappraises real estate every year, and the county appraiser's Web site lists the current "fair market value" of each piece of property. Property usually sells for somewhat more than the appraised market value. We collected data on the appraised market values $x$ and actual selling prices $y$ (in thousands of dollars) of a random sample of 16 condominium units in Florida. We checked that the conditions for inference about the slope of the population regression line are met. Here is part of the Minitab output from a least-squares regression analysis using these data. ${ }^{13}$
$$
\begin{array}{lllll}
\text { Predictor } & \text { Coef } & \text { SE Coef } & \text { T } & \text { P } \\
\text { Constant } & 127.27 & 79.49 & 1.60 & 0.132 \\
\text { Appraisal } & 1.0466 & 0.1126 & 9.29 & 0.000 \\
\mathrm{~S}=69.7299 & \mathrm{R}-\mathrm{Sq}=86.1 \% & \mathrm{R}-\mathrm{Sq}(\mathrm{adj}) & =85.1 \%
\end{array}
$$
The slope $\beta$ of the population regression line describes
(a) the exact increase in the selling price of an individual unit when its appraised value increases by $\$ 1000$.
(b) the average increase in the appraised value in a population of units when selling price increases by $\$ 1000$.
(c) the average increase in selling price in a population of units when appraised value increases by $\$ 1000$.
(d) the average increase in the appraised value in the sample of units when selling price increases by $\$ 1000$.
(e) the average increase in selling price in the sample of units when the appraised value increases by $\$ 1000$.

Sheryl Ezze
Numerade Educator

Problem 21

Multiple choice: Select the best answer for Exercises 19 to 24, which are based on the following information. To determine property taxes, Florida reappraises real estate every year, and the county appraiser's Web site lists the current "fair market value" of each piece of property. Property usually sells for somewhat more than the appraised market value. We collected data on the appraised market values $x$ and actual selling prices $y$ (in thousands of dollars) of a random sample of 16 condominium units in Florida. We checked that the conditions for inference about the slope of the population regression line are met. Here is part of the Minitab output from a least-squares regression analysis using these data. ${ }^{13}$
$$
\begin{array}{lllll}
\text { Predictor } & \text { Coef } & \text { SE Coef } & \text { T } & \text { P } \\
\text { Constant } & 127.27 & 79.49 & 1.60 & 0.132 \\
\text { Appraisal } & 1.0466 & 0.1126 & 9.29 & 0.000 \\
\mathrm{~S}=69.7299 & \mathrm{R}-\mathrm{Sq}=86.1 \% & \mathrm{R}-\mathrm{Sq}(\mathrm{adj}) & =85.1 \%
\end{array}
$$
Is there convincing evidence that selling price increases as appraised value increases? To answer this question, test the hypotheses
(a) $\quad H_{0}: \beta=0$ versus $H_{a}: \beta > 0$.
(b) $H_{0}: \beta=0$ versus $H_{a}: \beta < 0$.
(c) $\quad H_{0}: \beta=0$ versus $H_{a}: \beta \neq 0$.
(d) $H_{0}: \beta > 0$ versus $H_{a}: \beta=0$.
(e) $\quad H_{0}: \beta=1$ versus $H_{a}: \beta>1$

Ana Carolina Da Cruz
Numerade Educator

Problem 22

Multiple choice: Select the best answer for Exercises 19 to 24, which are based on the following information. To determine property taxes, Florida reappraises real estate every year, and the county appraiser's Web site lists the current "fair market value" of each piece of property. Property usually sells for somewhat more than the appraised market value. We collected data on the appraised market values $x$ and actual selling prices $y$ (in thousands of dollars) of a random sample of 16 condominium units in Florida. We checked that the conditions for inference about the slope of the population regression line are met. Here is part of the Minitab output from a least-squares regression analysis using these data. ${ }^{13}$
$$
\begin{array}{lllll}
\text { Predictor } & \text { Coef } & \text { SE Coef } & \text { T } & \text { P } \\
\text { Constant } & 127.27 & 79.49 & 1.60 & 0.132 \\
\text { Appraisal } & 1.0466 & 0.1126 & 9.29 & 0.000 \\
\mathrm{~S}=69.7299 & \mathrm{R}-\mathrm{Sq}=86.1 \% & \mathrm{R}-\mathrm{Sq}(\mathrm{adj}) & =85.1 \%
\end{array}
$$
Which of the following is the best interpretation for the value 0.1126 in the computer output?
(a) For each increase of $\$ 1000$ in appraised value, the average selling price increases by about 0.1126.
(b) When using this model to predict selling price, the predictions will typically be off by about 0.1126.
(c) $11.26 \%$ of the variation in selling price is accounted for by the linear relationship between selling price and appraised value.
(d) There is a weak, positive linear relationship between selling price and appraised value.
(e) In repeated samples of size 16, the sample slope will typically vary from the population slope by about 0.1126.

Sheryl Ezze
Numerade Educator

Problem 23

Multiple choice: Select the best answer for Exercises 19 to 24, which are based on the following information. To determine property taxes, Florida reappraises real estate every year, and the county appraiser's Web site lists the current "fair market value" of each piece of property. Property usually sells for somewhat more than the appraised market value. We collected data on the appraised market values $x$ and actual selling prices $y$ (in thousands of dollars) of a random sample of 16 condominium units in Florida. We checked that the conditions for inference about the slope of the population regression line are met. Here is part of the Minitab output from a least-squares regression analysis using these data. ${ }^{13}$
$$
\begin{array}{lllll}
\text { Predictor } & \text { Coef } & \text { SE Coef } & \text { T } & \text { P } \\
\text { Constant } & 127.27 & 79.49 & 1.60 & 0.132 \\
\text { Appraisal } & 1.0466 & 0.1126 & 9.29 & 0.000 \\
\mathrm{~S}=69.7299 & \mathrm{R}-\mathrm{Sq}=86.1 \% & \mathrm{R}-\mathrm{Sq}(\mathrm{adj}) & =85.1 \%
\end{array}
$$
A $95 \%$ confidence interval for the population slope $\beta$ is
(a) $1.0466 \pm 1.046$.
(b) $1.0466 \pm 0.2415$.
(c) $1.0466 \pm 0.2387$.
(d) $1.0466 \pm 0.2207$.
(e) $1.0466 \pm 0.2400$.

Ana Carolina Da Cruz
Numerade Educator

Problem 24

Multiple choice: Select the best answer for Exercises 19 to 24, which are based on the following information. To determine property taxes, Florida reappraises real estate every year, and the county appraiser's Web site lists the current "fair market value" of each piece of property. Property usually sells for somewhat more than the appraised market value. We collected data on the appraised market values $x$ and actual selling prices $y$ (in thousands of dollars) of a random sample of 16 condominium units in Florida. We checked that the conditions for inference about the slope of the population regression line are met. Here is part of the Minitab output from a least-squares regression analysis using these data. ${ }^{13}$
$$
\begin{array}{lllll}
\text { Predictor } & \text { Coef } & \text { SE Coef } & \text { T } & \text { P } \\
\text { Constant } & 127.27 & 79.49 & 1.60 & 0.132 \\
\text { Appraisal } & 1.0466 & 0.1126 & 9.29 & 0.000 \\
\mathrm{~S}=69.7299 & \mathrm{R}-\mathrm{Sq}=86.1 \% & \mathrm{R}-\mathrm{Sq}(\mathrm{adj}) & =85.1 \%
\end{array}
$$
Which of the following would have resulted in a violation of the conditions for inference?
(a) If the entire sample was selected from one neighborhood
(b) If the sample size was cut in half
(c) If the scatterplot of $x=$ appraised value and $y=$ selling price did not show a perfect linear relationship
(d) If the histogram of selling prices had an outlier
(e) If the standard deviation of appraised values was different from the standard deviation of selling prices

John Long
Numerade Educator

Problem 25

Exercises 25 to 28 refer to the following setting. Does the color in which words are printed affect your ability to read them? Do the words themselves affect your ability to name the color in which they are printed? Mr. Starnes designed a study to investigate these questions using the 16 students in his AP® Statistics class as subjects. Each student performed two tasks in a random order while a partner timed: (1) read 32 words aloud as quickly as possible, and (2) say the color in which each of 32 words is printed as quickly as possible. Try both tasks for yourself using the word list below.
$$
\begin{array}{llll}
\text { YELLOW } & \text { RED } & \text { BLUE } & \text { GREEN } \\
\text { RED } & \text { GREEN } & \text { YELLOW } & \text { YELLOW } \\
\text { GREEN } & \text { RED } & \text { BLUE } & \text { BLUE } \\
\text { YELLOW } & \text { BLUE } & \text { GREEN } & \text { RED } \\
\text { BLUE } & \text { YELLOW } & \text { RED } & \text { RED } \\
\text { RED } & \text { BLUE } & \text { YELLOW } & \text { GREEN } \\
\text { BLUE } & \text { GREEN } & \text { GREEN } & \text { BLUE } \\
\text { GREEN } & \text { YELLOW } & \text { RED } & \text { YELLOW }
\end{array}
$$
Color words (4.2) Let's review the design of the study.
(a) Explain why this was an experiment and not an observational study.
(b) Did Mr. Starnes use a completely randomized design or a randomized block design? Why do you think he chose this experimental design?
(c) Explain the purpose of the random assignment in the context of the study.
The data from Mr. Starnes's experiment are shown below. For each subject, the time to perform the two tasks is given to the nearest second.
$$
\begin{array}{cccccc}
\hline \text { Subject } & \text { Words } & \text { Colors } & \text { Subject } & \text { Words } & \text { Colors } \\
1 & 13 & 20 & 9 & 10 & 16 \\
2 & 10 & 21 & 10 & 9 & 13 \\
3 & 15 & 22 & 11 & 11 & 11 \\
4 & 12 & 25 & 12 & 17 & 26 \\
5 & 13 & 17 & 13 & 15 & 20 \\
6 & 11 & 13 & 14 & 15 & 15 \\
7 & 14 & 32 & 15 & 12 & 18 \\
8 & 16 & 21 & 16 & 10 & 18 \\
\hline
\end{array}
$$

Lainey Roebuck
Numerade Educator

Problem 26

Exercises 25 to 28 refer to the following setting. Does the color in which words are printed affect your ability to read them? Do the words themselves affect your ability to name the color in which they are printed? Mr. Starnes designed a study to investigate these questions using the 16 students in his AP® Statistics class as subjects. Each student performed two tasks in a random order while a partner timed: (1) read 32 words aloud as quickly as possible, and (2) say the color in which each of 32 words is printed as quickly as possible. Try both tasks for yourself using the word list below.
$$
\begin{array}{llll}
\text { YELLOW } & \text { RED } & \text { BLUE } & \text { GREEN } \\
\text { RED } & \text { GREEN } & \text { YELLOW } & \text { YELLOW } \\
\text { GREEN } & \text { RED } & \text { BLUE } & \text { BLUE } \\
\text { YELLOW } & \text { BLUE } & \text { GREEN } & \text { RED } \\
\text { BLUE } & \text { YELLOW } & \text { RED } & \text { RED } \\
\text { RED } & \text { BLUE } & \text { YELLOW } & \text { GREEN } \\
\text { BLUE } & \text { GREEN } & \text { GREEN } & \text { BLUE } \\
\text { GREEN } & \text { YELLOW } & \text { RED } & \text { YELLOW }
\end{array}
$$
Color words (1.3) Do the data provide evidence of a difference in the average time required to perform the two tasks? Include an appropriate graph and numerical summaries in your answer.

Lainey Roebuck
Numerade Educator

Problem 27

Exercises 25 to 28 refer to the following setting. Does the color in which words are printed affect your ability to read them? Do the words themselves affect your ability to name the color in which they are printed? Mr. Starnes designed a study to investigate these questions using the 16 students in his AP® Statistics class as subjects. Each student performed two tasks in a random order while a partner timed: (1) read 32 words aloud as quickly as possible, and (2) say the color in which each of 32 words is printed as quickly as possible. Try both tasks for yourself using the word list below.
$$
\begin{array}{llll}
\text { YELLOW } & \text { RED } & \text { BLUE } & \text { GREEN } \\
\text { RED } & \text { GREEN } & \text { YELLOW } & \text { YELLOW } \\
\text { GREEN } & \text { RED } & \text { BLUE } & \text { BLUE } \\
\text { YELLOW } & \text { BLUE } & \text { GREEN } & \text { RED } \\
\text { BLUE } & \text { YELLOW } & \text { RED } & \text { RED } \\
\text { RED } & \text { BLUE } & \text { YELLOW } & \text { GREEN } \\
\text { BLUE } & \text { GREEN } & \text { GREEN } & \text { BLUE } \\
\text { GREEN } & \text { YELLOW } & \text { RED } & \text { YELLOW }
\end{array}
$$
Color words (9.3) Explain why it is not safe to use paired $t$ procedures to do inference about the difference in the mean time to complete the two tasks.

Lainey Roebuck
Numerade Educator

Problem 28

Exercises 25 to 28 refer to the following setting. Does the color in which words are printed affect your ability to read them? Do the words themselves affect your ability to name the color in which they are printed? Mr. Starnes designed a study to investigate these questions using the 16 students in his AP® Statistics class as subjects. Each student performed two tasks in a random order while a partner timed: (1) read 32 words aloud as quickly as possible, and (2) say the color in which each of 32 words is printed as quickly as possible. Try both tasks for yourself using the word list below.
$$
\begin{array}{llll}
\text { YELLOW } & \text { RED } & \text { BLUE } & \text { GREEN } \\
\text { RED } & \text { GREEN } & \text { YELLOW } & \text { YELLOW } \\
\text { GREEN } & \text { RED } & \text { BLUE } & \text { BLUE } \\
\text { YELLOW } & \text { BLUE } & \text { GREEN } & \text { RED } \\
\text { BLUE } & \text { YELLOW } & \text { RED } & \text { RED } \\
\text { RED } & \text { BLUE } & \text { YELLOW } & \text { GREEN } \\
\text { BLUE } & \text { GREEN } & \text { GREEN } & \text { BLUE } \\
\text { GREEN } & \text { YELLOW } & \text { RED } & \text { YELLOW }
\end{array}
$$
Color words (3.1,3.2,12.1) Can we use a student's word task time to predict his or her color task time?
(a) Make an appropriate scatterplot to help answer this question. Describe what you see.
(b) Use your calculator to find the equation of the least-squares regression line. Define any symbols you use.
(c) Find and interpret the residual for the student who completed the word task in 9 seconds.
(d) Assume that the conditions for performing inference about the slope of the true regression line are met. The $P$-value for a test of $H_{0}: \beta=0$ versus $H_{a}: \beta>0$ is 0.0215. Explain what this value means in context.
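For readers working without a calculator, parts (b) and (c) can be sketched in Python. The sketch assumes the word and color times from the data table in Exercise 25 (subjects 1 to 16, in order), with word time as the explanatory variable $x$ and color time as the response $y$. The list and variable names are illustrative only.

```python
# Times in seconds for subjects 1-16, transcribed from the Exercise 25 data table.
words = [13, 10, 15, 12, 13, 11, 14, 16, 10, 9, 11, 17, 15, 15, 12, 10]
colors = [20, 21, 22, 25, 17, 13, 32, 21, 16, 13, 11, 26, 20, 15, 18, 18]

n = len(words)
x_bar = sum(words) / n
y_bar = sum(colors) / n

# Least-squares slope and intercept: predicted color time = a + b * (word time).
sxx = sum((x - x_bar) ** 2 for x in words)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(words, colors))
b = sxy / sxx
a = y_bar - b * x_bar
print(f"predicted color time = {a:.3f} + {b:.4f} * (word time)")

# Residual = observed - predicted for the one student whose word time was 9 seconds.
idx = words.index(9)
predicted = a + b * words[idx]
residual = colors[idx] - predicted
print(f"observed = {colors[idx]}, predicted = {predicted:.2f}, residual = {residual:.2f}")
```

A negative residual means the student named the colors faster than the line predicts for that word time.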

Lainey Roebuck
Numerade Educator

Problem 29

Refer to the following setting. Yellowstone National Park surveyed a random sample of 1526 winter visitors to the park. They asked each person whether he or she owned, rented, or had never used a snowmobile. Respondents were also asked whether they belonged to an environmental organization (like the Sierra Club). The two-way table summarizes the survey responses.
$$
\begin{array}{lcrr}
\hline & \text { Environmental Clubs } & & \\
{ } & \text { No } & \text { Yes } & \text { Total } \\
\text { Never used } & 445 & 212 & 657 \\
\text { Snowmobile renter } & 497 & 77 & 574 \\
\text { Snowmobile owner } & 279 & 16 & 295 \\
\text { Total } & 1221 & 305 & 1526 \\
\hline
\end{array}
$$
Snowmobiles (5.2,5.3)
(a) If we choose a survey respondent at random, what's the probability that this individual
(i) is a snowmobile owner?
(ii) belongs to an environmental organization or owns a snowmobile?
(iii) has never used a snowmobile given that the person belongs to an environmental organization?
(b) Are the events "is a snowmobile owner" and "belongs to an environmental organization" independent for the members of the sample? Justify your answer.
(c) If we choose two survey respondents at random, what's the probability that
(i) both are snowmobile owners?
(ii) at least one of the two belongs to an environmental organization?
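The probabilities asked for here all reduce to ratios of counts in the two-way table, which can be sketched with exact fractions in Python. The dictionary layout and names are illustrative. Part (c) is treated as sampling two respondents without replacement, which is an assumption about how the question is meant to be read.

```python
from fractions import Fraction

# Counts from the two-way table: rows are snowmobile use, columns are club membership.
counts = {
    "never": {"no": 445, "yes": 212},
    "renter": {"no": 497, "yes": 77},
    "owner": {"no": 279, "yes": 16},
}

n = sum(sum(row.values()) for row in counts.values())    # all respondents
owners = sum(counts["owner"].values())                   # snowmobile owners
members = sum(row["yes"] for row in counts.values())     # environmental club members

# (a)(i) P(owner)
p_owner = Fraction(owners, n)
# (a)(ii) P(member or owner), subtracting the overlap so it is counted once
p_member_or_owner = Fraction(members + owners - counts["owner"]["yes"], n)
# (a)(iii) P(never used | member)
p_never_given_member = Fraction(counts["never"]["yes"], members)

# (b) Independence in the sample would require P(owner | member) == P(owner).
p_owner_given_member = Fraction(counts["owner"]["yes"], members)
independent = p_owner_given_member == p_owner

# (c) Two respondents chosen without replacement (assumption).
p_both_owners = Fraction(owners, n) * Fraction(owners - 1, n - 1)
non_members = n - members
p_at_least_one_member = 1 - Fraction(non_members, n) * Fraction(non_members - 1, n - 1)

for name, p in [("P(owner)", p_owner),
                ("P(member or owner)", p_member_or_owner),
                ("P(never | member)", p_never_given_member),
                ("P(owner | member)", p_owner_given_member),
                ("P(both owners)", p_both_owners),
                ("P(at least one member)", p_at_least_one_member)]:
    print(f"{name} = {float(p):.4f}")
print("independent in the sample:", independent)
```

Exact fractions avoid rounding questions when checking the independence condition in part (b).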

Lainey Roebuck
Numerade Educator

Problem 30

Refer to the following setting. Yellowstone National Park surveyed a random sample of 1526 winter visitors to the park. They asked each person whether he or she owned, rented, or had never used a snowmobile. Respondents were also asked whether they belonged to an environmental organization (like the Sierra Club). The two-way table summarizes the survey responses.
$$
\begin{array}{lcrr}
\hline & \text { Environmental Clubs } & & \\
{ } & \text { No } & \text { Yes } & \text { Total } \\
\text { Never used } & 445 & 212 & 657 \\
\text { Snowmobile renter } & 497 & 77 & 574 \\
\text { Snowmobile owner } & 279 & 16 & 295 \\
\text { Total } & 1221 & 305 & 1526 \\
\hline
\end{array}
$$
Snowmobiles (11.2) Do these data provide convincing evidence at the $5 \%$ significance level of an association between environmental club membership and snowmobile use for the population of visitors to Yellowstone National Park? Justify your answer.
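A chi-square test for association is one natural reading of this question, and its arithmetic can be sketched in Python as follows. Expected counts are (row total)(column total)/(table total). Because the table has (3 - 1)(2 - 1) = 2 degrees of freedom, the upper-tail probability of the chi-square distribution has the closed form $e^{-x/2}$, so no statistics library is needed. The names are illustrative, and stating hypotheses and checking conditions are left to the written answer.

```python
from math import exp

# Observed counts: rows = never used, renter, owner; columns = club No, club Yes.
observed = [
    [445, 212],
    [497, 77],
    [279, 16],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count for each cell under no association.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Chi-square statistic: sum of (observed - expected)^2 / expected over all cells.
chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

df = (len(observed) - 1) * (len(observed[0]) - 1)

# For df = 2 only, the chi-square upper-tail probability equals exp(-x / 2).
p_value = exp(-chi2 / 2)

smallest_expected = min(min(row) for row in expected)
print(f"chi-square = {chi2:.2f}, df = {df}, P-value = {p_value:.3g}")
print(f"smallest expected count = {smallest_expected:.1f}")
```

The closed-form tail probability is specific to 2 degrees of freedom; a table with different dimensions would need a chi-square table or software.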

Lainey Roebuck
Numerade Educator