
Fundamentals of Statistics

Michael Sullivan III

Chapter 4

Describing the Relation between Two Variables

Section 1

Scatter Diagrams and Correlation


Problem 1

Describe the difference between univariate and bivariate data.

Trent Speier
Numerade Educator

Problem 2

Explain what is meant by a lurking variable. Provide an example.

Trent Speier
Numerade Educator

Problem 3

What does it mean to say that two variables are positively associated?

Trent Speier
Numerade Educator

Problem 4

What does it mean to say that the linear correlation coefficient between two variables equals $1$? What would the scatter diagram look like?

Kaylee Mcclellan
Numerade Educator

Problem 5

What does it mean if $r=0$?

Kaylee Mcclellan
Numerade Educator

Problem 6

Is the linear correlation coefficient a resistant measure? Support your answer.

Trent Speier
Numerade Educator

Problem 7

Explain what is wrong with the following statement: "We have concluded that there is a high correlation between the gender of drivers and rates of automobile accidents."

Trent Speier
Numerade Educator

Problem 8

Write a statement that explains the concept of correlation. Include a discussion of the role that $x_{i}-\bar{x}$ and $y_{i}-\bar{y}$ play in the computation.
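
For reference, one standard way to write the sample linear correlation coefficient makes the role of these deviations explicit. The symbols $s_{x}$ and $s_{y}$ (sample standard deviations) and $n$ (number of data pairs) are notation assumed here; they do not appear in the problem statement.

$$r=\frac{\sum\left(\dfrac{x_{i}-\bar{x}}{s_{x}}\right)\left(\dfrac{y_{i}-\bar{y}}{s_{y}}\right)}{n-1}$$

A data pair contributes a positive term when $x_{i}$ and $y_{i}$ lie on the same side of their means and a negative term when they lie on opposite sides.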

Trent Speier
Numerade Educator

Problem 9

Explain what is wrong with the following statement: "A recent study showed that the correlation between the number of acres on a farm and the amount of corn produced was 0.93 bushel."

Trent Speier
Numerade Educator

Problem 10

Explain the difference between correlation and causation. When does a linear correlation coefficient that implies a strong positive correlation also imply causation?

Trent Speier
Numerade Educator

Problem 11

Determine whether the scatter diagram indicates that a linear relation may exist between the two variables. If the relation is linear, determine whether it indicates a positive or negative association between the variables.
CAN'T COPY THE FIGURE

Manisha Sarker
Numerade Educator

Problem 12

Determine whether the scatter diagram indicates that a linear relation may exist between the two variables. If the relation is linear, determine whether it indicates a positive or negative association between the variables.
CAN'T COPY THE FIGURE

Manisha Sarker
Numerade Educator

Problem 13

Determine whether the scatter diagram indicates that a linear relation may exist between the two variables. If the relation is linear, determine whether it indicates a positive or negative association between the variables.
CAN'T COPY THE FIGURE

Manisha Sarker
Numerade Educator

Problem 14

Determine whether the scatter diagram indicates that a linear relation may exist between the two variables. If the relation is linear, determine whether it indicates a positive or negative association between the variables.
CAN'T COPY THE FIGURE

Manisha Sarker
Numerade Educator

Problem 15

Match the linear correlation coefficient to the scatter diagram. The scales on the $x$- and $y$-axes are the same for each scatter diagram.
(a) $r=0.787$
(b) $r=0.523$
(c) $r=0.810$
(d) $r=0.946$
CAN'T COPY THE FIGURES

Alexander Cheng
Numerade Educator

Problem 16

Match the linear correlation coefficient to the scatter diagram. The scales on the $x$- and $y$-axes are the same for each scatter diagram.
(a) $r=-0.969$
(b) $r=-0.049$
(c) $r=-1$
(d) $r=-0.992$
CAN'T COPY THE FIGURES

Trent Speier
Numerade Educator

Problem 17

The following scatter diagram drawn in MINITAB shows the relation between the percentage of the population of a state that has at least a bachelor's degree and the median income (in dollars) of the state for 2003.
CAN'T COPY THE FIGURE
(a) Describe the relation that appears to exist between level of education and median income.
(b) One observation appears to stick out from the rest. Which one? This particular observation is for the state of Alaska. Can you think of any reasons why the state of Alaska might have a high median income, given the proportion of the population that has at least a bachelor's degree?

Trent Speier
Numerade Educator

Problem 18

The following scatter diagram drawn in Excel shows the relation between median income (in dollars) in a state and birthrate (births per 1000 women 15 to 44 years of age).
CAN'T COPY THE FIGURE
(a) Does there appear to be any relation between median income and birthrate?
(b) One observation sticks out from the rest. Which one? This particular observation is for the state of Utah. Are there any explanations for this result?

Trent Speier
Numerade Educator

Problem 19

For the data below, (a) draw a scatter diagram, (b) compute the linear correlation coefficient by hand, and (c) comment on the type of relation that appears to exist between $x$ and $y$.
$$\begin{array}{l|lllll}
x & 2 & 4 & 8 & 8 & 9 \\
\hline y & 1 & 2 & 4 & 5 & 6
\end{array}$$
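
As a rough check on the hand computation in part (b), the sketch below applies the deviation-based definition of the linear correlation coefficient to the five data pairs above. The function name `pearson_r` is a label chosen here for illustration and does not come from the textbook.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample linear correlation coefficient from deviations about the means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sums of squared deviations and of cross-products of deviations.
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return s_xy / sqrt(s_xx * s_yy)

x = [2, 4, 8, 8, 9]
y = [1, 2, 4, 5, 6]
r = pearson_r(x, y)
print(round(r, 3))
```

The result is close to 0.97, which is consistent with a strong positive linear relation. The same function can be reused on the other small data sets in this section.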

Trent Speier
Numerade Educator

Problem 20

For the data below, (a) draw a scatter diagram, (b) compute the linear correlation coefficient by hand, and (c) comment on the type of relation that appears to exist between $x$ and $y$.
$$\begin{array}{l|lllll}
x & 2 & 3 & 5 & 6 & 6 \\
\hline y & 10 & 9 & 8 & 3 & 1
\end{array}$$

Trent Speier
Numerade Educator

Problem 21

For the data below, (a) draw a scatter diagram, (b) compute the linear correlation coefficient by hand, and (c) comment on the type of relation that appears to exist between $x$ and $y$.
$$\begin{array}{l|cccccc}
x & 3 & 5 & 8 & 9 & 12 & 12 \\
\hline y & 18 & 20 & 16 & 10 & 12 & 8
\end{array}$$

Trent Speier
Numerade Educator

Problem 22

For the data below, (a) draw a scatter diagram, (b) compute the linear correlation coefficient by hand, and (c) comment on the type of relation that appears to exist between $x$ and $y$.
$$\begin{array}{c|cccccc}
x & 0 & 1 & 1 & 2 & 4 & 7 \\
\hline y & 3 & 5 & 4 & 6 & 8 & 9
\end{array}$$

Trent Speier
Numerade Educator

Problem 23

A pediatrician wants to determine the relation that may exist between a child's height and head circumference. She randomly selects 11 three-year-old children from her practice, measures their height and head circumference, and obtains the data shown in the table.
(a) If the pediatrician wants to use height to predict head circumference, determine which variable is the explanatory variable and which is the response variable.
(b) Draw a scatter diagram.
(c) Compute the linear correlation coefficient between the height and head circumference of a child.
(d) Comment on the type of relation that appears to exist between the height and head circumference of a child on the basis of the scatter diagram and linear correlation coefficient.
CAN'T COPY THE TABLE

Trent Speier
Numerade Educator

Problem 24

A researcher wants to know if the gestation period of an animal can be used to predict life expectancy. She collects the following data:
CAN'T COPY THE TABLE
(a) Suppose the researcher wants to use the gestation period of an animal to predict its life expectancy. Determine which variable is the explanatory variable and which is the response variable.
(b) Draw a scatter diagram.
(c) Compute the linear correlation coefficient between gestation period and life expectancy.
(d) Comment on the type of relation that appears to exist between gestation period and life expectancy based on the scatter diagram and linear correlation coefficient.
(e) Remove the goat from the data set, and recompute the linear correlation coefficient between the gestation period and life expectancy. What effect did the removal of the data value have on the linear correlation coefficient? Provide a justification for this result.

Trent Speier
Numerade Educator

Problem 25

An engineer wanted to determine how the weight of a car affects gas mileage. The following data represent the weight of various domestic cars and their gas mileage in the city for the 2005 model year.
CAN'T COPY THE TABLE
(a) Determine which variable is the likely explanatory variable and which is the likely response variable.
(b) Draw a scatter diagram of the data.
(c) Compute the linear correlation coefficient between the weight of a car and its miles per gallon in the city.
(d) Comment on the type of relation that appears to exist between the weight of a car and its miles per gallon in the city based on the scatter diagram and the linear correlation coefficient.

Trent Speier
Numerade Educator

Problem 26

Research performed at NASA and led by Emily R. Morey-Holton measured the lengths of the right humerus and right tibia in 11 rats that were sent to space on Spacelab Life Sciences 2. The following data were collected.
CAN'T COPY THE TABLE
(a) Draw a scatter diagram, treating the length of the right humerus as the explanatory variable and the length of the right tibia as the response variable.
(b) Compute the linear correlation coefficient between the length of the right humerus and the length of the right tibia.
(c) Comment on the type of relation that appears to exist between the length of the right humerus and the length of the right tibia based on the scatter diagram and the linear correlation coefficient.
(d) Convert the data to inches $(1 \mathrm{mm}=0.03937 \text { inch })$ and recompute the linear correlation coefficient. What effect did the conversion from millimeters to inches have on the linear correlation coefficient?

Trent Speier
Numerade Educator

Problem 27

The following data represent the number of days absent and the final grade for a sample of college students in a general education course at a large midwestern state university.
CAN'T COPY THE TABLE
(a) The researcher wants to use the number of days absent to predict the final grade. Determine which variable is the explanatory variable and which is the response variable.
(b) Draw a scatter diagram of the data.
(c) Compute the linear correlation coefficient between the number of days absent and the final grade.
(d) Comment on the type of relation that appears to exist between the number of days absent and the final grade.
(e) Will going to class every day guarantee a passing grade? What other factors might need to be taken into account?

Trent Speier
Numerade Educator

Problem 28

A study on antibiotic use among children in Manitoba, Canada, gave the following data for the number of prescriptions per 1000 children $x$ years after 1995.
CAN'T COPY THE TABLE
(a) Draw a scatter diagram of the data, treating year as the explanatory variable. What type of relation, if any, appears to exist between year and antibiotic prescriptions among children?
(b) Compute the linear correlation coefficient between year and antibiotic prescriptions among children.
(c) Comment on the type of relation that appears to exist between year and antibiotic prescriptions among children on the basis of the scatter diagram and the linear correlation coefficient.

Trent Speier
Numerade Educator

Problem 29

A doctor wanted to determine whether there was a relation between a male's age and his HDL (so-called good) cholesterol. He randomly selected 17 of his patients and determined their HDL cholesterol. He obtained the following data.
CAN'T COPY THE TABLE
(a) Draw a scatter diagram of the data, treating age as the explanatory variable. What type of relation, if any, appears to exist between age and HDL cholesterol?
(b) Compute the linear correlation coefficient between age and HDL cholesterol.
(c) Comment on the type of relation that appears to exist between age and HDL cholesterol on the basis of the scatter diagram and the linear correlation coefficient.

Trent Speier
Numerade Educator

Problem 30

Cathy is conducting an experiment to measure the relation between a light bulb's intensity and the distance from the light source. She measures a 100-watt light bulb's intensity 1 meter from the bulb and at 0.1-meter intervals up to 2 meters from the bulb and obtains the following data.
CAN'T COPY THE TABLE
(a) Draw a scatter diagram of the data, treating distance as the explanatory variable.
(b) Do you think that it is appropriate to compute the linear correlation coefficient between distance and intensity? Why?

Trent Speier
Numerade Educator

Problem 31

Does Size Matter? Researchers wondered whether the size of a person's brain was related to the individual's mental capacity. They selected a sample of right-handed introductory psychology students who had SAT scores higher than 1350. The subjects took the Wechsler (1981) Adult Intelligence Scale-Revised exam to obtain their IQ scores. Magnetic resonance imaging (MRI) scans were performed at the same facility for the subjects. The scans consisted of 18 horizontal magnetic resonance images. The computer counted all pixels with nonzero gray scale in each of the 18 images, and the total count served as an index for brain size.
CAN'T COPY THE TABLE
(a) Draw a scatter diagram, treating MRI count as the explanatory variable and IQ as the response variable. Comment on what you see.
(b) Compute the linear correlation coefficient between MRI count and IQ. Do you think that MRI count and IQ are linearly related?
(c) A lurking variable in the analysis is gender. Draw a scatter diagram, treating MRI count as the explanatory variable and IQ as the response variable, but use a different plotting symbol for each gender. For example, use a circle for males and a triangle for females. What do you notice?
(d) Compute the linear correlation coefficient between MRI count and IQ for females. Compute the linear correlation coefficient between MRI count and IQ for males. Do you believe that MRI count and IQ are linearly related? What is the moral?

Trent Speier
Numerade Educator

Problem 32

The following data represent the number of licensed drivers in various age groups and the number of accidents within the age group by gender.
CAN'T COPY THE TABLE
(a) On the same graph, draw a scatter diagram for both males and females. Be sure to use a different plotting symbol for each group. For example, use a square $(\square)$ or an $M$ for males and a plus sign $(+)$ or an $F$ for females. Treat number of licensed drivers as the explanatory variable.
(b) Based on the scatter diagrams, do you think that insurance companies are justified in charging different insurance rates for males and females? Why?
(c) Compute the linear correlation coefficient between number of licensed drivers and number of crashes for males.
(d) Compute the linear correlation coefficient between number of licensed drivers and number of crashes for females.
(e) Which gender has the stronger linear relation between number of licensed drivers and number of crashes? Why?

Jon Southam
Numerade Educator

Problem 33

Suppose we add the Ford Taurus to the data in Problem 25. A Ford Taurus weighs 3305 pounds and gets 19 miles per gallon.
(a) Redraw the scatter diagram with the Taurus included.
(b) Recompute the linear correlation coefficient with the Taurus included.
(c) Compare the results of parts (a) and (b) with the results of Problem 25. Why are the results here reasonable?
(d) Now suppose we add the Toyota Prius to the data in Problem 25 (remove the Taurus). A Toyota Prius weighs 2890 pounds and gets 60 miles per gallon. Redraw the scatter diagram with the Prius included. What do you notice?
(e) Recompute the linear correlation coefficient with the Prius included. How did this new value affect your result?
(f) Why does this observation not follow the pattern of the data?

Jon Southam
Numerade Educator

Problem 34

Suppose we add humans to the data in Problem 24. Humans have a gestation period of 268 days and a life expectancy of 76.5 years.
(a) Redraw the scatter diagram with humans included.
(b) Recompute the linear correlation coefficient with humans included.
(c) Compare the results of (a) and (b) with the results of Problem 24. Provide a statement that explains the results.

Jon Southam
Numerade Educator

Problem 35

Consider the following four data sets:
CAN'T COPY THE TABLE
(a) Compute the linear correlation coefficient for each data set.
(b) Draw a scatter diagram for each data set. Conclude that linear correlation coefficients and scatter diagrams must be used together in any statistical analysis of bivariate data.

Jon Southam
Numerade Educator

Problem 36

The Best Predictor of the Winning Percentage. The ultimate goal in any sport (besides having fun) is to win. One measure of how well a team does is the winning percentage. In baseball, a lot of effort goes into figuring out the variable that best predicts a team's winning percentage. The following data represent the winning percentages of teams in the National League along with potential explanatory variables. Which variable do you think is the best predictor of winning percentage? Why?
CAN'T COPY THE TABLE

Jon Southam
Numerade Educator

Problem 37

One basic theory of investing is diversification. The idea is that you want to have a basket of stocks that do not all "move in the same direction." In other words, if one investment goes down, you don't want a second investment in your portfolio that is also likely to go down. One hallmark of a good portfolio is a low correlation between investments. The following data represent the annual rates of return for various stocks. If you only wish to invest in two of the stocks, which two would you select if your goal is to have low correlation between the two investments? Which two would you select if your goal is to have one stock go up when the other goes down?
CAN'T COPY THE TABLE

Jon Southam
Numerade Educator

Problem 38

Lyme Disease versus Drownings. Lyme disease is an inflammatory disease that results in skin rash and flulike symptoms. It is transmitted through the bite of an infected deer tick. The following data represent the number of reported cases of Lyme disease and the number of drowning deaths for a rural county in the United States.
$$\begin{array}{l|cccccccccccc}
\text{Month} & \mathbf{J} & \mathbf{F} & \mathbf{M} & \mathbf{A} & \mathbf{M} & \mathbf{J} & \mathbf{J} & \mathbf{A} & \mathbf{S} & \mathbf{O} & \mathbf{N} & \mathbf{D} \\
\hline \text{Cases of Lyme disease} & 3 & 2 & 2 & 4 & 5 & 15 & 22 & 13 & 6 & 5 & 4 & 1 \\
\hline \text{Drowning deaths} & 0 & 1 & 2 & 1 & 2 & 9 & 16 & 5 & 3 & 3 & 1 & 0
\end{array}$$
(a) Draw a scatter diagram of the data using cases of Lyme disease as the explanatory variable.
(b) Compute the correlation coefficient for the data.
(c) Based on your results from parts (a) and (b), what type of relation appears to exist between the number of reported cases of Lyme disease and drowning deaths? Do you believe that an increase in cases of Lyme disease causes an increase in drowning deaths?
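
A minimal sketch of the computation in part (b), using the twelve monthly values from the table. The helper name `pearson_r` and the variable names are illustrative choices, not part of the original problem.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample linear correlation coefficient from deviations about the means."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)

# Monthly counts, January through December, copied from the table.
lyme_cases = [3, 2, 2, 4, 5, 15, 22, 13, 6, 5, 4, 1]
drowning_deaths = [0, 1, 2, 1, 2, 9, 16, 5, 3, 3, 1, 0]

r = pearson_r(lyme_cases, drowning_deaths)
print(f"r = {r:.3f}")
```

The coefficient comes out near 0.96. A value this large describes how closely the two counts move together, but it says nothing on its own about whether one causes the other; in this table both counts peak in the summer months.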

Trent Speier
Numerade Educator

Problem 39

Based on data obtained from the CIA World Factbook, the linear correlation coefficient between number of television stations in a country and life expectancy of residents of the country is 0.599. What does this correlation imply? Do you believe that the more television stations a country has, the longer its population can expect to live? Why or why not?

Trent Speier
Numerade Educator

Problem 40

A study on the relationship between caffeine consumption during pregnancy and sudden infant death syndrome (SIDS) showed that heavy caffeine consumption during pregnancy was associated with a significant risk of SIDS. The study was later criticized on the claim that parental smoking was not properly assessed. Explain why this might be a concern.

Trent Speier
Numerade Educator

Problem 41

Consider the following set of data:
$$\begin{array}{lllllllll}
x & 2.2 & 3.7 & 3.9 & 4.1 & 2.6 & 4.1 & 2.9 & 4.7 \\
\hline y & 3.9 & 4.0 & 1.4 & 2.8 & 1.5 & 3.3 & 3.6 & 4.9
\end{array}$$
(a) Draw a scatter diagram of the data and compute the linear correlation coefficient.
(b) Draw a scatter diagram of the data and compute the linear correlation coefficient with the additional data point $(10.4, 9.3)$. Comment on the effect the additional data point has on the linear correlation coefficient. Explain why correlations should always be reported with scatter diagrams.
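
To see how much a single point can move the coefficient, the sketch below computes it for the eight original pairs and again after appending the additional point from part (b). The names `pearson_r`, `r_original`, and `r_with_extra` are illustrative, not taken from the textbook.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample linear correlation coefficient from deviations about the means."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)

x = [2.2, 3.7, 3.9, 4.1, 2.6, 4.1, 2.9, 4.7]
y = [3.9, 4.0, 1.4, 2.8, 1.5, 3.3, 3.6, 4.9]

r_original = pearson_r(x, y)
# Append the additional point (10.4, 9.3) from part (b).
r_with_extra = pearson_r(x + [10.4], y + [9.3])

print(f"without the extra point: r = {r_original:.3f}")
print(f"with the extra point:    r = {r_with_extra:.3f}")
```

The coefficient rises from roughly 0.23 to roughly 0.86, even though only one observation was added. A scatter diagram would make the isolated point obvious, which a single reported number cannot.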

Trent Speier
Numerade Educator

Problem 42

On the basis of the accompanying scatter diagram, explain what is wrong with the following statement: "Because the linear correlation coefficient between age and median income is 0.012, there is no relation between age and median income."
CAN'T COPY THE FIGURE

Trent Speier
Numerade Educator

Problem 43

For each of the following statements, explain whether you think the variables will have positive correlation, negative correlation, or no correlation. Support your opinion.
(a) Number of children in the household under the age of 3 and expenditures on diapers
(b) Interest rates on car loans and number of cars sold
(c) Number of hours per week on the treadmill and cholesterol level
(d) Price of a Big Mac and number of McDonald's french fries sold in a week
(e) Shoe size and IQ

Trent Speier
Numerade Educator

Problem 44

For each of the following statements, explain whether you think the variables will have positive correlation, negative correlation, or no correlation. Support your opinion.
(a) Number of cigarettes smoked by a pregnant woman each week and birth weight of her baby
(b) Annual salary and years of education
(c) Number of doctors on staff at a hospital and number of administrators on staff.
(d) Head circumference and IQ
(e) Number of moviegoers and movie ticket price.

Trent Speier
Numerade Educator

Problem 45

Consider the following data set:
$$\begin{array}{lllllllll}
x & 5 & 6 & 7 & 7 & 8 & 8 & 8 & 8 \\
\hline y & 4.2 & 5 & 5.2 & 5.9 & 6 & 6.2 & 6.1 & 6.9 \\
\hline x & 9 & 9 & 10 & 10 & 11 & 11 & 12 & 12 \\
\hline y & 7.2 & 8 & 8.3 & 7.4 & 8.4 & 7.8 & 8.5 & 9.5
\end{array}$$
(a) Draw a scatter diagram with the $x$-axis starting at 0 and ending at 30 and with the $y$-axis starting at 0 and ending at 20.
(b) Compute the linear correlation coefficient.
(c) Now multiply both $x$ and $y$ by 2.
(d) Draw a scatter diagram of the new data with the $x$-axis starting at 0 and ending at 30 and with the $y$-axis starting at 0 and ending at 20. Compare the scatter diagrams.
(e) Compute the linear correlation coefficient.
(f) Conclude that multiplying each value in the data set by the same positive constant does not affect the correlation between the variables. Explain why this is the case.
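
Parts (b) and (e) can be checked numerically with a short sketch like the one below, which computes the coefficient for the original data and for the data with every value doubled. The helper `pearson_r` and the variable names are assumptions made for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample linear correlation coefficient from deviations about the means."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)

x = [5, 6, 7, 7, 8, 8, 8, 8, 9, 9, 10, 10, 11, 11, 12, 12]
y = [4.2, 5, 5.2, 5.9, 6, 6.2, 6.1, 6.9, 7.2, 8, 8.3, 7.4, 8.4, 7.8, 8.5, 9.5]

r_original = pearson_r(x, y)

# Part (c): multiply every x and every y by 2, then recompute.
x_doubled = [2 * v for v in x]
y_doubled = [2 * v for v in y]
r_doubled = pearson_r(x_doubled, y_doubled)

print(f"original: r = {r_original:.4f}")
print(f"doubled:  r = {r_doubled:.4f}")
```

Both values agree at about 0.95. Doubling multiplies every deviation from the mean by 2, so the numerator and the denominator of the ratio are each scaled by the same factor of 4, which cancels.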

Trent Speier
Numerade Educator

Problem 46

In a study published in the Journal of the American Medical Association (May 16, 2001), researchers found that breast-feeding may help to prevent obesity in kids. In an interview, the head investigator stated, "It's not clear whether breast milk has obesity-preventing properties or the women who are breast-feeding are less likely to have fat kids because they are less likely to be fat themselves and may be more health conscious." Using this researcher's statement, explain what might be wrong with the conclusion that breast-feeding prevents obesity. Identify some lurking variables in the study.

Trent Speier
Numerade Educator

Problem 47

How Well Will You Do in College? The College Board is a membership association composed of schools, colleges, universities, and other educational organizations. One of its better-known programs is the administration of the SAT college entrance exam. In a recent study, the College Board wanted to learn what the best predictor of college grade-point average (GPA) was. The following correlations were obtained based on 48,039 students.
CAN'T COPY THE TABLE
(a) Which variable is the best predictor of college GPA?
(b) Which variable is the worst predictor of college GPA?

Trent Speier
Numerade Educator

Problem 48

Load the correlation by eye applet.
(a) In the lower-left corner of the applet, add 10 points that line up with a positive slope so that the linear correlation between the points is about 0.8. Click "show r" to show the correlation.
(b) Add another point in the upper-right corner of the applet that roughly lines up with the 10 points you have in the lower-left corner. Comment on how the linear correlation coefficient changes.
(c) Drag the point in the upper-right corner straight down. Take note of the change in the linear correlation coefficient. Notice how a single point can have a substantial impact on the linear correlation coefficient.

Jon Southam
Numerade Educator

Problem 49

Load the correlation by eye applet. Add about 10 points that form an upside-down U. Certainly, there is a relation between $x$ and $y$, but what is the value of the linear correlation coefficient? Conclude that a low linear correlation coefficient does not imply there is no relation between two variables; it means there may be no linear relation between two variables.
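
Without the applet, the same idea can be sketched in code. The 11 points below are hypothetical: they are generated from $y=-(x-5)^2$ purely to form a symmetric upside-down U, and are not data from the textbook. The helper name `pearson_r` is likewise an illustrative choice.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Sample linear correlation coefficient from deviations about the means."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)

# Hypothetical points on an upside-down U, symmetric about x = 5.
x = list(range(11))
y = [-(v - 5) ** 2 for v in x]

r = pearson_r(x, y)
print(f"r = {r:.3f}")
```

Here $y$ is completely determined by $x$, yet the linear correlation coefficient is zero, because the positive cross-products on the left half of the curve cancel the negative ones on the right half.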

Jon Southam
Numerade Educator

Problem 50

Load the correlation by eye applet.
(a) Plot about 10 points that follow a linear trend and have a linear correlation coefficient that is close to 0.8.
(b) Clear the applet. Plot about 6 points vertically on top of each other on the left side of the applet. Add a seventh point to the right of the applet. Move the point until the linear correlation coefficient is close to 0.8.
(c) Clear the applet. Plot about 7 points in a U-shaped curve. Add an eighth point and move it around the applet until the linear correlation coefficient is close to 0.8.
(d) Conclude that the same linear correlation coefficient can result from data that follow many different patterns, so you should always plot your data.

Jon Southam
Numerade Educator