We considered the variables smoke and parity, one at a time, in modeling birth weights of babies in Exercises 6.1 and 6.2. A more realistic approach to modeling infant weights is to consider all possibly related variables at once. Other variables of interest include length of pregnancy in days (gestation), mother's age in years (age), mother's height in inches (height), and mother's pregnancy weight in pounds (weight). Below are three observations from this data set.

$$\begin{array}{rccccccc}\hline & \text { bwt } & \text { gestation } & \text { parity } & \text { age } & \text { height } & \text { weight } & \text { smoke } \\\hline 1 & 120 & 284 & 0 & 27 & 62 & 100 & 0 \\2 & 113 & 282 & 0 & 33 & 64 & 135 & 0 \\\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots \\1236 & 117 & 297 & 0 & 38 & 65 & 129 & 0 \\\hline\end{array}$$

The summary table below shows the results of a regression model for predicting the average birth weight of babies based on all of the variables included in the data set.

$$\begin{array}{rrrrr}\hline & \text { Estimate } & \text { Std. Error } & \text { t value } & \operatorname{Pr}(>|\mathrm{t}|) \\\hline \text { (Intercept) } & -80.41 & 14.35 & -5.60 & 0.0000 \\\text { gestation } & 0.44 & 0.03 & 15.26 & 0.0000 \\\text { parity } & -3.33 & 1.13 & -2.95 & 0.0033 \\\text { age } & -0.01 & 0.09 & -0.10 & 0.9170 \\\text { height } & 1.15 & 0.21 & 5.63 & 0.0000 \\\text { weight } & 0.05 & 0.03 & 1.99 & 0.0471 \\\text { smoke } & -8.40 & 0.95 & -8.81 & 0.0000 \\\hline\end{array}$$

(a) Write the equation of the regression line that includes all of the variables.

(b) Interpret the slopes of gestation and age in this context.

(c) The coefficient for parity is different than in the linear model shown in Exercise 6.2. Why might there be a difference?

(d) Calculate the residual for the first observation in the data set.

(e) The variance of the residuals is 249.28, and the variance of the birth weights of all babies in the data set is 332.57. Calculate the $R^{2}$ and the adjusted $R^{2}$. Note that there are 1,236 observations in the data set.
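
The arithmetic behind parts (d) and (e) can be sketched in a few lines of Python. This is a minimal illustration, not an official solution: the coefficients are the rounded values from the summary table, the helper name `predict_bwt` is hypothetical, and the adjusted $R^{2}$ uses the standard formula $1 - \frac{\operatorname{Var}(e)}{\operatorname{Var}(y)} \cdot \frac{n-1}{n-k-1}$ with $k = 6$ predictors, which is an assumption about the convention the exercise intends.

```python
# Coefficients copied from the regression summary table (rounded as printed).
coef = {
    "gestation": 0.44,
    "parity": -3.33,
    "age": -0.01,
    "height": 1.15,
    "weight": 0.05,
    "smoke": -8.40,
}
intercept = -80.41


def predict_bwt(obs):
    """Predicted birth weight for one observation (hypothetical helper)."""
    return intercept + sum(coef[name] * obs[name] for name in coef)


# First observation from the data table; its observed bwt is 120.
first = {"gestation": 284, "parity": 0, "age": 27,
         "height": 62, "weight": 100, "smoke": 0}
observed_bwt = 120

predicted = predict_bwt(first)
residual = observed_bwt - predicted  # residual = observed - predicted

# Part (e): R^2 and adjusted R^2 from the two variances given in the exercise.
var_resid = 249.28
var_bwt = 332.57
n = 1236  # observations
k = 6     # predictors in the model (assumed: all six variables in the table)

r2 = 1 - var_resid / var_bwt
adj_r2 = 1 - (var_resid / var_bwt) * (n - 1) / (n - k - 1)

print(f"predicted = {predicted:.2f}, residual = {residual:.2f}")
print(f"R^2 = {r2:.4f}, adjusted R^2 = {adj_r2:.4f}")
```

Because the table reports rounded coefficients, the residual (about -0.58) is approximate. With 1,236 observations and only six predictors, the adjustment barely changes $R^{2}$ (about 0.2504 versus 0.2468).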

Intro Stats / AP Statistics

Chapter 6

Multiple and logistic regression

Linear Regression and Correlation



Once again, welcome to a new problem. This time we're dealing with regression, which is a powerful way of looking for relationships between quantitative variables. You have your x, which is your independent variable, and then you have your y, which is your dependent variable. From a contextual standpoint, your x variable is the explanatory variable and your y variable is the response variable. You want to see the impact that the explanatory variable has on the response variable, and to do that you build a model with a slope coefficient: $y = \beta_0 + \beta_1 x + e$. This is a simple linear regression model, where $e$ is our error and $\beta_0$ is our intercept, the value of y when x is zero, so when x has no influence. Then, of course, $\beta_1$ is your slope coefficient, which relates your x and your y values.

In this particular problem we're given a simple linear regression equation of the form $\hat{y} = \beta_0 + \beta_1 x$. Here $\hat{y}$ is the predicted y value: if you have a bunch of data points, there is a straight line that estimates them, and since that line is based on a sample, it is a predicted equation. Our equation relates to the birth weights of children. Birth weight has a hat on top of it because it is predicted, the intercept is 119.77, and then we have minus 0.514 times x, where x in this case is cigs: $\widehat{\text{birth weight}} = 119.77 - 0.514 \cdot \text{cigs}$. So it's a relationship between birth weight and cigarettes. Infant birth weight is your y value, the response or dependent variable. The independent variable, cigs, stands for the average number of cigarettes smoked per day during pregnancy.

We want to see if there's a relationship between these two, and we're given a regression model. The first question is to determine the birth weight when the mother smokes zero cigarettes per day during pregnancy, and then when she smokes 20, and to compare the two outcomes. Part two asks whether the relationship between cigarettes and birth weight is causal. By causal we mean: does smoking during pregnancy affect birth weight? Part three: given a birth weight of 125 ounces, how many cigarettes would produce this outcome? And part four: the proportion of non-smokers in this sample is 0.85; how does this help reconcile the outcome from part three? So these are the questions presented, and we're just going to jump in and figure out what the solutions are.

If the number of cigarettes is zero, the equation becomes birth weight equals 119.77 minus 0.514 times zero, and you can see your outcome is 119.77 ounces, because the slope term cancels out. In the second part, if the number of cigarettes is 20, we repeat the process, but now we plug in 20 right next to the slope coefficient, and we end up with 109.49 ounces. We just want to check that to make sure we're getting the right answer. Yes, 109.49. You can see that the relationship between smoking and birth weight is inverse: as mothers smoke more during pregnancy, their infants' birth weight declines.

In the second part of the problem, we're looking to see if there is a causal relationship. We'll say mothers typically smoke before the onset of pregnancy; therefore the relationship between birth weight and smoking is causal. Other factors can affect birth weight outcomes, including the mother's health during pregnancy, environmental factors such as pollution, the mother's income, and so on. Multiple regression models can include these extra variables, but smoking still plays a significant causal role in influencing birth weight.

Part three says: given that the birth weight is 125 ounces, we want to predict how many cigarettes that corresponds to. We know the equation is $125 = 119.77 - 0.514 \cdot \text{cigs}$. Subtract 119.77 from both sides, and we end up with $-0.514 \cdot \text{cigs} = 5.23$ ounces. Then we divide both sides by $-0.514$, and we end up with a negative number, about $-10.18$ cigarettes. There is a meaning to that. Having a negative number of cigarettes is impossible. The reason is that the values provided, such as 119.77 ounces, represent average numbers: 119.77 is the average when you have zero cigarettes, and that is the highest average the equation gives for a non-negative number of cigarettes.

Finally, the last part of the question: non-smoking pregnant mothers represent 0.85 of all pregnant mothers in the sample. That's 85%. Looking at what just happened in part three, how can we reconcile the two? Having more smokers in the sample would make more sense in the study, since it would provide the much-needed variation in the explanatory variable, helping us get the most practical average. The intercept represents the birth weight of infants from non-smoking mothers. Increasing the share of non-smokers will shift the average birth weight of infants to be higher than is currently presented.

So once again, in this particular problem we had to identify four issues connected to birth weights. First, what is the weight in ounces when the number of cigarettes is 0? It is 119.77. And what is the weight in ounces when the number of cigarettes is 20? It is 109.49. Second, is the relationship causal? Absolutely, because mothers will usually smoke before they get pregnant, so once they get pregnant, the habit of smoking affects birth weights, giving a lower birth weight than in the general population of non-smokers. Even if you have a multiple regression model including all the other variables, you'll still see that smoking has a strong effect. In part three, we ended up finding a negative number of cigarettes, which is not possible, since you can never have a negative number of cigarettes. And the reason we get that has to do with the proportion of non-smokers, which is 0.85. The question is whether this is helpful in the problem. In the sample, we would obviously want to have more smokers; that provides more variation and gives us more accurate results. With mostly non-smokers, the intercept takes a larger value. I hope you enjoyed the problem. Feel free to send any questions or comments, and have a wonderful day.
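
The plug-in and solve-for-x steps described in the transcript can be checked with a short Python sketch. It assumes only the fitted line quoted in the video, predicted birth weight = 119.77 - 0.514 × cigs (in ounces); the function names `predict_birth_weight` and `cigs_for_birth_weight` are hypothetical and chosen for illustration.

```python
# Fitted simple regression line quoted in the transcript (birth weight in ounces).
INTERCEPT = 119.77
SLOPE = -0.514  # change in predicted ounces per additional cigarette per day


def predict_birth_weight(cigs):
    """Predicted birth weight (oz) for a given average cigarettes per day."""
    return INTERCEPT + SLOPE * cigs


def cigs_for_birth_weight(bwght):
    """Invert the line: cigarettes per day that would predict this weight."""
    return (bwght - INTERCEPT) / SLOPE


pred_0 = predict_birth_weight(0)        # zero cigarettes: the intercept itself
pred_20 = predict_birth_weight(20)      # 20 cigarettes per day
cigs_125 = cigs_for_birth_weight(125)   # weight above the intercept

print(f"cigs = 0  -> {pred_0:.2f} oz")
print(f"cigs = 20 -> {pred_20:.2f} oz")
print(f"125 oz    -> {cigs_125:.2f} cigarettes per day")
```

Any target weight above the intercept forces a negative result here, because the slope is negative. That is why a predicted weight of 125 ounces cannot be matched by a realistic (non-negative) number of cigarettes.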

