The following is a dataframe of readings at 0.5-second intervals from 9 sensors (the sensors are arranged as a 3 x 3 array). The readings in yellow need to be cleaned.
Added by Kim B.
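The question does not say how the highlighted readings should be cleaned, so the following is only a minimal sketch under stated assumptions: each time step is held as a 3 x 3 list of floats, the highlighted cells are supplied as a set of (row, column) positions, and a flagged reading is replaced by the mean of its unflagged neighbouring sensors. The name `clean_frame`, the sample values, and the neighbour-mean strategy are all hypothetical; interpolating each sensor over time would be an equally reasonable choice.

```python
from statistics import mean


def clean_frame(frame, flagged):
    """Return a copy of one 3 x 3 frame with flagged cells replaced.

    Assumption (not from the question): a flagged reading is replaced by
    the mean of its unflagged 8-connected neighbours in the sensor array.
    """
    rows, cols = len(frame), len(frame[0])
    cleaned = [row[:] for row in frame]  # leave the input untouched
    for r, c in flagged:
        neighbours = [
            frame[nr][nc]
            for nr in range(max(0, r - 1), min(rows, r + 2))
            for nc in range(max(0, c - 1), min(cols, c + 2))
            if (nr, nc) != (r, c) and (nr, nc) not in flagged
        ]
        if neighbours:  # keep the original value if no valid neighbour exists
            cleaned[r][c] = mean(neighbours)
    return cleaned


# Made-up readings for a single time step; the centre sensor is flagged.
frame = [
    [1.0, 2.0, 3.0],
    [4.0, 99.0, 6.0],
    [7.0, 8.0, 9.0],
]
cleaned = clean_frame(frame, flagged={(1, 1)})
```

Applying the function to each 0.5-second row of the dataframe in turn would clean the whole series, provided the flagged positions are known for every time step.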
Recommended Videos
Can any of the following data points be removed using the Q-test at 90% confidence? 4.2, 5.3, 5.5, 5.0, 4.9, 5.2
Adi S.
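One way to check the Q-test question above is to compute Dixon's Q for the smallest and largest readings and compare each with the tabulated critical value. The sketch below assumes the commonly tabulated 90% critical values for n = 3 to 10 (confirm them against the table in your own textbook); the names `Q_CRIT_90` and `q_test` are illustrative, not part of any library.

```python
# Commonly tabulated 90% critical values for Dixon's Q (assumed table).
Q_CRIT_90 = {3: 0.941, 4: 0.765, 5: 0.642, 6: 0.560,
             7: 0.507, 8: 0.468, 9: 0.437, 10: 0.412}


def q_test(values, table=Q_CRIT_90):
    """Compute Q = gap / range for the lowest and highest readings."""
    data = sorted(values)
    n = len(data)
    if n not in table:
        raise ValueError("no critical value tabulated for this sample size")
    spread = data[-1] - data[0]
    if spread == 0:
        raise ValueError("all readings are identical")
    q_low = (data[1] - data[0]) / spread      # suspect: smallest reading
    q_high = (data[-1] - data[-2]) / spread   # suspect: largest reading
    q_crit = table[n]
    return {
        "q_low": q_low,
        "q_high": q_high,
        "q_crit": q_crit,
        "reject_low": q_low > q_crit,
        "reject_high": q_high > q_crit,
    }


result = q_test([4.2, 5.3, 5.5, 5.0, 4.9, 5.2])
```

For these six readings the range is 1.3. The smallest value, 4.2, gives Q = 0.7 / 1.3 ≈ 0.54, and the largest, 5.5, gives Q = 0.2 / 1.3 ≈ 0.15. Both fall below the assumed critical value of 0.560, so under this table neither point would be rejected.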
Refer to the SENIC data set in Appendix C.1 and Project 9.25. The regression model containing age, routine chest X-ray ratio, and average daily census in first-order terms is to be evaluated in detail based on the model-building data set.
a. Obtain the residuals and plot them separately against $\hat{Y}$, each of the predictor variables in the model, and each of the related cross-product terms. On the basis of these plots, should any modifications of the model be made?
b. Prepare a normal probability plot of the residuals. Also obtain the coefficient of correlation between the ordered residuals and their expected values under normality. Test the reasonableness of the normality assumption, using Table B.6 and $\alpha = .05$. What do you conclude?
c. Obtain the scatter plot matrix, the correlation matrix of the $X$ variables, and the variance inflation factors. Are there any indications that serious multicollinearity problems are present? Explain.
d. Obtain the studentized deleted residuals and prepare a dot plot of these residuals. Are any outliers present? Use the Bonferroni outlier test procedure with $\alpha = .01$. State the decision rule and conclusion.
e. Obtain the diagonal elements of the hat matrix. Using the rule of thumb in the text, identify any outlying $X$ observations.
f. Cases 62, 75, 106, and 112 are moderately outlying with respect to their $X$ values, and case 87 is reasonably far outlying with respect to its $Y$ value. Obtain DFFITS, DFBETAS, and Cook's distance values for these cases to assess their influence. What do you conclude?
Sri K.
Data on air pollution were collected from 41 U.S. cities. The type of air pollution under study was the annual mean concentration of sulfur dioxide. The values of six explanatory variables were also recorded. The variables in the data are as follows:
y: the annual mean concentration of sulfur dioxide (micrograms per cubic meter)
x1: average annual temperature in °F
x2: number of manufacturing enterprises employing 20 or more workers
x3: population size (thousands)
x4: average annual wind speed (mph)
x5: average annual precipitation (inches)
x6: average number of days with precipitation per year
a. Do the residuals appear to have a normal distribution? Justify your answer.
b. Does the condition of constant variance appear to be satisfied? Justify your answer.
c. Find an appropriate transformation of Y so that the assumptions for regression will be satisfied. Find the "best" model using the transformed Y and the backward variable selection method.
City    y    x1     x2     x3    x4     x5     x6
   1   10   70.3    213    582   6.0    7.05    36
   2   13   61.0     91    132   8.2   48.52   100
   3   12   56.7    453    716   8.7   20.66    67
   4   17   51.9    454    515   9.0   12.95    86
   5   56   49.1    412    158   9.0   43.37   127
   6   36   54.0     80     80   9.0   40.25   114
   7   29   57.3    434    757   9.3   38.89   111
   8   14   68.4    136    529   8.8   54.47   116
   9   10   75.5    207    335   9.0   59.80   128
  10   24   61.5    368    497   9.1   48.34   115
  11  110   50.6   3344   3369  10.4   34.44   122
  12   28   52.3    361    746   9.7   38.74   121
  13   17   49.0    104    201  11.2   30.85   103
  14    8   56.6    125    277  12.7   30.58    82
  15   30   55.6    291    593   8.3   43.11   123
  16    9   68.3    204    361   8.4   56.77   113
  17   47   55.0    625    905   9.6   41.31   111
  18   35   49.9   1064   1513  10.1   30.96   129
  19   29   43.5    699    744  10.6   25.94   137
  20   14   54.5    381    507  10.0   37.00    99
  21   56   55.9    775    622   9.5   35.89   105
  22   14   51.5    181    347  10.9   30.18    98
  23   11   56.8     46    244   8.9    7.77    58
  24   46   47.6     44    116   8.8   33.36   135
  25   11   47.1    391    463  12.4   36.11   166
  26   23   54.0    462    453   7.1   39.04   132
  27   65   49.7   1007    751  10.9   34.99   155
  28   26   51.5    266    540   8.6   37.01   134
  29   69   54.6   1692   1950   9.6   39.93   115
  30   61   50.4    347    520   9.4   36.22   147
  31   94   50.0    343    179  10.6   42.75   125
  32   10   61.6    337    624   9.2   49.10   105
  33   18   59.4    275    448   7.9   46.00   119
  34    9   66.2    641    844  10.9   35.94    78
  35   10   68.9    721   1233  10.8   48.19   103
  36   28   51.0    137    176   8.7   15.17    89
  37   31   59.3     96    308  10.6   44.68   116
  38   26   57.8    197    299   7.6   42.59   115
  39   29   51.1    379    531   9.4   38.79   164
  40   31   55.2     35     71   6.5   40.75   148
  41   16   45.7    569    717  11.8   29.07   123
Dominador T.
Recommended Textbooks
Computer Science and Information Technology
Introduction to Programming Using Python
Computer Science - An Overview