An Introduction to Statistical Methods and Data Analysis

R. Lyman Ott, Michael Longnecker

Chapter 19

Analysis of Variance for Some Unbalanced Designs

Chapter Questions

Problem 1

In Exercise 15.1, we described an experiment in which a horticulturist was investigating the effectiveness of five methods for the irrigation of blueberry shrubs. The methods are surface, trickle, center pivot, lateral move, and subirrigation. There are 10 blueberry farms available for the study representing a wide variety of types of soils, terrains, and wind gradients. The horticulturist wants to use each of the five methods of irrigation on all 10 farms to moderate the effect of the many extraneous sources of variation that may impact the blueberry yields. Each farm is divided into five plots, and the response variable will be the weight of the harvested fruit from each plot of blueberry shrubs. During the study, a problem occurred on the plot irrigated using the surface method on farm 1, and no yield was obtained. The yields in pounds of blueberries over a growing season are given here.
a. Estimate the yield value for the missing plot.
b. Analyze the data by replacing the missing value with the estimate obtained in part (a), and then perform an analysis of variance using the formulas for a randomized block design with no missing observations.
c. Is there a significant difference in the mean yields for the different methods of irrigation? Use $\alpha=0.05$.
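
For part (a), a minimal Python sketch of the usual single-missing-value estimate for a randomized block design, $\hat{y} = \dfrac{tT + bB - G}{(t-1)(b-1)}$, where $T$ and $B$ are the totals of the observed values in the affected treatment and block and $G$ is the total of all observed values. The yield table is not reproduced on this page, so the 3-by-3 layout below is hypothetical, and the function name `estimate_missing_rbd` is an illustration only.

```python
def estimate_missing_rbd(table):
    """Estimate one missing cell in a treatments-by-blocks table.

    table[i][j] is the response for treatment i in block j; the single
    missing cell is marked with None.
    """
    t = len(table)       # number of treatments
    b = len(table[0])    # number of blocks
    missing = [(i, j) for i in range(t) for j in range(b) if table[i][j] is None]
    if len(missing) != 1:
        raise ValueError("this formula handles exactly one missing cell")
    mi, mj = missing[0]

    # Totals are taken over the observed values only.
    T = sum(y for y in table[mi] if y is not None)             # affected treatment
    B = sum(table[i][mj] for i in range(t) if i != mi)         # affected block
    G = sum(y for row in table for y in row if y is not None)  # grand total

    return (t * T + b * B - G) / ((t - 1) * (b - 1))


# Hypothetical 3-treatment, 3-block layout (NOT the blueberry data).
toy = [
    [None, 10, 12],
    [8, 9, 11],
    [6, 7, 9],
]
estimate = estimate_missing_rbd(toy)
print(estimate)
```

For the blueberry study, $t=5$ and $b=10$. After the estimate is substituted, the error degrees of freedom drop by one, to $(t-1)(b-1)-1=35$.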

Victor Salazar
Numerade Educator

Problem 2

Refer to Exercise 19.1. Use the least significant difference criterion to identify which pairs of methods of irrigation have significantly different mean yields.
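
With one estimated value in a randomized block design, the least significant difference is commonly computed in two versions: $t_{\alpha/2}\sqrt{\mathrm{MSE}\,(2/b)}$ for pairs that do not involve the affected treatment, and $t_{\alpha/2}\sqrt{\mathrm{MSE}\,[\,2/b + t/(b(b-1)(t-1))\,]}$ for pairs that do. The sketch below evaluates both. Only $t=5$ and $b=10$ come from the exercise; the `mse` and `t_crit` arguments are placeholders, to be replaced by the mean square error from the AOV table and a tabled $t$ value for 35 error degrees of freedom.

```python
import math


def lsd_rbd_one_missing(mse, t, b, t_crit):
    """Return (LSD for pairs without the affected treatment,
               LSD for pairs that include the affected treatment)."""
    lsd_complete = t_crit * math.sqrt(mse * 2 / b)
    lsd_missing = t_crit * math.sqrt(mse * (2 / b + t / (b * (b - 1) * (t - 1))))
    return lsd_complete, lsd_missing


# mse and t_crit are hypothetical placeholders, not results from the exercise.
lsd_complete, lsd_missing = lsd_rbd_one_missing(mse=4.0, t=5, b=10, t_crit=2.0)
print(round(lsd_complete, 3), round(lsd_missing, 3))
```

The second value is always the larger of the two, which reflects the extra uncertainty in the mean of the treatment that lost an observation.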

Problem 3

Refer to Exercise 19.1. Obtain the sums of squares for an AOV table by fitting complete and reduced models using a statistical software program. Compare your results with those in Exercise 19.1.
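
The complete-and-reduced-models idea can be sketched without a statistics package: fit the model with treatments and blocks, fit the model with blocks only, and take the difference in error sums of squares as the adjusted treatment sum of squares. The pure-Python least-squares helper, the reference-cell coding, and the eight records below are all assumptions made for illustration; they are not the blueberry data and not the output of any particular software program.

```python
def solve(a, rhs):
    """Solve a square linear system by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [r] for row, r in zip(a, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def sse(rows, y):
    """Residual sum of squares from a least-squares fit via the normal equations."""
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    beta = solve(xtx, xty)
    return sum((yi - sum(bj * xj for bj, xj in zip(beta, r))) ** 2
               for r, yi in zip(rows, y))


def dummies(level, n_levels):
    """Reference-cell coding: level 0 is the baseline."""
    return [1.0 if level == k else 0.0 for k in range(1, n_levels)]


# Hypothetical (treatment, block, response) records; cell (0, 0) is missing.
data = [(0, 1, 10.5), (0, 2, 12.0),
        (1, 0, 8.0), (1, 1, 9.0), (1, 2, 11.5),
        (2, 0, 6.0), (2, 1, 7.5), (2, 2, 9.0)]
t, b = 3, 3
y = [obs[2] for obs in data]

full = [[1.0] + dummies(trt, t) + dummies(blk, b) for trt, blk, _ in data]
reduced = [[1.0] + dummies(blk, b) for _, blk, _ in data]

sse_full = sse(full, y)
sse_reduced = sse(reduced, y)
ss_treatment_adj = sse_reduced - sse_full          # treatments adjusted for blocks
df_error = len(data) - (1 + (t - 1) + (b - 1))     # observations minus parameters
print(ss_treatment_adj, sse_full, df_error)
```

Because no value is imputed, the error degrees of freedom come directly from the number of observed values, and the adjusted treatment sum of squares is not inflated the way the substituted-value analysis can be.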

Problem 4

The business office of a large university is in the process of selecting amongst the Postal Service and three private couriers as its sole delivery method for the university's responses to applications for admission. After consulting with the university's statistics department, it was decided that over the next month the following study would be conducted. Ten cities with at least 100 applicants would be selected for inclusion in the study. To each of these cities 100 standard packages would be sent by each of the four methods of delivery. The percentage of packages not delivered within 5 days was recorded for each method of delivery, yielding the following data. For four of the cities, at least one of the methods of delivery did not provide service, and, hence, there are missing data in these cells.
a. Obtain the sums of squares for an AOV table by fitting complete and reduced models using a statistical software program.
b. Is there significant evidence of a difference in the four methods of delivery based on the percentage of packages delivered within 5 days?

Victor Salazar
Numerade Educator

Problem 5

Refer to Exercise 19.4. Use the Tukey-Kramer $W$ procedure to identify which pairs of methods of delivery have significantly different mean percentages.
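
A short sketch of the Tukey-Kramer comparison, assuming the usual form $W_{ij} = q_{\alpha}(t,\nu)\sqrt{\tfrac{\mathrm{MSE}}{2}\left(\tfrac{1}{n_i}+\tfrac{1}{n_j}\right)}$. The means, sample sizes, `mse`, and `q_crit` below are hypothetical placeholders, since the delivery data are not reproduced on this page; in practice the means would be the estimated treatment means from the fitted model and `q_crit` would come from a Studentized range table.

```python
import math
from itertools import combinations


def tukey_kramer(means, counts, mse, q_crit):
    """Return (label_a, label_b, |difference|, W, significant) for every pair."""
    results = []
    for a, b in combinations(sorted(means), 2):
        w = q_crit * math.sqrt(mse / 2 * (1 / counts[a] + 1 / counts[b]))
        diff = abs(means[a] - means[b])
        results.append((a, b, diff, w, diff > w))
    return results


# Hypothetical inputs (NOT the delivery-method data).
means = {"M1": 12.0, "M2": 15.0, "M3": 20.0}
counts = {"M1": 10, "M2": 8, "M3": 9}
comparisons = tukey_kramer(means, counts, mse=9.0, q_crit=3.5)
for a, b, diff, w, significant in comparisons:
    print(a, b, round(diff, 2), round(w, 2), significant)
```

Each pair gets its own $W_{ij}$ because the sample sizes differ, which is the point of the Kramer adjustment when cells are missing.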

Manik Pulyani
Numerade Educator

Problem 6

Carbon monoxide (CO) emissions from automobiles can be influenced by the formulation of the gasoline that is used. Oxygenated fuels are used in northern states during the winter to decrease CO emissions. There are eight gasoline blends that are of interest to the researchers (B1-B8). Each of the eight blends will be placed in a car that will then be driven over a 50-mile route during which the total amount of CO emissions will be measured. There are large car-to-car differences in CO emissions, and there are large route-to-route differences in city driving (stop-and-go driving on city streets versus a freeway route). The researchers have eight cars and eight routes available to study the eight blends, with every blend observed in all eight cars, which will be driven over all eight routes. The following table contains the amount of CO emissions (grams) per mile by each vehicle, route, and blend. During the study, the device used to measure CO emissions failed to function properly when vehicle V7 was driven over route R3 using blend B1.
The research goal is to determine how the different blends impact the mean CO readings.
a. Estimate the amount of CO emissions for vehicle V7 while driving over route R3 using blend B1.
b. Analyze the data by replacing the missing value with the estimate obtained in part (a), and then perform an analysis of variance using the formulas for a Latin square design with no missing observations.
c. Is there a significant difference in the mean CO emissions for the different blends? Use $\alpha=.05$.
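
For part (a), a sketch of the standard single-missing-value estimate for a $t \times t$ Latin square, $\hat{y} = \dfrac{t(R + C + T) - 2G}{(t-1)(t-2)}$, where $R$, $C$, and $T$ are the totals of the observed values in the affected row, column, and treatment and $G$ is the total of all observed values. The emissions table is not reproduced here, so the 3-by-3 square and the function name below are hypothetical.

```python
def estimate_missing_latin_square(y, trt):
    """Estimate one missing cell (marked None) in a t-by-t Latin square.

    y[i][j] is the response in row i, column j; trt[i][j] is the treatment
    label assigned to that cell.
    """
    t = len(y)
    missing = [(i, j) for i in range(t) for j in range(t) if y[i][j] is None]
    if len(missing) != 1:
        raise ValueError("this formula handles exactly one missing cell")
    mi, mj = missing[0]
    label = trt[mi][mj]

    # Totals are taken over the observed values only.
    R = sum(v for v in y[mi] if v is not None)
    C = sum(y[i][mj] for i in range(t) if i != mi)
    T = sum(y[i][j] for i in range(t) for j in range(t)
            if trt[i][j] == label and y[i][j] is not None)
    G = sum(v for row in y for v in row if v is not None)

    return (t * (R + C + T) - 2 * G) / ((t - 1) * (t - 2))


# Hypothetical 3x3 square (NOT the CO emissions data).
layout = [["A", "B", "C"],
          ["B", "C", "A"],
          ["C", "A", "B"]]
responses = [[None, 15, 20],
             [14, 19, 15],
             [18, 14, 19]]
estimate = estimate_missing_latin_square(responses, layout)
print(estimate)
```

In the exercise, rows and columns would be vehicles and routes with $t=8$. After substitution, the error degrees of freedom become $(t-1)(t-2)-1=41$.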

Sheryl Ezze
Numerade Educator

Problem 7

Refer to Exercise 19.6. Use the Tukey-Kramer W to identify which pairs of blends have significantly different mean CO emissions.

Sheryl Ezze
Numerade Educator

Problem 8

Refer to Exercise 19.6. Obtain the sums of squares for an AOV table by fitting complete and reduced models using a statistical software program. Compare your results with those in Exercise 19.7.

Problem 9

Refer to Exercise 19.6. Suppose upon examining the data logs from the study the researchers determined that the CO emissions monitoring device was probably not functioning properly for the following two data values: vehicle V7 on route R4 using blend B2, $y_{743}$, and vehicle V6 on route R4 using blend B1. Reanalyze the data after deleting these two values. Do your conclusions about the differences in the eight blends change?

Problem 10

Refer to Exercise 19.9.
a. Identify vehicle and route as fixed or random effects.
b. How would you test for a significant effect due to vehicle?
c. How would you test for a significant effect due to route?

Victor Salazar
Numerade Educator

Problem 11

A horticulturist is interested in examining the yield potential of three new varieties of asparagus. She designed a study to evaluate the three new varieties relative to the standard variety. There were 16 plots available on a large test field for the study, but the plots were not homogeneous in that there was a distinct sloping from north to south throughout the field. Also, a soil analysis revealed a discernible nitrogen gradient, which ran from west to east across the field. Therefore, the horticulturist decided to assign the varieties V1, V2, V3, and V4, with V1 being the standard variety, to the plots in a Latin square arrangement. The values for marketable yield per plot (in kg/ha) are given in the following table. Note that there is a missing yield for variety V4 in row 4 and column 1. This was due to a problem that occurred during one of the harvesting periods.
a. Estimate the amount of marketable yield for variety V4 planted in a plot with nitrogen level N4 and slope S1.
b. Analyze the data by replacing the missing value with the estimate obtained in part (a), and then perform an analysis of variance using the formulas for a Latin square design with no missing observations.
c. Is there a significant difference in the mean marketable yields for the four varieties? Use $\alpha=0.05$.

Raymond Matshanda
Numerade Educator

Problem 12

Refer to Exercise 19.11. Use the Tukey-Kramer $W$ to identify which pairs of varieties have significantly different mean marketable yields.

Problem 13

Refer to Exercise 19.11. Obtain the sums of squares for an AOV table by fitting complete and reduced models using a statistical software program. Compare your results with those in Exercise 19.12.

Problem 14

Refer to Exercise 19.11.
a. Identify nitrogen level and slope level as either fixed or random effects.
b. How would you test for a significant difference in the mean marketable yields due to differences in nitrogen levels?
c. How would you test for a significant difference in the mean marketable yields due to differences in the amount of slope in the plots?

Problem 15

An incomplete block design consisted of five blocks (B1, B2, B3, B4, and B5) and five treatments (T1, T2, T3, T4, and T5). The treatments were randomly assigned to the blocks in the following manner.
a. What are the values of the design parameters: $t, k, b$, and $r$ ?
b. What is the value of $\lambda$ for this design?
c. Is the incomplete block design balanced? Justify your answer.
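
The checks in parts (a) through (c) can be sketched in a few lines of Python: count the treatments ($t$), blocks ($b$), block sizes ($k$), and replications ($r$), then count how often each pair of treatments shares a block. The design is balanced when every block has the same size, every treatment has the same number of replications, and every pair occurs together the same number of times ($\lambda$). The assignment table is not reproduced on this page, so both layouts below are hypothetical, as is the function name.

```python
from collections import Counter
from itertools import combinations


def describe_incomplete_block_design(blocks):
    """Summarize a design given as a list of blocks, each a list of treatment labels."""
    treatments = sorted({trt for block in blocks for trt in block})
    sizes = {len(block) for block in blocks}
    reps = Counter(trt for block in blocks for trt in block)
    together = Counter(frozenset(pair)
                       for block in blocks
                       for pair in combinations(sorted(block), 2))
    # Include pairs that never share a block (count 0).
    pair_counts = {together.get(frozenset(pair), 0)
                   for pair in combinations(treatments, 2)}
    return {
        "t": len(treatments),
        "b": len(blocks),
        "k": sizes,
        "r": set(reps.values()),
        "lambda": pair_counts,
        "balanced": len(sizes) == 1 and len(set(reps.values())) == 1
                    and len(pair_counts) == 1,
    }


# Two hypothetical layouts (NOT the designs from the exercises).
balanced_design = [["T1", "T2", "T3"], ["T1", "T2", "T4"],
                   ["T1", "T3", "T4"], ["T2", "T3", "T4"]]
cyclic_design = [["T1", "T2"], ["T2", "T3"], ["T3", "T4"],
                 ["T4", "T5"], ["T5", "T1"]]

summary_balanced = describe_incomplete_block_design(balanced_design)
summary_cyclic = describe_incomplete_block_design(cyclic_design)
print(summary_balanced)
print(summary_cyclic)
```

In the second layout every treatment appears twice and every block has two plots, yet some pairs share a block once and others never do, so equal replication alone does not make a design balanced.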

Problem 16

An incomplete block design consisted of six blocks (B1, B2, B3, B4, B5, and B6) and six treatments (T1, T2, T3, T4, T5, and T6). The treatments were randomly assigned to the blocks in the following manner.
a. What are the values of the design parameters: $t, k, b$, and $r$ ?
b. What is the value of $\lambda$ for this design?
c. Is the incomplete block design balanced? Justify your answer.

Problem 17

A study of the difference in the effects of six newly created diets on the weight gain of young rabbits is proposed. Because weight varies considerably amongst young rabbits, it is proposed to block the experiment based on litters. There are 10 litters of rabbits available for the study, but they are of varying sizes. The minimum litter size is three. Therefore, only three of the six diets can be observed in any particular litter. A balanced incomplete block design was proposed for this situation. The researcher conducted the study and obtained the following weight gains.
Do the data provide significant evidence of a difference in mean weight gains amongst the six diets? Use the formulas given in Section 19.4 to obtain your answers.
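
With $t=6$ diets, $b=10$ litters, and $k=3$ rabbits per litter, the counting identities give $r = bk/t = 5$ and $\lambda = r(k-1)/(t-1) = 2$. One common form of the adjusted treatment sum of squares for a balanced incomplete block design is $\mathrm{SST}_{\mathrm{adj}} = \dfrac{k\sum_i Q_i^2}{\lambda t}$ with $Q_i = T_i - B_{(i)}/k$, where $T_i$ is the total for treatment $i$ and $B_{(i)}$ is the sum of the totals of the blocks containing treatment $i$. The sketch below assumes that form, which may be written differently in Section 19.4, and uses a hypothetical three-treatment design because the weight-gain table is not reproduced on this page.

```python
def bibd_adjusted_treatment_ss(records, t, k, lam):
    """Adjusted treatment SS and effect estimates for a balanced incomplete block design.

    records is a list of (treatment, block, response) triples.
    """
    trt_total = {}
    blk_total = {}
    for trt, blk, y in records:
        trt_total[trt] = trt_total.get(trt, 0.0) + y
        blk_total[blk] = blk_total.get(blk, 0.0) + y

    # B_(i): sum of the totals of the blocks in which treatment i appears.
    b_with = {}
    for trt, blk, _ in records:
        b_with[trt] = b_with.get(trt, 0.0) + blk_total[blk]

    q = {trt: trt_total[trt] - b_with[trt] / k for trt in trt_total}
    ss_adj = k * sum(v * v for v in q.values()) / (lam * t)
    effects = {trt: k * v / (lam * t) for trt, v in q.items()}
    return ss_adj, effects


# Hypothetical design with t=3, k=2, b=3, r=2, lambda=1 (NOT the rabbit data).
toy_records = [("D1", 1, 1.0), ("D2", 1, 2.0),
               ("D1", 2, 1.0), ("D3", 2, 3.0),
               ("D2", 3, 2.0), ("D3", 3, 3.0)]
ss_adj, effects = bibd_adjusted_treatment_ss(toy_records, t=3, k=2, lam=1)
print(ss_adj, effects)
```

The $F$ statistic for diets would then be $\mathrm{SST}_{\mathrm{adj}}/(t-1)$ divided by the mean square error, with error degrees of freedom $n - t - b + 1$ for $n$ observations.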

Sheryl Ezze
Numerade Educator

Problem 18

Refer to Exercise 19.17. Use the Tukey-Kramer $W$ to determine which pairs of diets have significantly different mean weight gains.

Jeremiah Mbaria
Numerade Educator

Problem 19

Refer to Exercise 19.17. Analyze the data using a computer program. Is the analysis of variance table from the output of the computer program the same as your results in Exercise 19.18?

Problem 20

Refer to Exercise 19.17. Test for a significant effect due to litter.

Problem 21

A petroleum company was interested in comparing the miles per gallon achieved by four different gasoline blends (I, II, III, and IV). Because there can be considerable variability due to differences in drivers and car models, these two extraneous sources of variability were included as blocking variables in the following Latin square design. Each driver drove each car model over a standard course with the assigned gasoline blend. However, when driver 3 was operating car model 4 using blend II gasoline, there was a malfunction of the car's carburetor that invalidated the data. This malfunction was not discovered until well after the completion of the study, and, hence, the data could not be replaced. The miles per gallon data are given here.

Problem 22

Use the method of fitting complete and reduced models to obtain an analysis of variance for the data in Exercise 19.21.

Victor Salazar
Numerade Educator

Problem 23

A physician was interested in comparing the effects of six different antihistamines in persons extremely sensitive to antihistamine injections. To do this, a random sample of 10 allergy patients was selected from the physician's private practice, with treatments (antihistamines) assigned to each patient according to the experimental design shown in the following table. Each person then received injections of the assigned antihistamines in different sections of the right arm. The area of redness surrounding the point of injection was measured after a fixed period of time. The data are shown in the table.
a. Identify the design.
b. Identify the characteristics of the design.
c. Run an analysis of variance. Use $\alpha=.05$.

Harsh Gadhiya
Numerade Educator

Problem 24

Refer to Exercise 19.23. Use the Tukey-Kramer W for determining treatment differences, with $\alpha=.05$.

Manik Pulyani
Numerade Educator

Problem 25

The marketing research group of a corporation examined the public response to the introduction of a new TV game module by comparing weekly sales volumes (in thousands of dollars) for three different store chains in each of four geographic locations.
a. Write an appropriate model (including an effect for weeks) and the sources of variability in an analysis of variance table.
b. How would your model change if we analyze the total 2-week sales data?
c. Run an analysis of variance on the 2-week sales data using formulas from Chapter 15. Use $\alpha=.05$.

Problem 26

Refer to Exercise 19.25. Use the Tukey-Kramer $W$ procedure to compare the different geographic areas by chain means. Use $\alpha=.05$.

Manik Pulyani
Numerade Educator

Problem 27

Refer to Exercise 19.26. Suppose that the week 1 data were not available in the north and east for chain 1, due to logistics problems that slowed the introduction of the product by a week.
a. Write an appropriate model.
b. Suggest a method for analyzing the data using available software.
c. Write model(s) for the procedure described in part (b).

Problem 28

A foreign automobile manufacturer is spending hundreds of millions of dollars to construct a large manufacturing plant (about 70 acres under one roof) here in the United States. One of its objectives is to produce cars of high quality in the United States using U.S. workers. One part of the massive orientation program for new employees is to send about $20 \%$ of them to the home country for additional training. One measure of the worth of this additional training is whether the product quality is better on assembly lines where $20 \%$ of the employees have had the homeland orientation and have been able to share it with their fellow employees. Data from six assembly lines (three with the additional orientation) are shown here. To measure defects, two different inspectors examined each of two cars chosen at random from each of the assembly lines. Use these data to answer the following questions.
a. Suggest an appropriate dependent variable.
b. Write a model for this experimental situation, and identify all terms.
c. Fill out the sources and degrees of freedom for an AOV table.

Problem 29

Refer to the conditions of Exercise 19.28.
a. Suggest a method to analyze these data.
b. Does the training produce fewer defects?
c. Can you suggest any plots that might be helpful in interpreting the data?

Problem 30

Refer to Exercise 19.28. Suppose that inspector 2 was unable to evaluate the second car from assembly line 4 and that inspector 1 missed car 1 from assembly line 3.
a. Does the model change?
b. Suggest a method for analyzing the data.

Problem 31

The state real estate commission is mandated to provide an examination that ensures a person passing the exam will have a minimum level of competence. This provides protection for the members of the public in their dealing with real estate firms. The state regulatory agency is responsible for establishing the acceptable level of safe practice and for determining whether an individual meets that standard. The real estate board has received several complaints about the grading of the essay questions on the exams. The board's staff designs a study to evaluate their current testing procedure by evaluating the differences in the grading of the essay questions on the real estate exam. The study included 25 real estate exam graders and a random sample of 30 exams taken during the past year. Because the grading of the exams is very time consuming, each grader was assigned 6 exams to score, with the scores given in the following table. The number in parentheses is the identifier for the grader.
a. Describe by name the type of design used. Verify that the structural conditions of your selected design are satisfied in this study.
b. Is there a difference in the average scores of the graders? Justify your answer at the $\alpha=.05$ level.
c. Was it necessary to include the exam factor in the design and subsequent analysis of the data?
d. Using the residuals, do there appear to be any violations in the conditions needed to run tests of hypotheses in the analysis of variance?
e. Do you think that the board should be concerned with the differences in the graders' evaluations of the exams if a difference of four units in their scores is deemed to be an important difference?
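
For the structural check in part (a), the counting identities $rt = bk$ and $\lambda = r(k-1)/(t-1)$ can be evaluated from the numbers stated in the exercise. The sketch below assumes graders play the role of treatments ($t=25$, $r=6$) and exams the role of blocks ($b=30$); that assignment of roles and the function name are assumptions for illustration. Integer values of $k$ and $\lambda$ are necessary for balance but not sufficient, so the actual layout in the table still has to be inspected.

```python
from fractions import Fraction


def bibd_counts(t, b, r):
    """Return (k, lambda, feasible) implied by t treatments, b blocks, r replications."""
    k = Fraction(t * r, b)               # block size from r*t = b*k
    lam = r * (k - 1) / (t - 1)          # pairs-per-block count from r(k-1) = lambda(t-1)
    feasible = k.denominator == 1 and lam.denominator == 1 and k < t
    return k, lam, feasible


# Counts taken from the exercise statement: 25 graders, 30 exams, 6 exams per grader.
k, lam, feasible = bibd_counts(t=25, b=30, r=6)
print(k, lam, feasible)

# A hypothetical set of counts that cannot be balanced (k is not an integer).
k_bad, lam_bad, feasible_bad = bibd_counts(t=6, b=10, r=4)
print(k_bad, lam_bad, feasible_bad)
```

Exact fractions are used so that a non-integer $k$ or $\lambda$ shows up directly instead of being hidden by floating-point rounding.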

Tyler Moulton
Numerade Educator

Problem 32

Functionalized styrenes are extremely useful building blocks for organic synthesis and for functional polymers. One of the most general syntheses of styrenes involves the combination of an aryl halide with a vinyl organometallic reagent under catalysis by palladium (Pd) complexes. A study was designed to evaluate the effect of different levels of Pd (0.01, 0.05, 0.1, 0.5, and 1.0 mol%) on the yield of vinylboronic acid. The reactions take place in a high-pressure chamber at a temperature of $135^{\circ} \mathrm{C}$. There are only three pressure chambers available for a single run of the experimental conditions. The chemists were concerned about the substantial run-to-run variations in the yields produced by new setups of the experiment in the chambers. Thus, it was necessary to block on runs, but with only three chambers, it was not possible to include all five levels of Pd during each run. The yields of vinylboronic acid are given in the following table.
a. Describe by name the type of design used. Verify that the structural conditions of your selected design are satisfied in this study.
b. Is there a difference in the average yields of the five levels of palladium? Justify your answer at the $\alpha=.05$ level.
c. Was it necessary to include the runs factor in the design and subsequent analysis of the data?
d. Using the residuals, do there appear to be any violations in the conditions needed to run tests of hypotheses in the analysis of variance?
e. Do the levels of palladium appear to produce an important difference in average yields if a difference of $4 \%$ in yields is considered important?
