Because distributions of variables such as $\mathrm{PCB}$, the $\mathrm{PCB}$ congeners, and TEQ tend to be skewed, researchers frequently analyze the logarithms of the measured variables. Create a data set that contains the logarithm of each variable in the $\mathrm{PCB}$ data file. Note that zero is a possible value for $\mathrm{PCB} 126$, and the logarithm of zero is undefined; most software packages will eliminate these cases when you request a log transformation.
(a) If you do not do anything about the 16 zero values of $\mathrm{PCB} 126$, what does your software do with these cases? Is there an error message of some kind?
(b) If you attempt to run a regression to predict the log of $\mathrm{PCB}$ using the log of $\mathrm{PCB} 126$ and the log of $\mathrm{PCB} 52$, are the cases with the zero values of $\mathrm{PCB} 126$ eliminated? Do you think that this is a good way to handle this situation?
(c) The smallest nonzero value of $\mathrm{PCB} 126$ is $0.0052$. One common practice when taking logarithms of measured values is to replace the zeros by one-half of the smallest observed value. Create a logarithm data set using this procedure; that is, replace the 16 zero values of $\mathrm{PCB} 126$ by $0.0026$ before taking logarithms. Use numerical and graphical summaries to describe the distributions of the log variables.
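The zero-handling issue in parts (a) through (c) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the list `pcb126` holds made-up toy values rather than the actual data file, and the helper names `safe_log` and `log_with_half_min` are hypothetical. In plain Python, `math.log(0)` raises a `ValueError`; other packages behave differently (some return a missing value or negative infinity with a warning), so check what your own software does.

```python
import math

def safe_log(x):
    """Return the natural log of x, or None when the log is undefined."""
    try:
        return math.log(x)
    except ValueError:
        # math.log raises ValueError ("math domain error") for x <= 0
        return None

def log_with_half_min(values):
    """Replace zeros by one-half the smallest nonzero value, then take logs."""
    smallest_nonzero = min(v for v in values if v > 0)
    fill = smallest_nonzero / 2
    return [math.log(v if v > 0 else fill) for v in values]

# Toy values (an assumption, NOT the real PCB data). The value 0.0052 is the
# smallest nonzero PCB126 measurement quoted in part (c).
pcb126 = [0.0, 0.0052, 0.0104, 0.0, 0.052]

# Naive transformation: the zero cases end up with no usable log value.
naive = [safe_log(v) for v in pcb126]
n_missing = sum(v is None for v in naive)

# Substitution from part (c): zeros become 0.0052 / 2 = 0.0026 before logging.
adjusted = log_with_half_min(pcb126)

print("cases lost without substitution:", n_missing)
print("log values after substitution:", [round(a, 3) for a in adjusted])
```

With the substitution, every case keeps a finite log value, so no observations would be dropped from a later regression. The price is that all of the former zeros share a single, somewhat arbitrary value, which is worth keeping in mind when you look at the graphical summaries.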