Geochemical data are commonly censored; that is, concentrations for some samples are reported as "less than" or "greater than" some value. Censored data hamper statistical analysis because many computational techniques require a complete set of uncensored data. We show that the simple substitution method for creating an uncensored dataset, e.g., replacement by 3/4 times the detection limit, has serious flaws, and we present an objective method to determine the replacement value. Our basic premise is that the replacement value should equal the mean of the actual values represented by the qualified data. We adapt the maximum likelihood approach (Cohen, 1961) to estimate this mean. This method reproduces the mean and skewness as well as or better than a simple substitution method using 3/4 of the lower detection limit or 4/3 of the upper detection limit. For a small proportion of "less than" substitutions, a simple-substitution replacement factor of 0.55 is preferable to 3/4; for a small proportion of "greater than" substitutions, a simple-substitution replacement factor of 1.7 is preferable to 4/3, provided the resulting replacement value does not exceed 100%. For more than 10% replacement, a mean empirical factor may be used. However, empirically determined simple-substitution replacement factors usually vary among data sets and become less reliable as the proportion of replacements grows. Therefore, a maximum likelihood method is superior in general. Theoretical and empirical analyses show that true replacement factors for "less thans" decrease in magnitude with more replacements and larger standard deviation; those for "greater thans" increase in magnitude with more replacements and larger standard deviation. In contrast to any simple substitution method, the maximum likelihood method reproduces these variations.
Using the maximum likelihood method for replacing "less thans" in our sample data set, correlation coefficients were estimated with reasonable accuracy in 90% of the cases for as much as 40% replacement and in 60% of the cases for 80% replacement. These results suggest that censored data can be utilized more than is commonly realized. © 1993 International Association for Mathematical Geology.
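The premise that the replacement value should equal the mean of the values hidden behind the "less than" qualifier can be illustrated with a short sketch. It is not the authors' implementation. It assumes a single lower detection limit and lognormally distributed concentrations, and it reaches the censored-normal maximum likelihood estimates with an EM iteration rather than with Cohen's (1961) procedure. All function names are hypothetical.

```python
import math
import random


def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    # erfc keeps precision in the lower tail
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def fit_left_censored_normal(observed, n_censored, limit,
                             tol=1e-10, max_iter=10000):
    """EM estimate of (mu, sigma) for a normal sample in which
    n_censored values are known only to lie below `limit`."""
    n = len(observed) + n_censored
    s1 = sum(observed)
    s2 = sum(v * v for v in observed)
    # Start from the moments of the uncensored part.
    mu = s1 / len(observed)
    sigma = math.sqrt(max(s2 / len(observed) - mu * mu, 1e-12))
    for _ in range(max_iter):
        # E-step: moments of a normal truncated above at `limit`.
        alpha = (limit - mu) / sigma
        lam = norm_pdf(alpha) / norm_cdf(alpha)
        m1 = mu - sigma * lam
        m2 = sigma * sigma * (1.0 - alpha * lam - lam * lam) + m1 * m1
        # M-step: update with the expected sufficient statistics.
        new_mu = (s1 + n_censored * m1) / n
        new_sigma = math.sqrt((s2 + n_censored * m2) / n - new_mu * new_mu)
        converged = abs(new_mu - mu) < tol and abs(new_sigma - sigma) < tol
        mu, sigma = new_mu, new_sigma
        if converged:
            break
    return mu, sigma


def lognormal_mean_below(limit, mu, sigma):
    """E[X | X < limit] when ln X ~ Normal(mu, sigma)."""
    a = (math.log(limit) - mu) / sigma
    return math.exp(mu + 0.5 * sigma * sigma) * norm_cdf(a - sigma) / norm_cdf(a)


def replacement_for_less_thans(values, limit):
    """Replacement value and replacement factor for "less thans".
    `values` are uncensored concentrations; anything below `limit`
    is treated as if it had been reported as "< limit"."""
    logs = [math.log(v) for v in values if v >= limit]
    n_censored = sum(1 for v in values if v < limit)
    mu, sigma = fit_left_censored_normal(logs, n_censored, math.log(limit))
    replacement = lognormal_mean_below(limit, mu, sigma)
    return replacement, replacement / limit


# Synthetic example (assumed parameters, not data from the paper).
random.seed(0)
sample = [random.lognormvariate(0.0, 1.0) for _ in range(2000)]
detection_limit = math.exp(-0.5)
value, factor = replacement_for_less_thans(sample, detection_limit)
print("replacement value:", value, "replacement factor:", factor)
```

Because the sample is synthetic, the replacement value can be compared with the actual mean of the values that fell below the limit. Rerunning with a larger `sigma` or a higher `detection_limit` gives a way to explore how the factor responds to the standard deviation and to the proportion of replacements.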
An objective replacement method for censored geochemical data