The need to integrate large quantities of digital geoscience information to classify locations as mineral deposits or nondeposits has been met by the weights-of-evidence method in many situations. Widespread selection of this method may result more from its ease of use and interpretation than from comparisons with alternative methods. A comparison of the weights-of-evidence method to probabilistic neural networks is performed here with data from Chisel Lake–Anderson Lake, Manitoba, Canada. Each method is designed to estimate the probability of belonging to learned classes, and the estimated probabilities are used to classify the unknowns. Using these data, significantly lower classification error rates were observed for the neural network, not only when test and training data were the same (0.02 versus 23%), but also when validation data, not used in any training, were used to test the efficiency of classification (0.7 versus 17%). Although these data contain too few deposits, the tests demonstrate the neural network's ability to make unbiased probability estimates and to achieve lower error rates, whether measured by number of polygons or by area of land misclassified. For both methods, independent validation tests are required to ensure that estimates are representative of real-world results. Results from the weights-of-evidence method demonstrate a strong bias in which most errors are barren areas misclassified as deposits. The weights-of-evidence method is based on Bayes' rule, which requires independent variables in order to make unbiased estimates. The chi-square test for independence indicates no significant correlations among the variables in the Chisel Lake–Anderson Lake data. However, the expected number of deposits test clearly demonstrates that these data violate the independence assumption.
Other, independent simulations with three variables show that correlations of 1.0 among the variables can double the expected number of deposits, as can correlations of −1.0. Studies done in the 1970s on methods that use Bayes' rule show that moderate correlations among attributes seriously affect estimates and that even small correlations lead to increases in misclassifications. Adverse effects have been observed with small to moderate correlations when only six to eight variables were used. Consistent evidence of upward-biased probability estimates from multivariate methods founded on Bayes' rule must be of considerable concern to institutions and governmental agencies where unbiased estimates are required. In addition to increasing the misclassification rate, biased probability estimates make classification into deposit and nondeposit classes an arbitrary, subjective decision. The probabilistic neural network has no problem dealing with correlated variables, although its performance depends strongly on having a thoroughly representative training set. Probabilistic neural networks or logistic regression should receive serious consideration where unbiased estimates are required. The weights-of-evidence method would still serve to estimate thresholds between anomalies and background and for exploratory data analysis.
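The effect of perfectly correlated variables on the expected number of deposits can be illustrated with a minimal sketch. It is not the simulation reported in the article: the cell counts, the unit-area cells, and the function names (`weights`, `expected_deposits`, `make_cells`) are assumptions made only for illustration. The sketch computes the usual positive and negative weights for binary evidence, sums the posterior probabilities over all cells, and compares that sum with the known number of deposits, first for one variable and then for three identical copies of it (correlation 1.0).

```python
import math

def weights(cells, j):
    """Return (W+, W-) for binary evidence column j."""
    n_dep = sum(d for _, d in cells)
    n_non = len(cells) - n_dep
    dep_present = sum(1 for x, d in cells if d and x[j])
    non_present = sum(1 for x, d in cells if not d and x[j])
    w_plus = math.log((dep_present / n_dep) / (non_present / n_non))
    w_minus = math.log(((n_dep - dep_present) / n_dep)
                       / ((n_non - non_present) / n_non))
    return w_plus, w_minus

def expected_deposits(cells):
    """Sum of posterior deposit probabilities over all cells."""
    n_dep = sum(d for _, d in cells)
    prior_logit = math.log(n_dep / (len(cells) - n_dep))
    n_vars = len(cells[0][0])
    w = [weights(cells, j) for j in range(n_vars)]
    total = 0.0
    for x, _ in cells:
        # Weights are simply added, which is valid only under independence.
        logit = prior_logit + sum(w[j][0] if x[j] else w[j][1]
                                  for j in range(n_vars))
        total += 1.0 / (1.0 + math.exp(-logit))
    return total

def make_cells(copies):
    """Hypothetical 1000-cell area with 100 deposits; the single evidence
    layer is repeated `copies` times to mimic correlations of 1.0."""
    table = [(1, 1, 80), (1, 0, 320), (0, 1, 20), (0, 0, 580)]
    cells = []
    for present, deposit, count in table:
        cells += [((present,) * copies, deposit)] * count
    return cells

actual = 100
one = expected_deposits(make_cells(1))
three = expected_deposits(make_cells(3))
print(f"actual deposits:          {actual}")
print(f"expected, one variable:   {one:.1f}")
print(f"expected, three copies:   {three:.1f}")
```

With a single variable the expected number matches the actual count, because no independence assumption is involved. With three identical copies the same evidence is counted three times, and the expected number rises to more than twice the actual count in this hypothetical setup, which is the kind of upward bias the expected number of deposits test is meant to expose.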
|Publication Subtype||Journal Article|
|Title||A comparison of the weights-of-evidence method and probabilistic neural networks|
|Series title||Natural Resources Research|
|Contributing office(s)||Geology, Minerals, Energy, and Geophysics Science Center|