Comparisons of likelihood and machine learning methods of individual classification

Journal of Heredity

Out-of-print

Abstract

Classification methods used in machine learning (e.g., artificial neural networks, decision trees, and k-nearest neighbor clustering) are rarely used with population genetic data. We compare different nonparametric machine learning techniques with parametric likelihood estimations commonly employed in population genetics for purposes of assigning individuals to their population of origin ('assignment tests'). Classifier accuracy was compared across simulated data sets representing different levels of population differentiation (low and high FST), number of loci surveyed (5 and 10), and allelic diversity (average of three or eight alleles per locus). Empirical data for the lake trout (Salvelinus namaycush) exhibiting levels of population differentiation comparable to those used in simulations were examined to further evaluate and compare classification methods. Classification error rates associated with artificial neural networks and likelihood estimators were lower for simulated data sets compared to k-nearest neighbor and decision tree classifiers over the entire range of parameters considered. Artificial neural networks only marginally outperformed the likelihood method for simulated data (0 to 2.8% lower error rates). The performance of each machine learning classifier improved relative to likelihood estimators for empirical data sets, suggesting an ability to 'learn' and utilize properties of empirical genotypic arrays intrinsic to each population. Likelihood-based estimation methods provide a more accessible option for reliable assignment of individuals to the population of origin due to the intricacies in development and evaluation of artificial neural networks.
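
To make the idea of a likelihood-based assignment test concrete, here is a minimal sketch in Python. It is not the authors' implementation. It assumes diploid genotypes, Hardy-Weinberg proportions within each population, and independent loci. The function names, the allele frequencies, and the small frequency floor for unobserved alleles are all hypothetical and chosen only for illustration.

```python
import math


def genotype_log_likelihood(genotype, allele_freqs, floor=1e-3):
    """Log-likelihood of a multilocus diploid genotype in one population.

    Assumes Hardy-Weinberg proportions and independent loci.
    genotype: list of (allele, allele) tuples, one per locus.
    allele_freqs: list of {allele: frequency} dicts, one per locus.
    floor: hypothetical minimum frequency used for unobserved alleles,
           so that a single missing allele does not give log(0).
    """
    total = 0.0
    for (a1, a2), freqs in zip(genotype, allele_freqs):
        p1 = max(freqs.get(a1, 0.0), floor)
        p2 = max(freqs.get(a2, 0.0), floor)
        # Homozygote: p^2; heterozygote: 2pq.
        prob = p1 * p2 if a1 == a2 else 2.0 * p1 * p2
        total += math.log(prob)
    return total


def assign(genotype, populations):
    """Return the best-scoring population name and all log-likelihoods."""
    scores = {
        name: genotype_log_likelihood(genotype, freqs)
        for name, freqs in populations.items()
    }
    return max(scores, key=scores.get), scores


# Made-up allele frequencies for two populations at two loci.
populations = {
    "pop_A": [{"1": 0.7, "2": 0.3}, {"1": 0.6, "2": 0.4}],
    "pop_B": [{"1": 0.2, "2": 0.8}, {"1": 0.3, "2": 0.7}],
}
individual = [("1", "1"), ("1", "2")]

best, scores = assign(individual, populations)
print(best, scores)
```

In practice the allele frequencies would be estimated from reference samples for each candidate population, and error rates would be measured on individuals held out from that estimation.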

Additional Publication Details

Publication type:
Article
Publication Subtype:
Journal Article
Title:
Comparisons of likelihood and machine learning methods of individual classification
Series title:
Journal of Heredity
Volume:
93
Issue:
4
Year Published:
2002
Language:
English
Contributing office(s):
Great Lakes Science Center
Description:
p. 260-269
Larger Work Type:
Article
Larger Work Subtype:
Journal Article
Larger Work Title:
Journal of Heredity
First page:
260
Last page:
269
Number of Pages:
10