Contamination of drinking water by nitrate is a growing problem in many agricultural areas of the country. Ingested nitrate can lead to the endogenous formation of N-nitroso compounds, potent carcinogens. We developed a predictive model for nitrate concentrations in private wells in Iowa. Using 34,084 measurements of nitrate in private wells, we trained and tested random forest models to predict log nitrate levels by systematically assessing the predictive performance of 179 variables in 36 thematic groups (well depth, distance to sinkholes, location, land use, soil characteristics, nitrogen inputs, meteorology, and other factors). The final model contained 66 variables in 17 groups. Some of the most important variables were well depth, slope length within 1 km of the well, year of sample, and distance to nearest animal feeding operation. The correlation between observed and estimated nitrate concentrations was excellent in the training set (r-square = 0.77) and was acceptable in the testing set (r-square = 0.38). The random forest model had substantially better predictive performance than a traditional linear regression model or a regression tree. Our model will be used to investigate the association between nitrate levels in drinking water and cancer risk in the Iowa participants of the Agricultural Health Study cohort.
Additional publication details
|Publication Subtype||Journal Article|
|Title||Modeling groundwater nitrate concentrations in private wells in Iowa|
|Series title||Science of the Total Environment|
|Contributing office(s)||National Water Quality Assessment Program|
|Online Only (Y/N)||N|
|Additional Online Files (Y/N)||N|