Developing and implementing the use of predictive models for estimating water quality at Great Lakes beaches
Local agencies measured E. coli concentrations and variables expected to affect E. coli concentrations such as wave height, turbidity, water temperature, and numbers of birds at the time of sampling. In addition to these field measurements, equipment was installed by the USGS or local agencies at or near several beaches to collect water-quality and metrological measurements in near real time, including nearshore buoys, weather stations, and tributary staff gages and monitors. The USGS worked with local agencies to retrieve data from existing sources either manually or by use of tools designed specifically to compile and process data for predictive-model development.
Predictive models were developed by use of linear regression and (or) partial least squares techniques for 42 beaches that had at least 2 years of data (2010-11 and sometimes earlier) and for 1 beach that had 1 year of data. For most models, software designed for model development by the U.S. Environmental Protection Agency (Virtual Beach) was used. The selected model for each beach was based on a combination of explanatory variables including, most commonly, turbidity, day of the year, change in lake level over 24 hours, wave height, wind direction and speed, and antecedent rainfall for various time periods. Forty-two predictive models were validated against data collected during an independent year (2012) and compared to the current method for assessing recreational water quality-using the previous day’s E. coli concentration (persistence model). Goals for good predictive-model performance were responses that were at least 5 percent greater than the persistence model and overall correct responses greater than or equal to 80 percent, sensitivities (percentage of exceedances of the bathing-water standard that were correctly predicted by the model) greater than or equal to 50 percent, and specificities (percentage of nonexceedances correctly predicted by the model) greater than or equal to 85 percent. Out of 42 predictive models, 24 models yielded over-all correct responses that were at least 5 percent greater than the use of the persistence model. Predictive-model responses met the performance goals more often than the persistence-model responses in terms of overall correctness (28 versus 17 models, respectively), sensitivity (17 versus 4 models), and specificity (34 versus 25 models). Gaining knowledge of each beach and the factors that affect E. coli concentrations is important for developing good predictive models. Collection of additional years of data with a wide range of environmental conditions may also help to improve future model performance. The USGS will continue to work with local agencies in 2013 and beyond to develop and validate predictive models at beaches and improve existing nowcasts, restructuring monitoring activities to accommodate future uncertainties in funding and resources.
Additional publication details
|Publication Subtype||USGS Numbered Series|
|Title||Developing and implementing the use of predictive models for estimating water quality at Great Lakes beaches|
|Series title||Scientific Investigations Report|
|Publisher||U.S. Geological Survey|
|Publisher location||Reston, VA|
|Contributing office(s)||Ohio Water Science Center|
|Description||Report: vii, 68 p.; 3 Appendices, Downloads Directory|
|Other Geospatial||Great Lakes|