Regression models to estimate real-time concentrations of selected constituents in two tributaries to Lake Houston near Houston, Texas, 2005-07

Scientific Investigations Report 2009-5231

Prepared in cooperation with the City of Houston



In December 2005, the U.S. Geological Survey, in cooperation with the City of Houston, Texas, began collecting discrete water-quality samples for nutrients, total organic carbon, bacteria (total coliform and Escherichia coli), atrazine, and suspended sediment at two U.S. Geological Survey streamflow-gaging stations upstream from Lake Houston near Houston (08068500 Spring Creek near Spring, Texas, and 08070200 East Fork San Jacinto River near New Caney, Texas). The data from the discrete water-quality samples collected during 2005-07, in conjunction with real-time data already being monitored, namely physical properties (specific conductance, pH, water temperature, turbidity, and dissolved oxygen), streamflow, and rainfall, were used to develop regression models for predicting water-quality constituent concentrations for inflows to Lake Houston. Rainfall data were obtained from a rain gage monitored by Harris County Homeland Security and Emergency Management and colocated with the Spring Creek station. The leaps and bounds algorithm was used to find the best subsets of possible regression models (minimum residual sum of squares for a given number of variables). The potential explanatory or predictive variables included discharge (streamflow), specific conductance, pH, water temperature, turbidity, dissolved oxygen, rainfall, and time (to account for seasonal variations inherent in some water-quality data). The response variables at each site were nitrite plus nitrate nitrogen, total phosphorus, organic carbon, Escherichia coli, atrazine, and suspended sediment. The explanatory variables are easily measured quantities that serve as a means to estimate concentrations of the various constituents under investigation, with accompanying estimates of measurement uncertainty. Each regression equation can be used to estimate concentrations of a given constituent in real time.
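The best-subsets criterion described above (minimum residual sum of squares for a given number of variables) can be illustrated with a minimal Python sketch. It is not the report's implementation: it uses a plain exhaustive search rather than the leaps and bounds algorithm, which reaches the same subsets while pruning most of the candidates. The function name `best_subsets`, the variable names, and the data are hypothetical; the data are synthetic and are not measurements from either station.

```python
from itertools import combinations

import numpy as np


def best_subsets(X, y, names):
    """For each subset size k, return (rss, variable names) of the
    subset with the smallest residual sum of squares (RSS)."""
    n, p = X.shape
    best = {}
    for k in range(1, p + 1):
        for cols in combinations(range(p), k):
            # Design matrix: intercept column plus the chosen variables.
            A = np.column_stack([np.ones(n), X[:, list(cols)]])
            coef, *_ = np.linalg.lstsq(A, y, rcond=None)
            rss = float(np.sum((y - A @ coef) ** 2))
            if k not in best or rss < best[k][0]:
                best[k] = (rss, tuple(names[c] for c in cols))
    return best


# Synthetic illustration only: the response depends on two of four variables.
rng = np.random.default_rng(0)
names = ["log_streamflow", "log_turbidity", "specific_conductance", "water_temperature"]
X = rng.normal(size=(40, 4))
y = 1.0 + 0.3 * X[:, 0] + 0.8 * X[:, 1] + rng.normal(scale=0.01, size=40)

best = best_subsets(X, y, names)
for k, (rss, chosen) in best.items():
    print(k, round(rss, 4), chosen)
```

Exhaustive search fits every one of the 2^p - 1 candidate models, which is manageable for the handful of explanatory variables listed here but grows quickly as variables are added.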
In conjunction with estimated concentrations, constituent loads were estimated by multiplying the estimated concentration by the corresponding streamflow and applying the appropriate conversion factor. Computing loads from estimated constituent concentrations makes a continuous record of estimated loads available for comparison to total maximum daily loads. The regression equations presented in this report are site specific to the Spring Creek and East Fork San Jacinto River streamflow-gaging stations; however, the methods that were developed and documented could be applied to other tributaries to Lake Houston to estimate real-time water-quality data for streams entering the lake.
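The load calculation described above (concentration times streamflow times a conversion factor) can be sketched as follows. The units are an assumption made for illustration: concentration in milligrams per liter, streamflow in cubic feet per second, and load in kilograms per day. The report may use other units, and the function name `load_kg_per_day` is hypothetical.

```python
LITERS_PER_CUBIC_FOOT = 28.316846592  # exact, from 1 ft = 0.3048 m
SECONDS_PER_DAY = 86_400
MG_PER_KG = 1_000_000


def load_kg_per_day(conc_mg_per_l, flow_cfs):
    """Constituent load in kg/day from concentration (mg/L) and
    streamflow (ft^3/s), under the unit assumptions stated above."""
    conversion = LITERS_PER_CUBIC_FOOT * SECONDS_PER_DAY / MG_PER_KG
    return conc_mg_per_l * flow_cfs * conversion


# Hypothetical values: 0.5 mg/L at 200 ft^3/s gives about 244.7 kg/day.
print(round(load_kg_per_day(0.5, 200.0), 1))
```

Under these assumptions the conversion factor is about 2.447 kg/day per (mg/L times ft^3/s). Applying the function to each real-time concentration estimate and its matching streamflow value yields the continuous record of estimated loads.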


Additional publication details

Publication Subtype: USGS Numbered Series
Title: Regression models to estimate real-time concentrations of selected constituents in two tributaries to Lake Houston near Houston, Texas, 2005-07
Series title: Scientific Investigations Report
Series number: 2009-5231
Publisher: U.S. Geological Survey
Publisher location: Reston, VA
Contributing office(s): Texas Water Science Center
Description: vi, 44 p.