The accuracy of statistically downscaled (SD) general circulation model (GCM) simulations of monthly surface climate for historical conditions (1950–2005) was assessed for the conterminous United States (CONUS). The SD monthly precipitation (PPT) and temperature (TAVE) from 95 GCMs from phases 3 and 5 of the Coupled Model Intercomparison Project (CMIP3 and CMIP5) were used as inputs to a monthly water balance model (MWBM). Distributions of MWBM input (PPT and TAVE) and output [runoff (RUN)] variables derived from gridded station data (GSD) and historical SD climate were compared using the Kolmogorov–Smirnov (KS) test For all three variables considered, the KS test results showed that variables simulated using CMIP5 generally are more reliable than those derived from CMIP3, likely due to improvements in PPT simulations. At most locations across the CONUS, the largest differences between GSD and SD PPT and RUN occurred in the lowest part of the distributions (i.e., low-flow RUN and low-magnitude PPT). Results indicate that for the majority of the CONUS, there are downscaled GCMs that can reliably simulate historical climatic conditions. But, in some geographic locations, none of the SD GCMs replicated historical conditions for two of the three variables (PPT and RUN) based on the KS test, with a significance level of 0.05. In these locations, improved GCM simulations of PPT are needed to more reliably estimate components of the hydrologic cycle. Simple metrics and statistical tests, such as those described here, can provide an initial set of criteria to help simplify GCM selection.