A cross-validation package driving Netica with python

Environmental Modelling and Software

Abstract

Bayesian networks (BNs) are powerful tools for probabilistically simulating natural systems and emulating process models. Cross-validation is a technique for avoiding the overfitting that results from overly complex BNs; overfitting reduces predictive skill. Cross-validation for BNs is well known but rarely implemented, partly because few software tools are designed to work with available BN packages. CVNetica is an open-source Python package that extends the Netica software package to perform cross-validation and to read, rebuild, and learn BNs from data. Insights gained from cross-validation, and their implications for prediction versus description, are illustrated with two applications: a data-driven oceanographic application and a model-emulation application. These examples show that overfitting occurs when BNs become more complex than the supporting data allow, and that overfitting incurs computational costs as well as reducing prediction skill. CVNetica evaluates overfitting using several complexity metrics (we used level of discretization) and measures its impact on performance metrics (we used skill).
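The relationship between discretization level and skill described in the abstract can be illustrated with a minimal sketch. It does not use CVNetica or the Netica API. A binned-mean predictor stands in for a single discretized BN node, the data are synthetic, and the function names (`skill`, `fit_binned`, `predict`, `cross_validate`) and the skill definition (one minus the ratio of squared error to the variance of the observations) are assumptions made for illustration only.

```python
import math
import random
from statistics import mean


def skill(obs, pred):
    """Assumed skill metric: 1 - SSE/SST (1 is perfect, 0 matches the mean of obs)."""
    m = mean(obs)
    sse = sum((o - p) ** 2 for o, p in zip(obs, pred))
    sst = sum((o - m) ** 2 for o in obs)
    return 1.0 - sse / sst


def fit_binned(xs, ys, n_bins):
    """Learn the mean of y within each equal-width bin of x on [0, 1]."""
    fallback = mean(ys)  # used for bins that received no training data
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for x, y in zip(xs, ys):
        b = min(int(x * n_bins), n_bins - 1)
        sums[b] += y
        counts[b] += 1
    return [s / c if c else fallback for s, c in zip(sums, counts)]


def predict(table, xs):
    """Look up the learned bin mean for each x."""
    n_bins = len(table)
    return [table[min(int(x * n_bins), n_bins - 1)] for x in xs]


def cross_validate(xs, ys, n_bins, k=5):
    """Return (mean calibration skill, mean validation skill) over k folds."""
    cal, val = [], []
    for fold in range(k):
        held_out = set(range(fold, len(xs), k))
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        test_x = [x for i, x in enumerate(xs) if i in held_out]
        test_y = [y for i, y in enumerate(ys) if i in held_out]
        table = fit_binned(train_x, train_y, n_bins)
        cal.append(skill(train_y, predict(table, train_x)))
        val.append(skill(test_y, predict(table, test_x)))
    return mean(cal), mean(val)


# Synthetic data (hypothetical): a noisy sine curve sampled at 60 points.
random.seed(0)
xs = [random.random() for _ in range(60)]
ys = [math.sin(2 * math.pi * x) + random.gauss(0, 0.5) for x in xs]

# Complexity metric: number of bins. Performance metric: skill.
results = {n: cross_validate(xs, ys, n) for n in (2, 10, 50)}
for n, (cal_skill, val_skill) in results.items():
    print(f"bins={n:3d}  calibration skill={cal_skill:6.3f}  validation skill={val_skill:6.3f}")
```

In a sketch like this, calibration skill can only rise as bins are subdivided, because finer bins fit the training data more closely. Validation skill on the held-out folds is the quantity that reveals when the added complexity is no longer supported by the data.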

Additional publication details

Publication type: Article
Publication Subtype: Journal Article
Title: A cross-validation package driving Netica with python
Series title: Environmental Modelling and Software
DOI: 10.1016/j.envsoft.2014.09.007
Volume: 63
Year Published: 2014
Language: English
Publisher: Elsevier
Contributing office(s): Wisconsin Water Science Center
Description: 10 p.
Larger Work Type: Article
Larger Work Subtype: Journal Article
Larger Work Title: Environmental Modelling and Software
First page: 14
Last page: 23