NBII-SAIN Data Management Toolkit

Open-File Report 2009-1170



The Strategic Plan for the U.S. Geological Survey Biological Informatics Program (2005-2009) recognizes the need for effective data management: "Though the Federal government invests more than $600 million per year in biological data collection, it is difficult to address these issues because of limited accessibility and lack of standards for data and information...variable quality, sources, methods, and formats (for example observations in the field, museum specimens, and satellite images) present additional challenges. This is further complicated by the fast-moving target of emerging and changing technologies such as GPS and GIS. Even though these technologies offer new solutions, they also create new informatics challenges" (Ruggiero and others, 2005).

The USGS National Biological Information Infrastructure program, hereafter referred to as NBII, is charged with the mission to improve the way data and information are gathered, documented, stored, and accessed. The central objective of this project is a direct reflection of the purpose of NBII as described by John Mosesso, Program Manager of the U.S. Geological Survey-Biological Informatics Program-GAP Analysis: "At the outset, the reason for bringing about NBII was that there were significant amounts of data and information scattered all over the U.S., not accessible, in incompatible formats, and that NBII was tasked with addressing this problem...NBII's focus is to pull data together that truly matters to someone or communities. Essentially, the core questions are: 1) what are the issues, 2) where is the data, and 3) how can we make it usable and accessible" (John Mosesso, U.S. Geological Survey, oral commun., 2006).

Redundancy in data collection can be a major issue when multiple stakeholders are involved with a common effort. In 2001, the U.S. General Accounting Office (USGAO) estimated that about 50 percent of the Federal government's geospatial data at the time was redundant. In addition, approximately 80 percent of the cost of a spatial information system is associated with spatial data collection and management (U.S. General Accounting Office, 2003). These figures indicate that the resources (time, personnel, money) of many agencies and organizations could be used more efficiently and effectively. Dedicated and conscientious data management coordination and documentation are critical for reducing such redundancy. Substantial cost savings and increased efficiency are direct results of a proactive data management approach. In addition, details of projects as well as data and information are frequently lost as a result of real-world occurrences such as the passing of time, job turnover, and equipment changes and failure. A standardized, well-documented database allows resource managers to identify issues, analyze options, and ultimately make better decisions in the context of adaptive management (National Land and Water Resources Audit and the Australia New Zealand Land Information Council on behalf of the Australian National Government, 2003).

Many environmentally focused, scientific, or natural resource management organizations collect and create both spatial and non-spatial data in some form. Data management appropriate for those data will be contingent upon the project goal(s) and objectives and thus will vary on a case-by-case basis. This project and the resulting Data Management Toolkit, hereafter referred to as the Toolkit, are therefore not intended to be comprehensive in terms of addressing all of the data management needs of all projects that contain biological, geospatial, and other types of data. The Toolkit emphasizes the idea of connecting a project's data and the related management needs to the defined project goals and objectives from the outset. In that context, the Toolkit presents and describes the fundamental components of sound data and information management that are common to projects involving biological, geospatial, and other related data.

Additional publication details

Publication type: Report
Publication subtype: USGS Numbered Series
Title: NBII-SAIN Data Management Toolkit
Series title: Open-File Report
Series number: 2009-1170
DOI: 10.3133/ofr20091170
Edition: -
Year published: 2009
Language: English
Publisher: U.S. Geological Survey
Contributing office(s): Leetown Science Center, Core Science Analytics, Synthesis, and Libraries
Description: vi, 97 p.