A distributed pipeline for DIDSON data processing

Liling Li; Tyler Danner; Jesse Eickholt; Erin L. McCann; Kevin Pangle; Nicholas S. Johnson

doi:10.1109/BigData.2017.8258458

Abstract

Technological advances in the field of ecology allow data on ecological systems to be collected at high resolution, both temporally and spatially. Devices such as Dual-frequency Identification Sonar (DIDSON) can be deployed in aquatic environments for extended periods and easily generate several terabytes of underwater surveillance data, which may need to be processed multiple times. Because of the large amount of data generated and the need for flexibility in processing, a distributed pipeline for DIDSON data was constructed using the Hadoop ecosystem. The pipeline is capable of ingesting raw DIDSON data, transforming the acoustic data to images, filtering the images, detecting and extracting motion, and generating feature data for machine learning and classification. All of the tasks in the pipeline can be run in parallel, and the framework allows for custom processing. Applications of the pipeline include monitoring migration times, determining the presence of a particular species, estimating population size, and other fishery management tasks.
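
The stages named in the abstract (transform to images, filter, detect motion, generate features) can be illustrated with a minimal sketch. This is not the authors' implementation: the paper's pipeline runs on the Hadoop ecosystem, whereas this sketch uses a local thread pool as a stand-in for distributed execution. The frame format (a 2D list of echo intensities), the median filter, the frame-differencing motion test, the threshold value, and all function names are assumptions chosen for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import median


def to_image(raw_frame):
    """Clamp raw intensities to the 0-255 range of an 8-bit grayscale image."""
    return [[max(0, min(255, int(v))) for v in row] for row in raw_frame]


def median_filter_rows(image):
    """Smooth each row with a 3-sample median (edges padded by repetition)."""
    out = []
    for row in image:
        padded = [row[0]] + row + [row[-1]]
        out.append([int(median(padded[i:i + 3])) for i in range(len(row))])
    return out


def motion_mask(prev, curr, threshold=30):
    """Mark pixels whose intensity changed by more than the (assumed) threshold."""
    return [
        [1 if abs(c - p) > threshold else 0 for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev, curr)
    ]


def features(mask):
    """Summarize the moving region by its pixel area and (row, col) centroid."""
    points = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    if not points:
        return {"area": 0, "centroid": None}
    area = len(points)
    return {
        "area": area,
        "centroid": (sum(r for r, _ in points) / area,
                     sum(c for _, c in points) / area),
    }


def process_pair(pair):
    """Run every stage on one pair of consecutive frames."""
    prev_raw, curr_raw = pair
    prev = median_filter_rows(to_image(prev_raw))
    curr = median_filter_rows(to_image(curr_raw))
    return features(motion_mask(prev, curr))


def run_pipeline(frames, workers=4):
    """Process consecutive frame pairs independently, in parallel."""
    pairs = list(zip(frames, frames[1:]))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process_pair, pairs))
```

Because each frame pair is processed independently, the same per-pair function could in principle be handed to a distributed map step instead of a local pool. The feature dictionaries it returns are the kind of per-detection records that a downstream classifier might consume.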

Additional publication details
Publication type	Conference Paper
Publication Subtype	Conference Paper
Title	A distributed pipeline for DIDSON data processing
DOI	10.1109/BigData.2017.8258458
Year Published	2018
Language	English
Publisher	IEEE
Contributing office(s)	Great Lakes Science Center
Larger Work Type	Book
Larger Work Subtype	Conference publication
Larger Work Title	2017 IEEE International Conference on Big Data
Conference Location	Boston, MA
Conference Date	December 11-14, 2017