Inside Rensselaer
Data Scientists Collaborate on $2 Million Grant To Study Oceans

Peter Fox and Charles Stewart, data scientists at Rensselaer, are beginning a large-scale collaboration with the Woods Hole Oceanographic Institution (WHOI), supported by a grant of more than $2 million from the Gordon and Betty Moore Foundation.

A professor in the Tetherless World Research Constellation, Fox is bringing his expertise in data science to the study of our immense ocean ecosystems. Stewart, a professor in the Department of Computer Science, will apply his experience in developing computer vision systems to the analysis of images of the ocean floor and seawater.

“A massive amount of data about the oceans is gathered every single day by sensors, cameras, sonar, and other technologies,” Fox said. “Maybe 3 percent of that information is ever even looked at again. Any other valuable knowledge is essentially lost. This makes the return on investment to create this data extremely low, monetarily and scientifically. What we are doing through this collaboration is integrating and enhancing this data so that it can be more easily consumed and used by other scientists and the public.”

To begin this large effort, Fox, Stewart, and their students will work with WHOI oceanographers and computer scientists to better analyze and interpret data from some of the sophisticated WHOI underwater imaging technologies.

These technologies include the FlowCytobot, an automated underwater microscope that identifies and counts tiny phytoplankton in the water. These microorganisms make up the foundation of any marine ecosystem and can be important indicators of ecosystem health, biodiversity, or environmental contamination. The technologies also include SeaBED, an autonomous underwater vehicle that hovers slightly above the seafloor at astonishing depths (up to 6,000 feet) and takes highly detailed sonar and optical images of the seafloor, and HabCam, a habitat mapping camera system that is moved above the seafloor, creating a continuous image ribbon.

All of these technologies provide important information to oceanographers on the health of our oceans. They also have another important thing in common: They each produce enormous amounts of data that need to be analyzed.

The HabCam alone collects five to six images every second for days at a time, producing more than two terabytes of data per day for strips of the ocean floor 100 miles or more in length. These images must be processed to characterize habitat, count economically important species such as scallops, and determine the extent of invasive organisms. This vast number of images presents particular issues for Stewart.
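To give a sense of scale, the figures above can be turned into a rough back-of-the-envelope estimate. The sketch below assumes the camera runs continuously for a full 24 hours, treats "more than two terabytes" as exactly 2 TB (a lower bound), and uses decimal units; the function names are illustrative and not part of any WHOI software.

```python
# Rough estimate based only on the figures quoted in the article.
# Assumptions: 24 hours of continuous imaging, 2 TB per day taken as a
# lower bound, decimal units (1 TB = 1,000,000 MB).

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def images_per_day(images_per_second):
    """Number of images collected in one day of continuous operation."""
    return images_per_second * SECONDS_PER_DAY


def avg_image_megabytes(terabytes_per_day, images_per_second):
    """Average size of one image, in megabytes, implied by the daily volume."""
    megabytes_per_day = terabytes_per_day * 1_000_000
    return megabytes_per_day / images_per_day(images_per_second)


for rate in (5, 6):
    count = images_per_day(rate)
    size = avg_image_megabytes(2, rate)
    print(f"{rate} images/s: {count:,} images/day, about {size:.1f} MB each")
```

Under these assumptions, a single day of operation yields roughly 430,000 to 520,000 images averaging about 4 MB each, which is far more than anyone could review by hand.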

“Images taken underwater present unique challenges for computers due to distortion from the water as well as the large amount of material that floats through the water and clouds the images,” Stewart said. “As a result, important information from the image is forever lost to scientists.”

Stewart will apply sophisticated computer vision techniques in new ways, testing the limits of what even the most recent computer vision research can accomplish in the underwater environment. He will use algorithms and software to interpret and refine images from the WHOI technologies, helping WHOI researchers and other scientists greatly enhance the research potential of their existing technologies.

Fox will work with WHOI researchers to apply tools that allow the data from these technologies and others like them to be easily accessed and shared among researchers and the public. A large part of this work will be accomplished by applying Semantic Web and knowledge provenance (or origins) technologies to the raw and processed data produced by the WHOI technologies, according to Fox. Semantic technology encodes data with information that computers or other web-enabled devices can use to better share, search, and interpret the data. It is a family of computational “languages” or codes that are invisible to the human computer user but of great utility to the machine.
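As a loose illustration of the idea, semantic data is often expressed as subject-predicate-object statements that a program can search and follow. The sketch below uses plain Python tuples rather than a real Semantic Web toolkit. Every identifier in it is hypothetical: the `ex:` names are invented for this example, and the `prov:` terms are borrowed from the W3C provenance vocabulary purely for illustration, since the article does not say which vocabularies or tools the project uses.

```python
# A toy "triple store": each statement is (subject, predicate, object).
# All identifiers are hypothetical and chosen only for illustration.
triples = [
    ("ex:image_0001", "rdf:type", "ex:SeafloorImage"),
    ("ex:image_0001", "ex:capturedBy", "ex:HabCam"),
    ("ex:image_0001_enhanced", "prov:wasDerivedFrom", "ex:image_0001"),
    ("ex:image_0001_enhanced", "prov:wasGeneratedBy", "ex:color_correction_run"),
]


def match(statements, subject=None, predicate=None, obj=None):
    """Return statements matching the given parts; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in statements
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


def lineage(statements, item):
    """Follow prov:wasDerivedFrom links from an item back to its raw source."""
    chain = [item]
    while True:
        parents = match(statements, subject=chain[-1],
                        predicate="prov:wasDerivedFrom")
        # Stop at the raw source, or if a cycle would repeat an item.
        if not parents or parents[0][2] in chain:
            return chain
        chain.append(parents[0][2])


print(match(triples, predicate="ex:capturedBy"))
print(lineage(triples, "ex:image_0001_enhanced"))
```

The point of the sketch is that the provenance statements travel with the data, so a program can answer "where did this processed image come from?" without a person consulting separate notes.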

“By integrating and enhancing this truly multidisciplinary data, we can enhance the utility of the data by making it usable to a variety of people, of varying expertise, and on various operating systems,” Fox said. “This key investment will help us unlock the full potential of the data being developed with these technologies.”

* * *
Inside Rensselaer
Volume 5, Number 3, February 18, 2011