Web scientists at Rensselaer Polytechnic Institute will use the World Wide Web to compile and share scientific data on an unprecedented scale. Their goal is to hasten scientific discovery and innovation by enabling rapid and easy collaboration between scientists, educators, students, policy makers, and even “citizen scientists” around the world via the Web.
Supported by $1.1 million in American Recovery and Reinvestment Act funding from the National Science Foundation (NSF), the research seeks to break science out of the hallowed halls of the laboratory and place it in the hands of the people.
“We want to provide a toolkit for scientists and educators that allows them to gain access to data from a variety of sources and, importantly, outside of their direct area of expertise,” said Peter Fox, the principal investigator for the project and Senior Constellation Professor in the Tetherless World Constellation at Rensselaer. “Right now there are many scientists, educators, and policy makers who want to use others’ scientific data, but they don’t know how to find it, how it was collected, and even how to read it.” Fox notes that with the increased specialization of most scientific research, even people in closely related fields currently struggle to interpret the data of their contemporaries. These scientific language barriers, he said, can hinder the pace of new discoveries.
Toolkit provides new dimensions in language
The new toolkit will have a foundation in Semantic Web technology. On the Web, semantic computer code (known as ontologies) provides underlying meaning and links to the information that is presented on a Web page to your computer, smart phone, or other Web-enabled device. Current technology involves flat words on the screen, for example “climate change,” that require a human to interpret the words and then manually move on to another Web site for additional information. Web technologies based on semantics, however, would enable the computer to provide its own underlying meaning to the words, and provide links to related Web sites, nonprofit organizations, upcoming Senate bills, or even related photos stored on your computer. In the case of semantic data, the computer can configure, coalesce, and interpret data from millions of different sources instantly without the need for human intervention.
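To make the idea concrete, the short Python sketch below shows how semantically linked statements let a program follow connections that a human would otherwise trace by hand. It is only an illustration and not the Rensselaer toolkit: the facts, the relation names (`related_to`, `measured_by`), and the function names are all hypothetical, and real Semantic Web systems use standards such as RDF and OWL rather than plain Python lists.

```python
# Illustrative sketch only: all facts, relation names, and dataset names
# below are invented for this example and do not come from the project.

# Each statement is a (subject, predicate, object) triple, the basic
# unit of meaning in Semantic Web data.
TRIPLES = [
    ("climate change", "is_a", "research topic"),
    ("climate change", "related_to", "sea level rise"),
    ("climate change", "related_to", "carbon dioxide"),
    ("sea level rise", "measured_by", "tide gauge dataset"),
    ("carbon dioxide", "measured_by", "atmospheric CO2 dataset"),
]


def build_index(triples):
    """Group triples as {subject: {predicate: [objects]}} for fast lookup."""
    index = {}
    for subject, predicate, obj in triples:
        index.setdefault(subject, {}).setdefault(predicate, []).append(obj)
    return index


def related_datasets(index, term):
    """Return datasets for a term and for topics one 'related_to' hop away."""
    topics = [term] + index.get(term, {}).get("related_to", [])
    datasets = []
    for topic in topics:
        datasets.extend(index.get(topic, {}).get("measured_by", []))
    return datasets


INDEX = build_index(TRIPLES)

if __name__ == "__main__":
    # The program, not the reader, follows the links from the phrase
    # "climate change" to the datasets that describe related topics.
    for name in related_datasets(INDEX, "climate change"):
        print(name)
```

Because the links are stored as data rather than as flat words on a page, a user who knows only the phrase “climate change” can reach the underlying datasets without knowing what they are called.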
“Semantic technologies lower the barrier of entry to do science,” said co-principal investigator on the project and Senior Constellation Professor Deborah McGuinness. “With semantics, we can bridge the gap between the question that someone wants to ask in their limited scientific vocabulary and the extreme complexity of the underlying data.” An individual’s vocabulary and scientific understanding will no longer have to correspond to the level of their scientific discovery, according to Fox and McGuinness.
Fox, McGuinness, and their counterpart on the project, Senior Constellation Professor James Hendler, will use semantic ontologies to build customizable Web sites. Each Web site will be familiar, understandable, and navigable to its end user, depending on that user’s level and type of expertise. Behind the simple façade of the Web site will rest billions of pages of data, all semantically tagged and ready to be accessed and interpreted by the computer. The user needs only to type a question, and it will be answered using data input by other users around the world. The researchers also plan to create plug-in applications for commonly used data software such as Excel that add access to the data in a format that is familiar to the end user.