Cyberinfrastructure Scientists Can Use
(without being IT experts)

by Chaitan Baru,
Program Co-Director,
Data and Knowledge Systems Program, SDSC

The mission of the San Diego Supercomputer Center (SDSC) is to develop and use technology to advance science. The challenge we face is to ensure that information technology keeps pace with the requirements of science and continually serves its most demanding needs.

An emergent requirement in leading-edge science is the need for sophisticated data, information, and knowledge management. The scientific endeavor is naturally dependent on the availability of data: observed facts validate theories and hypotheses. Furthermore, with the vastly improved ability to collect data, many areas of science are also becoming data-driven, with hypotheses derived from analyzing and mining data. Data-intensive computing, the mode of computing where applications read and/or generate extremely large data sets, is only one aspect of this.

Major IT challenges also arise in managing, integrating, analyzing, and mining information from complex, heterogeneous, multidisciplinary databases. Additionally, these capabilities must be implemented in a distributed grid-computing environment to enable computational tools and scientific workflows to access information and resources across multiple administrative domains.

This issue of enVision describes two projects, the Geosciences Network (GEON) and the Science Environment for Ecological Knowledge (SEEK), which are developing cyberinfrastructure for advanced information management and analysis in collaboration with scientists. The goal is to enable scientists to be more productive by using state-of-the-art IT without requiring them to become IT experts. For the scientist, IT is simply a tool, and like any tool, it should be useful, easy to use, and predictable. For the information technologist, the demands of science sometimes push the leading edge of IT capabilities, requiring research and development in new technologies.

A two-tier approach is needed. At one level, IT best practices must be implemented in science. This involves using well-established technologies and techniques, such as relational databases or geographic information systems (GIS), to support the science. At another level, investigation of new IT approaches is required, including research in basic computer science. In both cases, a close collaboration between scientists and IT researchers is the key to developing successful cyberinfrastructure.

The GEON project is a five-year effort funded by the National Science Foundation. In this project, a coalition of geoscientists and IT researchers is developing cyber tools to interlink databases and analysis tools. Technology is being developed to enable researchers to search and discover information based on concepts and relationships that are natural to the scientist. While such technology is obviously important for outreach (enabling students and non-specialists to use this information), it is equally important for "inreach" (enabling researchers to discover and use information from related disciplines). The GEON cyberinfrastructure weaves the separate strands of the Earth sciences disciplines and data into a unified fabric to enable easier discovery of related information, along with some explanation of these relationships. This fabric includes computing, data management, and visualization. Further, a grid portal will be created to provide access to this powerful new environment.

GEON is part of a broader IT-driven integration that is occurring across all of the sciences and beyond, including in projects such as the Biomedical Informatics Research Network (BIRN).

SDSC researchers and collaborators across the country are developing an ecological network similar to GEON, called SEEK. This initiative is an outgrowth of ecological and biodiversity informatics research, and it includes computer scientists, ecologists, and technologists. Such a novel combination of talent, together with the support of NSF, NIH, and other funding sources, is crucial to the invention of SEEK, GEON, BIRN, and, we hope, many other collaborations like them.