CASTING SEMANTIC NETS
A NEUROSCIENCE DISCOVERY ENVIRONMENT

Thanks
to initiatives by individual scientists and programs such as the
Human Brain Project and the Protein Data Bank, vast quantities
of neuroscience information are becoming available in the form
of databases and Web resources, everything from molecular structures
and 3-D cellular information through anatomical studies and movies
of functional brain imaging. "No neuroscientist needs to be reminded
of the information explosion," said Maryann Martone, a researcher
at UCSD's National Center for Microscopy and Imaging Research
(NCMIR). "How can we integrate data from various research disciplines,
with different experimental approaches, concerning completely
different species of organisms, and at widely varying levels of
resolution, into a unified body of knowledge that we can navigate
and mine for new insights?"

Figure 1. Where the Proteins Are
Using a tool for Windows clients
called AxioMap, SDSC's Ilya Zaslavsky created this KIND
query-retrieval visualization of a neuron image with color-coded
overlays that represent protein localization information
from multiple sources.

A solution to this dilemma comes from the
intersection of NPACI's Data-Intensive Computing Environments
(DICE) and Neuroscience thrusts. NPACI researchers at SDSC and
UCSD have developed a prototype information mediator system called
KIND as part of their effort to federate neuroscience databases.
The project will create a knowledge retrieval environment in which
a scientist can query the mediator, retrieve information from
across a number of information sources, and use the results to
perform custom analyses and data mining.

Scientific information from many sources is
now accessible almost instantly via the Internet, but these sources
typically use different interfaces and export their data in incompatible
formats. The brute-force approach of trying to create a single,
enormous database that encompasses every possible attribute of
nervous systems just isn't feasible. A more realistic way is to
federate (cross-reference) separate data sources, taking their differences
into account. Mediator systems assist users by seeking information
of interest and providing integrated views of the data they want.

Wrapper-mediator systems use the Extensible
Markup Language (XML) as a view-definition language to both model
and interchange information across incompatible data sources.
An XML data wrapper associated with each source exports an XML
view of the data. The mediator selects the outputs of independent
sources, restructures and merges them, and provides an integrated
XML view of the information.

NPACI's DICE thrust has developed wrappers
and mediators for a variety of information sources, including
relational databases, geographical information systems, and Web
sites, and is applying mediator systems for distributed digital
libraries, government agencies, and scientific data collections.
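
The wrapper-mediator pattern described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the DICE or KIND implementation: the two sources, their record layouts, the function names (`wrap_cells`, `wrap_proteins`, `mediate`), and all placeholder values such as `neuron-1` and `protein-A` are hypothetical. Each wrapper exports an XML view of its own source, and the mediator selects from those views, merges them on a shared attribute, and returns one integrated XML view.

```python
import xml.etree.ElementTree as ET

# Hypothetical source 1: rows as they might come from a relational database.
CELL_ROWS = [
    {"cell": "neuron-1", "compartment": "dendrite"},
    {"cell": "neuron-1", "compartment": "soma"},
    {"cell": "neuron-2", "compartment": "axon"},
]

# Hypothetical source 2: delimited text as a Web site might serve it.
PROTEIN_TEXT = "protein-A|dendrite\nprotein-B|soma\nprotein-C|axon"


def wrap_cells(rows):
    """Wrapper: export an XML view of the row-based source."""
    root = ET.Element("source", name="cells")
    for row in rows:
        ET.SubElement(root, "record", cell=row["cell"],
                      compartment=row["compartment"])
    return root


def wrap_proteins(text):
    """Wrapper: export an XML view of the delimited-text source."""
    root = ET.Element("source", name="proteins")
    for line in text.splitlines():
        protein, compartment = line.split("|")
        ET.SubElement(root, "record", protein=protein, compartment=compartment)
    return root


def mediate(cell_view, protein_view, cell_name):
    """Mediator: select records from each XML view, merge them on the
    shared compartment attribute, and return one integrated XML view."""
    result = ET.Element("integrated", cell=cell_name)
    for cell_rec in cell_view.iter("record"):
        if cell_rec.get("cell") != cell_name:
            continue
        comp = ET.SubElement(result, "compartment",
                             name=cell_rec.get("compartment"))
        for prot_rec in protein_view.iter("record"):
            if prot_rec.get("compartment") == cell_rec.get("compartment"):
                ET.SubElement(comp, "protein", name=prot_rec.get("protein"))
    return result


view = mediate(wrap_cells(CELL_ROWS), wrap_proteins(PROTEIN_TEXT), "neuron-1")
print(ET.tostring(view, encoding="unicode"))
```

In this sketch the mediator never touches a source's native format; it sees only the XML views, which is what lets new sources be added by writing another wrapper.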

CASTING SEMANTIC NETS

"Some of our recent work has been driven by
the need to integrate the scientific databases of the Neuroscience
Workbench, where source data comes from different 'semantic worlds'
that might share few or no attributes," said DICE researcher Amarnath
Gupta. Gupta and Bertram Ludaescher at SDSC, collaborating with
Martone and NCMIR Director Mark Ellisman, have developed KIND,
a mediator that extends current approaches by incorporating semantic
models of information sources.

"More traditional mediators work with sources
that share the same domain of discourse, but here we need to integrate
across different domains like neuroanatomy, protein properties,
and ion-currents in nerves," Ludaescher said. "A novel feature
of our architecture is the use of domain maps: semantic nets of
terms and relationships." The domain map draws upon anatomical
relations between brain regions and their cellular and subcellular
components.

A first prototype establishing the viability
of the approach is operational. The researchers developed their
mediation-based approach using NCMIR's 3-D Cell Centered Database,
which contains information about neuronal structure and protein
distribution. By placing XML wrappers around additional sources, NCMIR
researchers have also accessed other Web-based neuroscience resources,
including the SenseLab database at Yale, Synapse Web at Boston
University, and the EF-Hand Calcium-Binding Proteins Data Library
at Vanderbilt University.
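
To make the domain-map idea more concrete, here is a minimal sketch of a semantic net used to link records from sources that share no attributes. It assumes a single `has_a` relationship and a handful of anatomical terms; the terms, the relationship name, the record strings, and the function names are illustrative assumptions, not KIND's actual vocabulary or API. Each source anchors its records at a term in the map, and a query on a brain region collects everything anchored at or below that region.

```python
from collections import defaultdict

# A toy domain map: (term, relationship, term) triples. The terms and the
# single "has_a" relationship are illustrative assumptions.
EDGES = [
    ("cerebellum", "has_a", "Purkinje cell"),
    ("Purkinje cell", "has_a", "dendrite"),
    ("dendrite", "has_a", "spine"),
    ("Purkinje cell", "has_a", "soma"),
]

# Hypothetical records from two sources, each anchored at a domain-map term.
ANCHORED = {
    "spine": ["source-1: protein localization record"],
    "Purkinje cell": ["source-2: ion-current record"],
}


def build_map(edges):
    """Index the semantic net by term so it can be traversed."""
    children = defaultdict(list)
    for parent, relation, child in edges:
        children[parent].append((relation, child))
    return children


def parts_of(children, term):
    """Return the term plus every term reachable through has_a links."""
    seen, stack = [], [term]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.append(current)
        for relation, child in children[current]:
            if relation == "has_a":
                stack.append(child)
    return seen


def query(children, anchored, term):
    """Collect records from all sources anchored at or below the term."""
    hits = []
    for part in parts_of(children, term):
        hits.extend(anchored.get(part, []))
    return sorted(hits)


domain_map = build_map(EDGES)
print(query(domain_map, ANCHORED, "cerebellum"))
```

The two sources never reference each other; the map alone relates a record about spines to a record about the cell that contains them.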

A NEUROSCIENCE DISCOVERY ENVIRONMENT

"Our objective is to develop a discovery environment
for neuroscientists," Martone said. The environment is intended
to deal with cellular and subcellular morphology, molecular distributions
in cellular and subcellular structures, and physiological responses
that reveal functional properties of single cells and structures
as well as cellular and subcellular environments. "With the KIND
mediator we'll be able to perform ad hoc queries and compare properties
across resolution levels, experimental conditions, cell populations,
and species populations."

"This brings us closer to bridging the gap
between experimental and scientific disciplines and toward a unified
model of nervous systems," Martone said.

-MG
www.npaci.edu/DICE/Neuro

RESEARCHERS
Amarnath Gupta, Bertram Ludaescher, Ilya Zaslavsky (SDSC)
Maryann Martone, Mark Ellisman (UCSD)