PlantsP Focuses Bioinformatics on Plant Genomics

SDSC RESEARCH
John C. Walker
University of Missouri

Michael Gribskov
SDSC

Jeffrey Harper
The Scripps Research Institute

Alice C. Harmon
University of Florida

Estelle M. Hrabak
University of New Hampshire

Joseph J. Kieber
University of North Carolina

Douglas D. Randall
University of Missouri

G. Eric Schaller
University of New Hampshire

Douglas W. Smith
UC San Diego

Frans E. Tax
University of Arizona

ShuQun Zhang
University of Missouri

Among the most important genomes now being sequenced are those of plants. Rice is a staple for much of the world, as are grains like corn and wheat. By the end of the year 2000, scientists expect to have the complete genetic sequences of two important seed-bearing plants: rice and Arabidopsis thaliana, a relative of cabbage and mustard. While the relatively small genomes of many microbes as well as the genomes of organisms ranging from the microscopic worm Caenorhabditis elegans to Homo sapiens have been sequenced, plants--for all their economic importance--have received less attention. A collaboration among plant scientists and bioinformaticists at the University of Missouri, SDSC, The Scripps Research Institute, and other sites, under funding from NSF, has begun to demonstrate ways to focus efforts in the functional genomics of plants.



Arabidopsis thaliana


"We chose Arabidopsis as our model organism because many of its genes are already known, and functional genomics of plant phosphorylation as our model problem because phosphorylation is key to the cellular mechanisms that control plant growth and development," said John C. Walker, University of Missouri biology professor and project leader. "Our Web resource site, PlantsP, is demonstrating the value and utility of bioinformatics for organizing functional research."


Arabidopsis is a small flowering plant of the mustard family, "although you wouldn't want to put it on your sandwich," Walker said. "I've tried it, and it's pretty bitter." A relative of mustard, broccoli, cauliflower, and cabbage, Arabidopsis was nominated as a model organism a decade ago, when many plant scientists began to concentrate on it. The generation time is short, the genome is small, and the plant itself is small--researchers don't need vast fields to grow it.

"The PlantsP project is both a search for knowledge about a single plant and an exploration of computational methods for finding the structures, functions, and evolutionary histories of proteins in general," said Michael Gribskov, PlantsP co-principal investigator and chief bioinformaticist, who is a senior staff scientist at SDSC and adjunct assistant professor of biology at UC San Diego.

The project is particularly ambitious, said Jeffrey Harper, another PlantsP co-investigator and a plant biologist at The Scripps Research Institute in La Jolla, "because our aim is to create the primary source of information on phospho-regulation in plants, in a format that is constantly being updated by a community of experts." While the collaborators have already completed a preliminary analysis of calcium-dependent plant protein kinases, "this will really get moving when the completed genome of Arabidopsis arrives at the end of the year," Harper said.

The site contains a description of the project and participating laboratories, a bibliographic section, and links to various community resources. It also has its own database of "knockouts," with a link to the Arabidopsis Knockout Facility at the University of Wisconsin. Knockouts are a powerful method for determining gene functions by isolating mutations: "We see what function is affected in a plant when it has everything it needs but one or two genes that have been 'knocked out' by selective mutation," Walker said.



A. Harmon, M. Gribskov, and J.F. Harper (2000): CDPKs--A kinase for every Ca2+ signal. Trends in Plant Science 5, 154-159.


The functional focus of PlantsP is phosphorylation. Protein phosphorylation and dephosphorylation are catalyzed by protein kinases and phosphatases, respectively: a kinase attaches a phosphate group to a protein, and a phosphatase removes it. Together these enzymes usually act as a reversible molecular switch, like an on-off light switch. Phosphorylation and dephosphorylation help control the cell cycle, transcriptional and translational regulation, metabolic processes, growth and differentiation, and plant responses to changes in the environment.

"When it is sequenced, we estimate that the Arabidopsis genome, which is only 130 million base-pairs long, will yield some 25,000 genes," Harper said, "of which about 1,000 genes code for protein kinases and half again as many for phosphatases." Because the kinases and phosphatases control so many plant processes, Gribskov noted, "a genome-wide approach is needed to make advances in finding the roles of these enzymes in regulating plant function."
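The estimates Harper quotes can be turned into a few rough ratios. The short Python sketch below is only a back-of-the-envelope illustration, not part of the PlantsP software. The variable and function names are made up for this example, and it assumes "half again as many" means one and a half times the kinase count.

```python
# Figures quoted in the article (estimates made before the genome was complete).
GENOME_BP = 130_000_000      # genome length in base pairs
EST_GENES = 25_000           # estimated total number of genes
EST_KINASE_GENES = 1_000     # estimated protein kinase genes

# Assumption: "half again as many" is read as 1.5 times the kinase count.
EST_PHOSPHATASE_GENES = int(EST_KINASE_GENES * 1.5)


def summarize():
    """Return simple ratios derived from the quoted estimates."""
    phospho_genes = EST_KINASE_GENES + EST_PHOSPHATASE_GENES
    return {
        # Average stretch of genome per gene, in base pairs.
        "bp_per_gene": GENOME_BP / EST_GENES,
        # Share of all genes that would code for protein kinases.
        "kinase_fraction": EST_KINASE_GENES / EST_GENES,
        # Phosphatase gene count under the stated assumption.
        "phosphatase_genes": EST_PHOSPHATASE_GENES,
        # Share of all genes coding for kinases or phosphatases.
        "phospho_fraction": phospho_genes / EST_GENES,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:g}")
```

On these assumptions, roughly one gene in ten would be involved in adding or removing phosphate groups, which is consistent with Gribskov's argument for a genome-wide approach.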

Gribskov has just received another plant genome grant from the NSF Directorate for Biological Sciences to apply bioinformatics to the entire NSF effort in plant genomics, creating a central clearinghouse for sharable information and a common site for dissemination of information about all 40 or so NSF Plant Genome projects.

Beyond hosting that common site for the NSF Plant Genome Program, Gribskov will assemble, document, and distribute reusable software tools for genomics projects, and will provide a focal point for developing common data models, ontologies, and other standards for interoperable electronic data resources.

"Our experience with PlantsP has taught us that there is an urgent need for coordination among projects, and the relationships among them have been rather ad hoc until now," Gribskov said. "This project will test the ability of a Web network to aid in multiproject program coordination." --MM *
