Date/Time/Place: September 29, 2004, SDSC Auditorium
3-4 pm Seminar
4-5 pm Technical Discussion

Speaker: Samuel Kerrien
EMBL – European Bioinformatics Institute

Title: UniProt 2.0 – Technical Overview & Discussion

Abstract: The annotation of genes and gene products is deeply rooted in a flat-file, free-text tradition. Relational or object-oriented models of the data and their implementations are rare and usually have access restrictions for end users. Users of the UniProt data set, which encompasses the former Swiss-Prot, TrEMBL and PIR content, also have to deal with these legacy problems. The flat-file format that has been used, extended and maintained over many years is readable for human eyes but difficult to process algorithmically. With the exponential growth of protein information, in which the amount of data doubles approximately every two years, downloading the whole data sets and parsing out what is actually needed on the end user's side has become cumbersome and CPU-intensive. With UniProt, we are trying to overcome this situation and give users query access to our data sources. We are trying to make parsing procedures on the user side redundant by providing a solid interface to the data set as a whole and also to the individual protein entries.

This presentation will give an overview of the libraries that are used at the EBI to present the UniProt data and to automatically process protein entries. Those libraries will be made available to users shortly, and additional services are being planned. The speaker will go into some detail on how to make use of the read-only copy of the UniProt database from within Java program code.

 

   
   
 