Volume 2 Chapter 10 Search Menu PDB-SEQUENCE


*PDBSEQ

Function

*PDBSEQ is used to test for residue sequences in Protein Data Bank (PDB) entries.

Graphics QUEST3D Procedure

For the graphical option, the command-line interface to PDB sequence searching has been replaced by the PDB-SEARCH menu. The following sections remain relevant for users who do not use the graphical interface.

CSD Contents

An agreement has been made with the producers of the Protein Data Bank whereby the Cambridge Structural Database will contain bibliographic/sequence PDB entries. This arrangement will allow users of the CSD to search for information on macromolecules and, using the PDB ID code, retrieve relevant numeric data from the PDB.

This scheme will be initiated in the October 1993 release of the CSD with the inclusion of 1007 entries corresponding to the October 1992 release of the PDB.

The content of the entries is illustrated by the two example entries (0INS04 and 0APD02) reproduced later in this section.

(i)
The reference code of each entry takes the form 0ABC0n where nABC is the PDB ID code.

(ii)
All entries have a category flag value of 4 and associated bit screen 122 is set.

(iii)
The compound name field (#COMPND) contains the name of the macromolecule and its source.

(iv)
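The qualifier phrase field (#QUAL) contains a maximum of 7 information types. Those appearing in the example entries in this section are: the PDB ID code, deposition date, class, data contributors, experimental method, superseded entry, and resolution.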

(v)
The author field (#AUTHOR) contains the name(s) of the author(s) of the principal publication associated with the structure of the macromolecule.

(vi)
The journal reference field (#JRNL) contains the journal coden, volume, page and year of the principal publication associated with the structure of the macromolecule.

(vii)
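The residue sequence field (#SEQRES) contains, for each chain, the chain identifier (where present), the number of residues, and the residue sequence, as the example entries in this section show.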
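The reference-code convention in item (i) can be sketched in Python as follows. This is only an illustration of the 0ABC0n / nABC rule stated above; the function names are hypothetical and are not part of QUEST.

```python
def pdb_id_to_refcode(pdb_id: str) -> str:
    """Map a 4-character PDB ID code (nABC) to a CSD reference code (0ABC0n)."""
    pdb_id = pdb_id.strip().upper()
    if len(pdb_id) != 4:
        raise ValueError("a PDB ID code has 4 characters")
    # Leading 0, then the last three characters, then 0 and the first character.
    return "0" + pdb_id[1:] + "0" + pdb_id[0]


def refcode_to_pdb_id(refcode: str) -> str:
    """Recover the PDB ID code (nABC) from a reference code of the form 0ABC0n."""
    refcode = refcode.strip().upper()
    if len(refcode) != 6 or refcode[0] != "0" or refcode[4] != "0":
        raise ValueError("expected a reference code of the form 0ABC0n")
    return refcode[5] + refcode[1:4]


if __name__ == "__main__":
    # ID codes taken from the example entries in this section.
    for pdb_id in ("4INS", "2APD", "1CRN"):
        print(pdb_id, pdb_id_to_refcode(pdb_id))
```

The reverse mapping is the step needed to retrieve numeric data from the PDB once a hit has been found in the CSD.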

Basic QUEST Procedure

For machine-specific implementations, the PDB is distributed as a separate computer file. Users of the machine-independent package will find these entries combined at the start of the CSD database of small-molecule entries.

If you search a database that includes both macromolecules and small molecules, you can confine the search to macromolecules using *CATFlag equal to 4 or bit screen 122. *PDBSEQ searches automatically apply screen 141 (SEQRES field present).

To search for text in the compound name and qualifier phrase fields you should use the *COMPound and *QUALifier tests.

For literature citation searches you should use the *AUTHor, *SURName, *CODEn, *VOLUme, *PAGE and *YEAR tests.

Searching of residue sequence information is specified by supplying a packet of instructions, analogous to those used in *PEPTIDE searches.

The search test packet must start with:
Tn  *PDBSEQ

and terminate with:
END

Between these two records a number of keywords can be used:

(a) PSEQ

This record defines a search sequence in terms of standard and non-standard residue names.

Common amino-acids are represented by their 3-letter code symbols (in upper-case or lower-case or a mixture of both).

The current set of symbols is listed in Appendix 16. Since PDB entries include oligonucleotides the symbols can also consist of 1-character and 2-character codes.

If an entry contains only one chain then a hit is registered when the first occurrence of the search sequence is encountered (to find all occurrences of the search sequence in the chain the keyword EXHAustive must be used - see later).

If an entry contains more than one chain QUEST will search all chains for the first occurrence of the search sequence in each chain (to find all occurrences of the search sequence in all chains the keyword EXHAustive must be used - see later).

It is important to note that a packet of instructions can contain more than one PSEQ record.
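The first-occurrence and exhaustive behaviour described above can be sketched in Python. This is a minimal illustration, not QUEST's implementation: the function names are hypothetical, the leading and trailing - symbols are simply ignored, and reporting overlapping occurrences in exhaustive mode is an assumption. The test chain is chain A of entry 0ZTA02 as displayed in this section.

```python
def parse_pseq(pseq):
    """Split a PSEQ string such as '-GLU-GLU-LEU-' into residue names."""
    return [name.strip().upper() for name in pseq.split("-") if name.strip()]


def find_starts(chain, pattern, exhaustive=False):
    """Return 1-based start positions of `pattern` in `chain`.

    Without `exhaustive` only the first occurrence is reported.
    """
    starts = []
    n = len(pattern)
    for i in range(len(chain) - n + 1):
        if chain[i:i + n] == pattern:
            starts.append(i + 1)
            if not exhaustive:
                break
    return starts


def search_entry(chains, pseq, exhaustive=False):
    """Search every chain of an entry; return {chain_id: [start, ...]} for hits."""
    pattern = parse_pseq(pseq)
    hits = {}
    for chain_id, residues in chains.items():
        starts = find_starts(residues, pattern, exhaustive)
        if starts:
            hits[chain_id] = starts
    return hits


# The 68-residue display of 0ZTA02 chain A repeats this 34-residue segment twice.
HALF = ("ACE-ARG-MET-LYS-GLN-LEU-GLU-ASP-LYS-VAL-GLU-GLU-LEU-LEU-SER-LYS-ASN-"
        "TYR-HIS-LEU-GLU-ASN-GLU-VAL-ALA-ARG-LEU-LYS-LYS-LEU-VAL-GLY-GLU-ARG").split("-")
ENTRY_0ZTA02 = {"A": HALF + HALF}

if __name__ == "__main__":
    print(search_entry(ENTRY_0ZTA02, "-GLU-GLU-LEU-"))
    print(search_entry(ENTRY_0ZTA02, "-GLU-GLU-LEU-", exhaustive=True))
```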

The two example entries referred to under CSD Contents are reproduced first; an example of a simple sequence search (Ex.1) follows them.

?0INS04
 
#SYSCAT cat 4
 
#COMPND Insulin;
Source: Pig (Sus scrofa)
 
#QUAL  ID-4INS;
Deposition date 800710;
Class:Hormone;
Data contributed by G.G.Dodson,E.J.Dodson,D.C.Hodgkin,N.W.Isaacs,
M.Vijayan;
Supersedes 900415 the existing entry 1INS;
Resolution 1.5A
 
#AUTHOR E.N.Baker,T.L.Blundell,J.F.Cutfield,S.M.Cutfield,E.J.Dodson,
G.G.Dodson,D.M.Crowfoot Hodgkin,R.E.Hubbard,N.W.Isaacs,C.D.Reynolds,
K.Sakabe,N.Sakabe,N.M.Vijayan
 
#JRNL  441,319,369,1988
 
#SEQRES Chain A 21 residues GLY-ILE-VAL-GLU-GLN-CYS-CYS-THR-SER-ILE-CYS-SER-LEU-
TYR-GLN-LEU-GLU-ASN-TYR-CYS-ASN Chain B 30 residues PHE-VAL-ASN-GLN-HIS-LEU-CYS-
GLY-SER-HIS-LEU-VAL-GLU-ALA-LEU-TYR-LEU-VAL-CYS-GLY-GLU-ARG-GLY-PHE-PHE-TYR-THR-
PRO-LYS-ALA Chain C 21 residues GLY-ILE-VAL-GLU-GLN-CYS-CYS-THR-SER-ILE-CYS-SER-
LEU-TYR-GLN-LEU-GLU-ASN-TYR-CYS-ASN Chain D 30 residues PHE-VAL-ASN-GLN-HIS-LEU-
CYS-GLY-SER-HIS-LEU-VAL-GLU-ALA-LEU-TYR-LEU-VAL-CYS-GLY-GLU-ARG-GLY-PHE-PHE-TYR-
THR-PRO-LYS-ALA
 
.................................................................................
 
?0APD02
 
#SYSCAT cat 4
 
#COMPND Apolipoprotein D (model);
Source: Human (Homo sapiens)
 
#QUAL  ID-2APD;
Deposition date 920421;
Class:Lipocalin;
Data contributed by M.C.Peitsch,M.S.Boguski
Experimental:theoretical model;
Supersedes 921015 the existing entry 1APD
 
#AUTHOR M.C.Peitsch,M.S.Boguski
 
#JRNL  822,2,197,1990
 
#SEQRES Chain 169 residues GLN-ALA-PHE-HIS-LEU-GLY-LYS-CYS-PRO-ASN-PRO-PRO-VAL-
GLN-GLU-ASN-PHE-ASP-VAL-ASN-LYS-TYR-LEU-GLY-ARG-TRP-TYR-GLU-ILE-GLU-LYS-ILE-PRO-
THR-THR-PHE-GLU-ASN-GLY-ARG-CYS-ILE-GLN-ALA-ASN-TYR-SER-LEU-MET-GLU-ASN-GLY-LYS-
ILE-LYS-VAL-LEU-ASN-GLN-GLU-LEU-ARG-ALA-ASP-GLY-THR-VAL-ASN-GLN-ILE-GLU-GLY-GLU-
ALA-THR-PRO-VAL-ASN-LEU-THR-GLU-PRO-ALA-LYS-LEU-GLU-VAL-LYS-PHE-SER-TRP-PHE-MET-
PRO-SER-ALA-PRO-TYR-TRP-ILE-LEU-ALA-THR-ASP-TYR-GLU-ASN-TYR-ALA-LEU-VAL-TYR-SER-
CYS-THR-CYS-ILE-ILE-GLN-LEU-PHE-HIS-VAL-ASP-PHE-ALA-TRP-ILE-LEU-ALA-ARG-ASN-PRO-
ASN-LEU-PRO-PRO-GLU-THR-VAL-ASP-SER-LEU-LYS-ASN-ILE-LEU-THR-SER-ASN-ASN-ILE-ASP-
VAL-LYS-LYS-MET-THR-VAL-THR-ASP-GLN-VAL-ASN-CYS-PRO-LYS-LEU-SER
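The #SEQRES layout seen in the two example entries above (the word Chain, an optional chain identifier, a residue count, the word residues, then hyphen-separated residue names that may wrap across lines) can be read with a short Python sketch. The regular expression and the function name are assumptions based only on these two listings, not a description of the CSD file format.

```python
import re

# "Chain", optional identifier, residue count, "residues".
HEADER = re.compile(r"Chain\s+(?:(\S+)\s+)?(\d+)\s+residues\s+")


def parse_seqres(text):
    """Return a list of (chain_id, declared_count, residues) tuples."""
    headers = list(HEADER.finditer(text))
    chains = []
    for i, match in enumerate(headers):
        end = headers[i + 1].start() if i + 1 < len(headers) else len(text)
        # Remove spaces and line breaks so wrapped sequences join up again.
        body = "".join(text[match.end():end].split())
        residues = [name for name in body.split("-") if name]
        chains.append((match.group(1) or "", int(match.group(2)), residues))
    return chains


# Chains A and B of entry 0INS04, copied from the listing above.
INSULIN = """Chain A 21 residues GLY-ILE-VAL-GLU-GLN-CYS-CYS-THR-SER-ILE-CYS-SER-LEU-
TYR-GLN-LEU-GLU-ASN-TYR-CYS-ASN Chain B 30 residues PHE-VAL-ASN-GLN-HIS-LEU-CYS-
GLY-SER-HIS-LEU-VAL-GLU-ALA-LEU-TYR-LEU-VAL-CYS-GLY-GLU-ARG-GLY-PHE-PHE-TYR-THR-
PRO-LYS-ALA"""

if __name__ == "__main__":
    for chain_id, declared, residues in parse_seqres(INSULIN):
        print(chain_id, declared, len(residues))
```

Comparing the declared count with the number of parsed residues gives a quick consistency check on a transcribed entry.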

Ex.1

T1  *PDBSEQ
PSEQ  -GLU-GLU-LEU-
END
QUES  T1

The following hit is registered :

---------+---------+---------+---------+---------+---------+---------+---------+
0ZTA02
GCN4 Leucine zipper; Source: Synthetic polypeptide corresponding to the leucine
      zipper of the yeast (Saccharomyces cerevisiae) transcriptional activator G
      CN4
ID-2ZTA; Deposition date 910705; Class:Leucine Zipper; Data contributed by E.K.O
      'Shea,J.D.Klemm,P.S.Kim,T.Alber; Resolution 1.8A
E.K.O'Shea,J.D.Klemm,P.S.Kim,T.Alber
Science, 254, 539,1991
Chain A,   68 residues, Test 1, Start   11 -VAL-GLU-GLU-LEU-LEU-
---------+---------+---------+---------+---------+---------+---------+---------+
Type "K"(Keep), "R"(Reject) or "O"(for list of options)
 

Notes

*REFC=0ZTA02
*COMP=GCN4 Leucine zipper; Source: Synthetic polypeptide corresponding to the le
ucine zipper of the yeast (Saccharomyces cerevisiae) transcriptional activator G
CN4 // *QUAL=ID-2ZTA; Deposition date 910705; Class:Leucine Zipper; Data contrib
uted by E.K.O'Shea,J.D.Klemm,P.S.Kim,T.Alber; Resolution 1.8A // *AUTH=E.K.O'She
a,J.D.Klemm,P.S.Kim,T.Alber // *CODE=38(Science) // *VOLU= 254 // *PAGE= 539 //
*YEAR=1991 // *PREF=0ZTA02 // *ADAT=930909 // *MDAT=930909 // *MSDB=0 // *CASN=0
 // *NBSI=0 // *CDRE=0 // *BATC=0 // *BCLA=99 // *TOLE=.40 // *COOR=0 // *SPGN=0
 // *SPAC= // *RFAC=0 // *TEMP=295 // *MAXA=0 // *ZVAL=0 // *DENM=0 // *DENX=0 /
/ *DENC=0 // *CELA=0 // *CELB=0 // *CELC=0 // *ALPH=0 // *BETA=0 // *GAMM=0 // *
MA27=0 // *MA28=0 // *RCP1=0 // *RCP2=0 // *RCP3=0 // *RCP4=0 // *RCP5=0 // *RCP
6=0 // *SIGF=0 // *MATF=0 // *INTF=0 // *CATF=4 // *METR=0 // *BRV2=0 // *BRV1=0
 // *RCVO=0 // *MA43=811226177 // *MA44=808591392 // *SCOR=0 // *MA46=0 // *MA47
=0 // *MA48=0 // *ZPRI=0 // *NRES=0 // *BITS=0 // *NW01=364 // *NW02=27 // *NW03
=0 // *JRNL=Science // *MVOL=0 // *CVOL=0 //
     Chain A, 68 residues,  ACE-ARG-MET-LYS-GLN-LEU-GLU-ASP-LYS-VAL-GLU-GLU-
  13 LEU-LEU-SER-LYS-ASN-TYR-HIS-LEU-GLU-ASN-GLU-VAL-ALA-ARG-LEU-LYS-LYS-LEU-
  31 VAL-GLY-GLU-ARG-ACE-ARG-MET-LYS-GLN-LEU-GLU-ASP-LYS-VAL-GLU-GLU-LEU-LEU-
  49 SER-LYS-ASN-TYR-HIS-LEU-GLU-ASN-GLU-VAL-ALA-ARG-LEU-LYS-LYS-LEU-VAL-GLY-
  67 GLU-ARG
---------+---------+---------+---------+---------+---------+---------+---------+

Notes

Sequence Sections

If an instruction line contains no recognised keyword then QUEST attempts to treat the line as a defined sequence section, provided that the content takes the form:

aaaaaa = xxxxxxxxxxxxxxxxxxxxxxxx

When such a sequence section is used in a PSEQ instruction, the identifier must be preceded by $.

Ex.2 Search packet A

T8  *PDBSEQ
PSEQ  -ARG-THR-GLY-
PSEQ  -TRP-ASP-ALA-
END

A search using packet A would register hits for entries in which :

the 2 search sequences are in different chains

the 2 search sequences are in the same chain separated by one or more residues

If we wish to find entries where these 2 sequences are separated by THR then we can make use of the sequence section feature as follows:

Search packet B

T5  *PDBSEQ
SEQC = ARG-THR-GLY
SEQD = TRP-ASP-ALA
PSEQ  -$SEQC-THR-$SEQD-
END

A search using packet B would register the following hit :

---------+---------+---------+---------+---------+---------+---------+---------+
0L1701
Lysozyme (E.C.3.2.1.17) (mutant with Ile 3 replaced by Val) (I3V); Source: Bacte
      riophage T4 (mutant gene is derived from the M13 plasmid by cloning of the
       T4 lysozyme gene)
ID-1L17; Deposition date 890501; Class:Hydrolase (O-glycosyl); Data contributed
      by M.Matsumura,S.Dao-pin,B.W.Matthews; Resolution 1.7A
M.Matsumura,W.J.Becktel,B.W.Matthews
Nature (London), 334, 406,1988
 
Chain 1,  164 residues, Test 5, Start  154
-PHE-ARG-THR-GLY-THR-TRP-ASP-ALA-TYR-
---------+---------+---------+---------+---------+---------+---------+---------+

Note that in specifying a sequence section there is no need to indicate leading and trailing - symbols.

These are specified when the sequence sections are used in PSEQ instructions.
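The substitution of sequence sections into PSEQ instructions can be sketched in Python as below. The sketch assumes that a section is any line without a recognised keyword that contains an = sign, and that each $identifier is replaced textually by the stored sequence; the keyword list and the function name are illustrative only.

```python
# Keywords described in this section (first four characters are compared).
KEYWORDS = {"PSEQ", "PDEF", "EXHA", "SAME", "NCHA", "NRES", "FULL", "END"}


def expand_packet(lines):
    """Return the PSEQ sequences of a packet with $identifiers expanded."""
    sections = {}
    expanded = []
    for line in lines:
        text = line.strip()
        if not text:
            continue
        first = text.split()[0].upper()
        if first[:4] == "PSEQ":
            tokens = text[len(first):].strip().split("-")
            parts = []
            for token in tokens:
                if token.startswith("$"):
                    # Look up a previously defined sequence section.
                    parts.append(sections[token[1:].upper()])
                else:
                    parts.append(token)
            expanded.append("-".join(parts))
        elif first[:4] not in KEYWORDS and "=" in text:
            name, _, value = text.partition("=")
            sections[name.strip().upper()] = value.strip()
    return expanded


PACKET_B = [
    "T5  *PDBSEQ",
    "SEQC = ARG-THR-GLY",
    "SEQD = TRP-ASP-ALA",
    "PSEQ  -$SEQC-THR-$SEQD-",
    "END",
]

if __name__ == "__main__":
    print(expand_packet(PACKET_B))
```

Because the leading and trailing - symbols belong to the PSEQ line rather than to the section, they pass through the expansion unchanged.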

(b) PDEF

PDEF instructions are used to define new residues in terms of standard residue names.

In combining residues in PDEF instructions a blank is equivalent to the + symbol: compare PDEF ABC = ASN + ASP in Ex.3 with PDEF ABC = ILE THR in Ex.4.

The effect of PDEF instructions is cumulative.
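A PDEF definition and its use in a match can be sketched in Python. Several points here are assumptions rather than documented behaviour: "cumulative" is read as meaning that a later PDEF may refer to names defined earlier, a PSEQ without a trailing - is treated as ending at the last residue of the chain (as stated for Ex.3), and all function names are hypothetical. The chain is that of entry 0CRN01 as displayed in this section.

```python
def parse_pdef(line, defs=None):
    """Parse 'PDEF NAME = RES1 + RES2' (or blank-separated) into {NAME: set}."""
    defs = dict(defs or {})
    _, _, body = line.strip().partition(" ")       # drop the PDEF keyword
    name, _, rhs = body.partition("=")
    members = set()
    for token in rhs.replace("+", " ").split():    # a blank is equivalent to +
        token = token.upper()
        members |= defs.get(token, {token})        # expand earlier definitions
    defs[name.strip().upper()] = members
    return defs


def find_first(chain, pseq, defs):
    """Return the 1-based start of the first match of `pseq`, or None."""
    terminal = not pseq.strip().endswith("-")      # no trailing '-': chain end
    pattern = [t.strip().upper() for t in pseq.split("-") if t.strip()]
    n = len(pattern)
    for i in range(len(chain) - n + 1):
        if terminal and i + n != len(chain):
            continue
        window = chain[i:i + n]
        if all(res in defs.get(p, {p}) for res, p in zip(window, pattern)):
            return i + 1
    return None


CRAMBIN = ("THR-THR-CYS-CYS-PRO-SER-ILE-VAL-ALA-ARG-SER-ASN-PHE-ASN-VAL-CYS-"
           "ARG-LEU-PRO-GLY-THR-PRO-GLU-ALA-ILE-CYS-ALA-THR-TYR-THR-GLY-CYS-"
           "ILE-ILE-ILE-PRO-GLY-ALA-THR-CYS-PRO-GLY-ASP-TYR-ALA-ASN").split("-")

if __name__ == "__main__":
    defs = parse_pdef("PDEF  ABC = ASN  + ASP")
    print(find_first(CRAMBIN, "-TYR-ALA-ABC", defs))
```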

Ex.3

T3  *PDBSEQ
PDEF  ABC = ASN  + ASP
PSEQ  -TYR-ALA-ABC
END

In this search ABC is terminal and can be ASN or ASP.

The following hit is registered :

---------+---------+---------+---------+---------+---------+---------+---------+
0CRN01
Crambin; Source: Abyssinian cabbage (Crambe abyssinica) seed
ID-1CRN; Deposition date 810430; Class:Plant Seed Protein; Data contributed by W
      .A.Hendrickson,M.M.Teeter; Resolution 1.5A
M.M.Teeter
Proc.Nat.Acad.Sci.U.S.A., 81, 6014,1984
Chain 1,   46 residues, Test 3, Start   44
-ASP-TYR-ALA-(ASN)
---------+---------+---------+---------+---------+---------+---------+---------+
Type "K"(Keep), "R"(Reject) or "O"(for list of options) m
*REFC=0CRN01
*COMP=Crambin; Source: Abyssinian cabbage (Crambe abyssinica) seed // *QUAL=ID-1
CRN; Deposition date 810430; Class:Plant Seed Protein; Data contributed by W.A.H
endrickson,M.M.Teeter; Resolution 1.5A // *AUTH=M.M.Teeter // *CODE=40(Proc.Nat.
Acad.Sci.U.S.A.) // *VOLU= 81 // *PAGE= 6014 // *YEAR=1984 // *PREF=0CRN01 // *A
DAT=930909 // *MDAT=930909 // *MSDB=0 // *CASN=0 // *NBSI=0 // *CDRE=0 // *BATC=
0 // *BCLA=99 // *TOLE=.40 // *COOR=0 // *SPGN=0 // *SPAC= // *RFAC=0 // *TEMP=2
95 // *MAXA=0 // *ZVAL=0 // *DENM=0 // *DENX=0 // *DENC=0 // *CELA=0 // *CELB=0
// *CELC=0 // *ALPH=0 // *BETA=0 // *GAMM=0 // *MA27=0 // *MA28=0 // *RCP1=0 //
*RCP2=0 // *RCP3=0 // *RCP4=0 // *RCP5=0 // *RCP6=0 // *SIGF=0 // *MATF=0 // *IN
TF=0 // *CATF=4 // *METR=0 // *BRV2=0 // *BRV1=0 // *RCVO=0 // *MA43=809718350
// *MA44=808525856 // *SCOR=0 // *MA46=0 // *MA47=0 // *MA48=0 // *ZPRI=0 // *NRE
S=0 // *BITS=0 // *NW01=236 // *NW02=20 // *NW03=0 // *JRNL=Proc.Nat.Acad.Sci.U.
S.A. // *MVOL=0 // *CVOL=0 //
Chain 1, 46 residues,  THR-THR-CYS-CYS-PRO-SER-ILE-VAL-ALA-ARG-SER-ASN-PHE-
14 ASN-VAL-CYS-ARG-LEU-PRO-GLY-THR-PRO-GLU-ALA-ILE-CYS-ALA-THR-TYR-THR-GLY-
32 CYS-ILE-ILE-ILE-PRO-GLY-ALA-THR-CYS-PRO-GLY-ASP-TYR-ALA-(ASN)
---------+---------+---------+---------+---------+---------+---------+---------+

Note that the residue ASN (part of the PDEF definition), shown above in brackets, appears in bold in the actual display to distinguish it from the inverse video used for exact matches.

(c) EXHA

If the instruction EXHAustive is included in the search packet then QUEST will search to the end of each entry to locate all occurrences of the search sequence(s) in the entry.

Ex.4

T4  *PDBSEQ
PDEF  ABC = ILE THR
PSEQ  -ALA-ABC-CYS-
EXHA
END

The following hit is registered :

---------+---------+---------+---------+---------+---------+---------+---------+
0CRN01
Crambin; Source: Abyssinian cabbage (Crambe abyssinica) seed
ID-1CRN; Deposition date 810430; Class:Plant Seed Protein; Data contributed by W
      .A.Hendrickson,M.M.Teeter; Resolution 1.5A
M.M.Teeter
Proc.Nat.Acad.Sci.U.S.A., 81, 6014,1984
Chain 1,   46 residues, Test 4, Start   24 -GLU-ALA-ILE-CYS-ALA-
Chain 1,   46 residues, Test 4, Start   38 -GLY-ALA-THR-CYS-PRO-
---------+---------+---------+---------+---------+---------+---------+---------+
Type "K"(Keep), "R"(Reject) or "O"(for list of options)      

Ex.5

T5  *PDBSEQ
PDEF  ABC = ILE THR
PSEQ  -ALA-ABC-CYS-
END

This search differs from Ex.4 in that the EXHA instruction is omitted. The hit display is now :

---------+---------+---------+---------+---------+---------+---------+---------+
0CRN01
Crambin; Source: Abyssinian cabbage (Crambe abyssinica) seed
ID-1CRN; Deposition date 810430; Class:Plant Seed Protein; Data contributed by W
      .A.Hendrickson,M.M.Teeter; Resolution 1.5A
M.M.Teeter
Proc.Nat.Acad.Sci.U.S.A., 81, 6014,1984
Chain 1,   46 residues, Test 5, Start   24 -GLU-ALA-ILE-CYS-ALA-
---------+---------+---------+---------+---------+---------+---------+---------+
Type "K"(Keep), "R"(Reject) or "O"(for list of options)        

(d) SAME

If a search packet contains 2 or more search sequences then, by default, a hit is registered if the search sequences are located, either in the same chain or in different chains.

If the instruction SAME is included in the search packet then a hit is registered only if the search sequences are located in the same chain.
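The difference between the default rule and the SAME rule can be sketched in Python. The function names and the short chains are hypothetical and serve only to illustrate the two hit conditions described above.

```python
def contains(chain, pattern):
    """True if the residue list `pattern` occurs contiguously in `chain`."""
    n = len(pattern)
    return any(chain[i:i + n] == pattern for i in range(len(chain) - n + 1))


def is_hit(chains, patterns, same=False):
    """Apply the default or the SAME rule to one entry.

    chains   : {chain_id: [residue, ...]}
    patterns : list of residue lists, one per PSEQ record
    """
    if same:
        # SAME: a single chain must contain every search sequence.
        return any(all(contains(c, p) for p in patterns) for c in chains.values())
    # Default: every search sequence must occur in some chain of the entry.
    return all(any(contains(c, p) for c in chains.values()) for p in patterns)


PATTERNS = [["VAL", "VAL", "SER"], ["LEU", "ASN", "SER"]]

# Invented fragments: both sequences in one chain, or split over two chains.
ONE_CHAIN = {"E": "SER-LEU-ASN-SER-GLY-TRP-VAL-VAL-SER-ALA".split("-"),
             "I": "GLY-ALA".split("-")}
TWO_CHAINS = {"A": "TRP-VAL-VAL-SER-ALA".split("-"),
              "B": "SER-LEU-ASN-SER-GLY".split("-")}

if __name__ == "__main__":
    print(is_hit(ONE_CHAIN, PATTERNS, same=True))
    print(is_hit(TWO_CHAINS, PATTERNS), is_hit(TWO_CHAINS, PATTERNS, same=True))
```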

Ex.6

T6  *PDBSEQ
SAME
PSEQ  -VAL-VAL-SER-
PSEQ  -LEU-ASN-SER-
END

The following hit is registered :

---------+---------+---------+---------+---------+---------+---------+---------+
0TAB01
Trypsin (E.C.3.4.21.4) complex with Bowman-Birk inhibitor (AB-I); Source: Bovine
       (Bos taurus) pancreas and Adzuki beans (Phaseolus angularis)
ID-1TAB; Deposition date 901015; Class:Hydrolase (serine proteinase); Data contr
      ibuted by Y.Tsunogae,I.Tanaka,T.Yamane,J.-I.Kikkawa,T.Ashida, C.Ishikawa,K
      .Watanabe,S.Nakamura,K.Takahashi; Resolution 2.3A
Y.Tsunogae,I.Tanaka,T.Yamane,J.-I.Kikkawa,T.Ashida,C.Ishikawa, K.Watanabe,S.Naka
      mura,K.Takahashi
J.Biochem., 100, 1637,1986
Chain E,  223 residues, Test 6, Start   18 -SER-LEU-ASN-SER-GLY-
Chain E,  223 residues, Test 6, Start   35 -TRP-VAL-VAL-SER-ALA-
Chain I,   82 residues, No Hits
---------+---------+---------+---------+---------+---------+---------+---------+
Type "K"(Keep), "R"(Reject) or "O"(for list of options) 

(e) NCHA

The NCHAin command allows the user to restrict the search to entries containing a specified number of chains.

The command takes a relational test or a range of values, as in the following examples:

NCHA .EQ. 1 restricts the search to single-chain entries.

NCHA 1 - 2 restricts the search to single- and double-chain entries.

Ex.7

T7  *PDBSEQ
NCHA  .EQ.  1
PSEQ  -ARG-TYR-SER-
END

The following hit is registered :

---------+---------+---------+---------+---------+---------+---------+---------+
0AIT02
Tendamistat; Source: (Streptomyces tendae)
ID-2AIT; Deposition date 890524; Class:alpha-Amylase Inhibitor; Experimental:NMR
      ; Data contributed by A.D.Kline,W.Braun,P.Guntert,M.Billeter, K.Wuthrich
A.D.Kline,W.Braun,K.Wuthrich
J.Mol.Biol., 204, 675,1988
Chain 1,   74 residues, Test 7, Start   19 -TRP-ARG-TYR-SER-GLN-
---------+---------+---------+---------+---------+---------+---------+---------+
Type "K"(Keep), "R"(Reject) or "O"(for list of options) M
*REFC=0AIT02
*COMP=Tendamistat; Source: (Streptomyces tendae) // *QUAL=ID-2AIT; Deposition da
te 890524; Class:alpha-Amylase Inhibitor; Experimental:NMR; Data contributed by
A.D.Kline,W.Braun,P.Guntert,M.Billeter, K.Wuthrich // *AUTH=A.D.Kline,W.Braun,K.
Wuthrich // *CODE=70(J.Mol.Biol.) // *VOLU= 204 // *PAGE= 675 // *YEAR=1988 // *
PREF=0AIT02 // *ADAT=930909 // *MDAT=930909 // *MSDB=0 // *CASN=0 // *NBSI=0 //
*CDRE=0 // *BATC=0 // *BCLA=99 // *TOLE=.40 // *COOR=0 // *SPGN=0 // *SPAC= // *
RFAC=0 // *TEMP=295 // *MAXA=0 // *ZVAL=0 // *DENM=0 // *DENX=0 // *DENC=0 // *C
ELA=0 // *CELB=0 // *CELC=0 // *ALPH=0 // *BETA=0 // *GAMM=0 // *MA27=0 // *MA28
=0 // *RCP1=0 // *RCP2=0 // *RCP3=0 // *RCP4=0 // *RCP5=0 // *RCP6=0 // *SIGF=0
// *MATF=0 // *INTF=0 // *CATF=4 // *METR=0 // *BRV2=0 // *BRV1=0 // *RCVO=0 //
*MA43=809584980 // *MA44=808591392 // *SCOR=0 // *MA46=0 // *MA47=0 // *MA48=0 /
/ *ZPRI=0 // *NRES=0 // *BITS=0 // *NW01=264 // *NW02=29 // *NW03=0 // *JRNL=J.M
ol.Biol. // *MVOL=0 // *CVOL=0 //
     Chain 1, 74 residues,  ASP-THR-THR-VAL-SER-GLU-PRO-ALA-PRO-SER-CYS-VAL-THR-
  14 LEU-TYR-GLN-SER-TRP-ARG-TYR-SER-GLN-ALA-ASP-ASN-GLY-CYS-ALA-GLU-THR-VAL-
  32 THR-VAL-LYS-VAL-VAL-TYR-GLU-ASP-ASP-THR-GLU-GLY-LEU-CYS-TYR-ALA-VAL-ALA-
  50 PRO-GLY-GLN-ILE-THR-THR-VAL-GLY-ASP-GLY-TYR-ILE-GLY-SER-HIS-GLY-HIS-ALA-
  68 ARG-TYR-LEU-ALA-ARG-CYS-LEU
---------+---------+---------+---------+---------+---------+---------+---------+
Type "K"(Keep), "R"(Reject) or "O"(for list of options)

(f) NRES

The NRESidues command allows the user to restrict the search to entries containing a specified number of residues in a chain.

The command takes a relational test or a range of values, as in the following examples:

NRES .LT. 100 restricts the search to chains containing fewer than 100 residues.

NRES 300-500 restricts the search to chains containing 300-500 residues.
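The two kinds of condition used with NCHA and NRES (a relational test and a range) can be sketched in Python as a small predicate builder. Only .EQ., .LT. and .LE. appear in this section; the other operators in the table, the inclusive treatment of ranges, and the function name are assumptions.

```python
import operator
import re

# Fortran-style relational operators; .GT., .GE. and .NE. are assumed.
OPS = {
    ".EQ.": operator.eq, ".LT.": operator.lt, ".LE.": operator.le,
    ".GT.": operator.gt, ".GE.": operator.ge, ".NE.": operator.ne,
}


def parse_condition(text):
    """Turn '.LT. 100' or '300-500' into a predicate on an integer count."""
    text = text.strip().upper()
    match = re.fullmatch(r"(\.[A-Z]{2}\.)\s*(\d+)", text)
    if match and match.group(1) in OPS:
        op, value = OPS[match.group(1)], int(match.group(2))
        return lambda n: op(n, value)
    match = re.fullmatch(r"(\d+)\s*-\s*(\d+)", text)
    if match:
        low, high = int(match.group(1)), int(match.group(2))
        return lambda n: low <= n <= high          # range taken as inclusive
    raise ValueError("unrecognised condition: " + text)


if __name__ == "__main__":
    single_chain = parse_condition(".EQ. 1")       # as in NCHA .EQ. 1
    short_chain = parse_condition(".LE. 20")       # as in NRES .LE. 20
    print(single_chain(1), short_chain(6))
```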

Ex.8

T8  *PDBSEQ
NRES .LE. 20
PSEQ  -AGL-GAL-AGL-
END

The following hit is registered :

---------+---------+---------+---------+---------+---------+---------+---------+
0AGA01
Agarose (an alternating copolymer of 3-linked beta-D-galactopyranose and 4-linke
      d 3,6-anhydro-alpha-L-galactopyranose); Source: Red seaweed (Rhodophycae)
      from several sources were studied
ID-1AGA; Deposition date 780523; Class:Texture of Connective Tissue; Data contri
      buted by S.Arnott; Resolution 3.0A
S.Arnott,A.Fulmer,W.E.Scott,I.C.M.Dea,R.Moorhouse,D.A.Rees
J.Mol.Biol., 90, 269,1974
Chain A,    6 residues, Test 8, Start    2  GAL-AGL-GAL-AGL-GAL-
Chain B,    6 residues, Test 8, Start    2  GAL-AGL-GAL-AGL-GAL-
---------+---------+---------+---------+---------+---------+---------+---------+
Type "K"(Keep), "R"(Reject) or "O"(for list of options) m
*REFC=0AGA01
*COMP=Agarose (an alternating copolymer of 3-linked beta-D-galactopyranose and 4
-linked 3,6-anhydro-alpha-L-galactopyranose); Source: Red seaweed (Rhodophycae)
from several sources were studied // *QUAL=ID-1AGA; Deposition date 780523; Clas
s:Texture of Connective Tissue; Data contributed by S.Arnott; Resolution 3.0A //
 *AUTH=S.Arnott,A.Fulmer,W.E.Scott,I.C.M.Dea,R.Moorhouse,D.A.Rees // *CODE=70(J.
Mol.Biol.) // *VOLU= 90 // *PAGE= 269 // *YEAR=1974 // *PREF=0AGA01 // *ADAT=930
909 // *MDAT=930909 // *MSDB=0 // *CASN=0 // *NBSI=0 // *CDRE=0 // *BATC=0 // *B
CLA=99 // *TOLE=.40 // *COOR=0 // *SPGN=0 // *SPAC= // *RFAC=0 // *TEMP=295 // *
MAXA=0 // *ZVAL=0 // *DENM=0 // *DENX=0 // *DENC=0 // *CELA=0 // *CELB=0 // *CEL
C=0 // *ALPH=0 // *BETA=0 // *GAMM=0 // *MA27=0 // *MA28=0 // *RCP1=0 // *RCP2=0
 // *RCP3=0 // *RCP4=0 // *RCP5=0 // *RCP6=0 // *SIGF=0 // *MATF=0 // *INTF=0 //
 *CATF=4 // *METR=0 // *BRV2=0 // *BRV1=0 // *RCVO=0 // *MA43=809584449 // *MA44
=808525856 // *SCOR=0 // *MA46=0 // *MA47=0 // *MA48=0 // *ZPRI=0 // *NRES=0 //
*BITS=0 // *NW01=404 // *NW02=11 // *NW03=0 // *JRNL=J.Mol.Biol. // *MVOL=0 // *
CVOL=0 //
     Chain A, 6 residues,   GAL-AGL-GAL-AGL-GAL-AGL
     Chain B, 6 residues,   GAL-AGL-GAL-AGL-GAL-AGL
---------+---------+---------+---------+---------+---------+---------+---------+
Type "K"(Keep), "R"(Reject) or "O"(for list of options)

(g) FULL

The FULL command forces a full display of sequence information for each hit entry.

The command takes one of two forms:

FULL CHAIN displays the complete sequences of the chains in which the search sequences are located. This is the default if FULL is given with no qualifier.

FULL ALL displays, for each hit entry, the complete sequence information for the entry, i.e. all chains.

Ex.9

T9  *PDBSEQ
FULL CHAIN
EXHA
PSEQ  -SER-SER-GLU-
END

The following hit is registered :

---------+---------+---------+---------+---------+---------+---------+---------+
0HDD01
Engrailed homeodomain complex with DNA; Source: Fruit fly (Drosophila melanogast
      er) expressed in (Escherichia coli)
ID-1HDD; Deposition date 910916; Class:DNA Binding; Data contributed by C.R.Kiss
      inger,B.Liu,C.O.Pabo,E.Martin-Blanco,T.B.Kornberg; Resolution 2.8A
C.R.Kissinger,B.Liu,E.Martin-Blanco,T.B.Kornberg,C.O.Pabo
Cell (Cambridge,Mass.), 63, 579,1990
     Chain C, 61 residues,  MET-ASP-GLU-LYS-ARG-PRO-ARG-THR-ALA-PHE-SER-SER-GLU-
  14 GLN-LEU-ALA-ARG-LEU-LYS-ARG-GLU-PHE-ASN-GLU-ASN-ARG-TYR-LEU-THR-GLU-ARG-
  32 ARG-ARG-GLN-GLN-LEU-SER-SER-GLU-LEU-GLY-LEU-ASN-GLU-ALA-GLN-ILE-LYS-ILE-
  50 TRP-PHE-GLN-ASN-LYS-ARG-ALA-LYS-ILE-LYS-LYS-SER
     Chain D, 61 residues,  MET-ASP-GLU-LYS-ARG-PRO-ARG-THR-ALA-PHE-SER-SER-GLU-
  14 GLN-LEU-ALA-ARG-LEU-LYS-ARG-GLU-PHE-ASN-GLU-ASN-ARG-TYR-LEU-THR-GLU-ARG-
  32 ARG-ARG-GLN-GLN-LEU-SER-SER-GLU-LEU-GLY-LEU-ASN-GLU-ALA-GLN-ILE-LYS-ILE-
  50 TRP-PHE-GLN-ASN-LYS-ARG-ALA-LYS-ILE-LYS-LYS-SER
     Chain A, 21 residues,  No Hits
     Chain B, 21 residues,  No Hits
---------+---------+---------+---------+---------+---------+---------+---------+
Type "K"(Keep), "R"(Reject) or "O"(for list of options)    

Note that M will give the short display in this case.

We will now repeat the search in Ex.9 but use the FULL ALL instruction rather than FULL CHAIN.

Ex.10

T10  *PDBSEQ
FULL ALL
EXHA
PSEQ  -SER-SER-GLU-
END

The following hit is registered :

---------+---------+---------+---------+---------+---------+---------+---------+
0HDD01
Engrailed homeodomain complex with DNA; Source: Fruit fly (Drosophila melanogast
      er) expressed in (Escherichia coli)
ID-1HDD; Deposition date 910916; Class:DNA Binding; Data contributed by C.R.Kiss
      inger,B.Liu,C.O.Pabo,E.Martin-Blanco,T.B.Kornberg; Resolution 2.8A
C.R.Kissinger,B.Liu,E.Martin-Blanco,T.B.Kornberg,C.O.Pabo
Cell (Cambridge,Mass.), 63, 579,1990
     Chain C, 61 residues,  MET-ASP-GLU-LYS-ARG-PRO-ARG-THR-ALA-PHE-SER-SER-GLU-
  14 GLN-LEU-ALA-ARG-LEU-LYS-ARG-GLU-PHE-ASN-GLU-ASN-ARG-TYR-LEU-THR-GLU-ARG-
  32 ARG-ARG-GLN-GLN-LEU-SER-SER-GLU-LEU-GLY-LEU-ASN-GLU-ALA-GLN-ILE-LYS-ILE-
  50 TRP-PHE-GLN-ASN-LYS-ARG-ALA-LYS-ILE-LYS-LYS-SER
     Chain D, 61 residues,  MET-ASP-GLU-LYS-ARG-PRO-ARG-THR-ALA-PHE-SER-SER-GLU-
  14 GLN-LEU-ALA-ARG-LEU-LYS-ARG-GLU-PHE-ASN-GLU-ASN-ARG-TYR-LEU-THR-GLU-ARG-
  32 ARG-ARG-GLN-GLN-LEU-SER-SER-GLU-LEU-GLY-LEU-ASN-GLU-ALA-GLN-ILE-LYS-ILE-
  50 TRP-PHE-GLN-ASN-LYS-ARG-ALA-LYS-ILE-LYS-LYS-SER
     Chain A, 21 residues,  T-T-T-T-G-C-C-A-T-G-T-A-A-T-T-A-C-C-T-A-A
     Chain B, 21 residues,  A-T-T-A-G-G-T-A-A-T-T-A-C-A-T-G-G-C-A-A-A
---------+---------+---------+---------+---------+---------+---------+---------+
Type "K"(Keep), "R"(Reject) or "O"(for list of options)

Related Bit Screens

The relevant bit screen is 122.

If you search a database which includes both macromolecules and small molecules then you can confine the search to macromolecules using bit screen 122.

Bit 141 indicates that the SEQRES field is present.
