Volume 2 Chapter 10 Search Menu PDB-SEQUENCE
*PDBSEQ is used to test for residue sequences in Protein Data Bank (PDB) entries.
Graphics QUEST3D Procedure
The command line interface to the PDB sequence searching has been replaced with the PDB-SEARCH menu for the Graphical option. The following section is still relevant for users who do not use the graphical interface.
CSD Contents
An agreement has been made with the producers of the Protein Data Bank whereby the Cambridge Structural Database will contain bibliographic/sequence PDB entries. This arrangement will allow users of the CSD to search for information on macromolecules and, using the PDB ID code, retrieve relevant numeric data from the PDB.
This scheme will be initiated in the October 1993 release of the CSD with the inclusion of 1007 entries corresponding to the October 1992 release of the PDB.
The content of the entries is illustrated by the two examples on the next page.
For machine-specific implementations, the PDB is distributed as a separate computer file. Users of the machine-independent package will find these entries combined at the start of the CSD database of small-molecule entries.
If you search a database which includes both macromolecules and small molecules then you can confine the search to macromolecules using *CATFlag equal to 4 or bit screen 122. *PDBSEQ searches automatically search on screen 141 (SEQRES field present).
To search for text in the compound name and qualifier phrase fields you should use the *COMPound and *QUALifier tests.
For literature citation searches you should use the *AUTHor, *SURName, *CODEn, *VOLUme, *PAGE and *YEAR tests.
Searching of residue sequence information is specified by supplying a packet of instructions, analogous to those used in *PEPTIDE searches.
Tn *PDBSEQ
END
(a) PSEQ
This record defines a search sequence in terms of standard and non-standard residue names.
PSEQ -VAL-ARG-GLY-
PSEQ VAL-ARG-GLY-
PSEQ -VAL-ARG-GLY
PSEQ %-VAL-ARG-GLY-%
The current set of symbols is listed in Appendix 16. Since PDB entries include oligonucleotides the symbols can also consist of 1-character and 2-character codes.
PSEQ -VAL-ANY-GLY-
If an entry contains more than one chain QUEST will search all chains for the first occurrence of the search sequence in each chain (to find all occurrences of the search sequence in all chains the keyword EXHAustive must be used - see later).
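The default per-chain behaviour can be sketched in Python. This is only an illustration and not QUEST's own code: the function name `first_hits` and the dictionary layout are assumptions, ANY is taken to match any single residue, and the leading and trailing - symbols of the PSEQ record are simply ignored. The data are chains A and B of the insulin example entry (4INS).

```python
def first_hits(chains, pseq):
    """Return {chain_id: 1-based start of the first match, or None}."""
    # Drop the leading/trailing '-' and split into residue symbols.
    pattern = [sym for sym in pseq.strip().split("-") if sym]
    hits = {}
    for chain_id, residues in chains.items():
        hits[chain_id] = None
        for i in range(len(residues) - len(pattern) + 1):
            window = residues[i:i + len(pattern)]
            if all(p == "ANY" or p == r for p, r in zip(pattern, window)):
                hits[chain_id] = i + 1  # report 1-based residue numbers
                break                   # first occurrence in this chain only
    return hits


insulin = {
    "A": ("GLY-ILE-VAL-GLU-GLN-CYS-CYS-THR-SER-ILE-CYS-SER-LEU-"
          "TYR-GLN-LEU-GLU-ASN-TYR-CYS-ASN").split("-"),
    "B": ("PHE-VAL-ASN-GLN-HIS-LEU-CYS-GLY-SER-HIS-LEU-VAL-GLU-ALA-LEU-"
          "TYR-LEU-VAL-CYS-GLY-GLU-ARG-GLY-PHE-PHE-TYR-THR-PRO-LYS-ALA").split("-"),
}

print(first_hits(insulin, "-VAL-ANY-GLY-"))  # {'A': None, 'B': 18}
```

Chain A has no VAL-x-GLY triplet, while chain B matches at VAL 18 (VAL-CYS-GLY).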
It is important to note that a packet of instructions can contain more than one PSEQ record.
PSEQ -VAL-ARG-GLY-
PSEQ -SER-SER-GLU-
?0INS04
#SYSCAT cat 4
#COMPND Insulin; Source: Pig (Sus scrofa)
#QUAL ID-4INS; Deposition date 800710; Class:Hormone; Data contributed by G.G.Dodson,E.J.Dodson,D.C.Hodgkin,N.W.Isaacs, M.Vijayan; Supersedes 900415 the existing entry 1INS; Resolution 1.5A
#AUTHOR E.N.Baker,T.L.Blundell,J.F.Cutfield,S.M.Cutfield,E.J.Dodson, G.G.Dodson,D.M.Crowfoot Hodgkin,R.E.Hubbard,N.W.Isaacs,C.D.Reynolds, K.Sakabe,N.Sakabe,N.M.Vijayan
#JRNL 441,319,369,1988
#SEQRES
Chain A 21 residues GLY-ILE-VAL-GLU-GLN-CYS-CYS-THR-SER-ILE-CYS-SER-LEU-TYR-GLN-LEU-GLU-ASN-TYR-CYS-ASN
Chain B 30 residues PHE-VAL-ASN-GLN-HIS-LEU-CYS-GLY-SER-HIS-LEU-VAL-GLU-ALA-LEU-TYR-LEU-VAL-CYS-GLY-GLU-ARG-GLY-PHE-PHE-TYR-THR-PRO-LYS-ALA
Chain C 21 residues GLY-ILE-VAL-GLU-GLN-CYS-CYS-THR-SER-ILE-CYS-SER-LEU-TYR-GLN-LEU-GLU-ASN-TYR-CYS-ASN
Chain D 30 residues PHE-VAL-ASN-GLN-HIS-LEU-CYS-GLY-SER-HIS-LEU-VAL-GLU-ALA-LEU-TYR-LEU-VAL-CYS-GLY-GLU-ARG-GLY-PHE-PHE-TYR-THR-PRO-LYS-ALA

?0APD02
#SYSCAT cat 4
#COMPND Apolipoprotein D (model); Source: Human (Homo sapiens)
#QUAL ID-2APD; Deposition date 920421; Class:Lipocalin; Data contributed by M.C.Peitsch,M.S.Boguski Experimental:theoretical model; Supersedes 921015 the existing entry 1APD
#AUTHOR M.C.Peitsch,M.S.Boguski
#JRNL 822,2,197,1990
#SEQRES
Chain 169 residues GLN-ALA-PHE-HIS-LEU-GLY-LYS-CYS-PRO-ASN-PRO-PRO-VAL-
GLN-GLU-ASN-PHE-ASP-VAL-ASN-LYS-TYR-LEU-GLY-ARG-TRP-TYR-GLU-ILE-GLU-LYS-ILE-PRO-
THR-THR-PHE-GLU-ASN-GLY-ARG-CYS-ILE-GLN-ALA-ASN-TYR-SER-LEU-MET-GLU-ASN-GLY-LYS-
ILE-LYS-VAL-LEU-ASN-GLN-GLU-LEU-ARG-ALA-ASP-GLY-THR-VAL-ASN-GLN-ILE-GLU-GLY-GLU-
ALA-THR-PRO-VAL-ASN-LEU-THR-GLU-PRO-ALA-LYS-LEU-GLU-VAL-LYS-PHE-SER-TRP-PHE-MET-
PRO-SER-ALA-PRO-TYR-TRP-ILE-LEU-ALA-THR-ASP-TYR-GLU-ASN-TYR-ALA-LEU-VAL-TYR-SER-
CYS-THR-CYS-ILE-ILE-GLN-LEU-PHE-HIS-VAL-ASP-PHE-ALA-TRP-ILE-LEU-ALA-ARG-ASN-PRO-
ASN-LEU-PRO-PRO-GLU-THR-VAL-ASP-SER-LEU-LYS-ASN-ILE-LEU-THR-SER-ASN-ASN-ILE-ASP-
VAL-LYS-LYS-MET-THR-VAL-THR-ASP-GLN-VAL-ASN-CYS-PRO-LYS-LEU-SER
Ex.1
T1 *PDBSEQ
PSEQ -GLU-GLU-LEU-
END
QUES T1
The following hit is registered :
---------+---------+---------+---------+---------+---------+---------+---------+
0ZTA02
GCN4 Leucine zipper; Source: Synthetic polypeptide corresponding to the leucine
zipper of the yeast (Saccharomyces cerevisiae) transcriptional activator G
CN4
ID-2ZTA; Deposition date 910705; Class:Leucine Zipper; Data contributed by E.K.O
'Shea,J.D.Klemm,P.S.Kim,T.Alber; Resolution 1.8A
E.K.O'Shea,J.D.Klemm,P.S.Kim,T.Alber
Science, 254, 539,1991
Chain A, 68 residues, Test 1, Start 11 -VAL-GLU-GLU-LEU-LEU-
---------+---------+---------+---------+---------+---------+---------+---------+
Type "K"(Keep), "R"(Reject) or "O"(for list of options)
Notes
Instead, the residues of the search sequence are written in upper-case and all other residues in lower-case. Those residues which correspond to ANY or PDEF residues in the query are written in mixed case, e.g. AlA.
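The case convention described in this note can be illustrated with a short Python sketch. The function name `render` and its arguments are hypothetical, and the alternating-case rule used for mixed case is an assumption chosen so that ALA becomes AlA; how 1-character and 2-character symbols are handled by QUEST is not stated in the text.

```python
def render(residues, start, pattern):
    """Render a chain, using case to mark a match beginning at 1-based `start`.

    `pattern` lists the query symbols. A symbol that differs from the matched
    residue name (ANY or a PDEF name) is shown in mixed case.
    """
    out = []
    for pos, name in enumerate(residues, start=1):
        offset = pos - start
        if 0 <= offset < len(pattern):
            if pattern[offset] == name:
                out.append(name.upper())      # exact match: upper case
            else:
                # ANY/PDEF match: alternate upper and lower case (ALA -> AlA)
                out.append("".join(c.upper() if i % 2 == 0 else c.lower()
                                   for i, c in enumerate(name)))
        else:
            out.append(name.lower())          # outside the match: lower case
    return "-".join(out)


print(render(["GLU", "VAL", "ALA", "GLY", "SER"], 2, ["VAL", "ANY", "GLY"]))
# glu-VAL-AlA-GLY-ser
```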
Sequence Sections
If an instruction line contains no recognised keyword then QUEST attempts to treat the line as a defined sequence section, provided that the content takes the form:
aaaaaa = xxxxxxxxxxxxxxxxxxxxxxxx
Ex.2
Search packet A
T8 *PDBSEQ
PSEQ -ARG-THR-GLY-
PSEQ -TRP-ASP-ALA-
END
A search using packet A would register hits for entries in which :
the 2 search sequences are in different chains
the 2 search sequences are in the same chain separated by one or more residues
If we wish to find entries where these 2 sequences are separated by THR then we can make use of the sequence section feature as follows:
Search packet B
T5 *PDBSEQ
SEQC = ARG-THR-GLY
SEQD = TRP-ASP-ALA
PSEQ -$SEQC-THR-$SEQD-
END
A search using packet B would register the following hit :
---------+---------+---------+---------+---------+---------+---------+---------+
0L1701
Lysozyme (E.C.3.2.1.17) (mutant with Ile 3 replaced by Val) (I3V); Source: Bacte
riophage T4 (mutant gene is derived from the M13 plasmid by cloning of the
T4 lysozyme gene)
ID-1L17; Deposition date 890501; Class:Hydrolase (O-glycosyl); Data contributed
by M.Matsumura,S.Dao-pin,B.W.Matthews; Resolution 1.7A
M.Matsumura,W.J.Becktel,B.W.Matthews
Nature (London), 334, 406,1988
Chain 1, 164 residues, Test 5, Start 154
-PHE-ARG-THR-GLY-THR-TRP-ASP-ALA-TYR-
---------+---------+---------+---------+---------+---------+---------+---------+
Note that in specifying a sequence section there is no need to indicate leading and trailing - symbols.
These are specified when the sequence sections are used in PSEQ instructions.
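The way sequence sections are substituted into PSEQ records can be sketched as follows. This is a minimal Python illustration under stated assumptions, not a description of QUEST internals: the function name `parse_packet` is hypothetical, a section name is assumed to consist of letters and digits, and the keyword list is limited to the instructions named in this chapter.

```python
import re


def parse_packet(lines):
    """Collect sequence sections and return PSEQ records with $NAME expanded."""
    keywords = ("PSEQ", "PDEF", "EXHA", "SAME", "NCHA", "NRES", "FULL", "END")
    sections, pseqs = {}, []
    for line in lines:
        line = line.strip()
        if line.startswith("PSEQ"):
            body = line[4:].strip()
            # Replace each $NAME with the text of a previously defined section.
            pseqs.append(re.sub(r"\$(\w+)",
                                lambda m: sections[m.group(1)], body))
        elif "=" in line and not line.startswith(keywords):
            # No recognised keyword: treat "name = residues" as a section.
            name, value = (part.strip() for part in line.split("=", 1))
            sections[name] = value
    return sections, pseqs


packet_b = [
    "T5 *PDBSEQ",
    "SEQC = ARG-THR-GLY",
    "SEQD = TRP-ASP-ALA",
    "PSEQ -$SEQC-THR-$SEQD-",
    "END",
]
sections, pseqs = parse_packet(packet_b)
print(pseqs)  # ['-ARG-THR-GLY-THR-TRP-ASP-ALA-']
```

The expanded record corresponds to the seven-residue run ARG-THR-GLY-THR-TRP-ASP-ALA found in the 1L17 hit above.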
(b) PDEF
PDEF instructions are used to define new residues in terms of standard residue names.
It also illustrates that the effect of PDEF instructions is cumulative.
Ex.3
T3 *PDBSEQ
PDEF ABC = ASN + ASP
PSEQ -TYR-ALA-ABC
END
In this search ABC is terminal and can be ASN or ASP.
The following hit is registered :
---------+---------+---------+---------+---------+---------+---------+---------+
0CRN01
Crambin; Source: Abyssinian cabbage (Crambe abyssinica) seed
ID-1CRN; Deposition date 810430; Class:Plant Seed Protein; Data contributed by W
.A.Hendrickson,M.M.Teeter; Resolution 1.5A
M.M.Teeter
Proc.Nat.Acad.Sci.U.S.A., 81, 6014,1984
Chain 1, 46 residues, Test 3, Start 44
-ASP-TYR-ALA-(ASN)
---------+---------+---------+---------+---------+---------+---------+---------+
Type "K"(Keep), "R"(Reject) or "O"(for list of options) m
*REFC=0CRN01
*COMP=Crambin; Source: Abyssinian cabbage (Crambe abyssinica) seed // *QUAL=ID-1
CRN; Deposition date 810430; Class:Plant Seed Protein; Data contributed by W.A.H
endrickson,M.M.Teeter; Resolution 1.5A // *AUTH=M.M.Teeter // *CODE=40(Proc.Nat.
Acad.Sci.U.S.A.) // *VOLU= 81 // *PAGE= 6014 // *YEAR=1984 // *PREF=0CRN01 // *A
DAT=930909 // *MDAT=930909 // *MSDB=0 // *CASN=0 // *NBSI=0 // *CDRE=0 // *BATC=
0 // *BCLA=99 // *TOLE=.40 // *COOR=0 // *SPGN=0 // *SPAC= // *RFAC=0 // *TEMP=2
95 // *MAXA=0 // *ZVAL=0 // *DENM=0 // *DENX=0 // *DENC=0 // *CELA=0 // *CELB=0
// *CELC=0 // *ALPH=0 // *BETA=0 // *GAMM=0 // *MA27=0 // *MA28=0 // *RCP1=0 //
*RCP2=0 // *RCP3=0 // *RCP4=0 // *RCP5=0 // *RCP6=0 // *SIGF=0 // *MATF=0 // *IN
TF=0 // *CATF=4 // *METR=0 // *BRV2=0 // *BRV1=0 // *RCVO=0 // *MA43=809718350
// *MA44=808525856 // *SCOR=0 // *MA46=0 // *MA47=0 // *MA48=0 // *ZPRI=0 // *NRE
S=0 // *BITS=0 // *NW01=236 // *NW02=20 // *NW03=0 // *JRNL=Proc.Nat.Acad.Sci.U.
S.A. // *MVOL=0 // *CVOL=0 //
Chain 1, 46 residues, THR-THR-CYS-CYS-PRO-SER-ILE-VAL-ALA-ARG-SER-ASN-PHE-
14 ASN-VAL-CYS-ARG-LEU-PRO-GLY-THR-PRO-GLU-ALA-ILE-CYS-ALA-THR-TYR-THR-GLY-
32 CYS-ILE-ILE-ILE-PRO-GLY-ALA-THR-CYS-PRO-GLY-ASP-TYR-ALA-(ASN)
---------+---------+---------+---------+---------+---------+---------+---------+
Note that the residue ASN (part of the PDEF definition), shown above in brackets, is shown in bold in the actual display to distinguish it from the inverse video used for exact matches.
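The search in Ex.3 can be reproduced with a small Python sketch. It is only an illustration: the function name `find_terminal` and the dictionary `pdefs` are hypothetical, and the remark that ABC is terminal is interpreted here as requiring the match to end at the last residue of the chain. The sequence is that of crambin (1CRN) as listed in the display above.

```python
CRAMBIN = (
    "THR-THR-CYS-CYS-PRO-SER-ILE-VAL-ALA-ARG-SER-ASN-PHE-"
    "ASN-VAL-CYS-ARG-LEU-PRO-GLY-THR-PRO-GLU-ALA-ILE-CYS-ALA-THR-TYR-THR-GLY-"
    "CYS-ILE-ILE-ILE-PRO-GLY-ALA-THR-CYS-PRO-GLY-ASP-TYR-ALA-ASN"
).split("-")


def find_terminal(residues, pattern, pdefs):
    """Return the 1-based start if `pattern` matches the end of the chain."""
    if len(pattern) > len(residues):
        return None
    start = len(residues) - len(pattern)
    for symbol, residue in zip(pattern, residues[start:]):
        # A PDEF name stands for a set of residues; any other symbol
        # stands only for itself.
        allowed = pdefs.get(symbol, {symbol})
        if residue not in allowed:
            return None
    return start + 1


pdefs = {"ABC": {"ASN", "ASP"}}   # PDEF ABC = ASN + ASP
print(find_terminal(CRAMBIN, ["TYR", "ALA", "ABC"], pdefs))  # 44
```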
(c) EXHA
If the instruction EXHAustive is included in the search packet then QUEST will search to the end of each entry to locate all occurrences of the search sequence(s) in the entry.
Ex.4
T4 *PDBSEQ
PDEF ABC = ILE THR
PSEQ -ALA-ABC-CYS-
EXHA
END
The following hit is registered :
---------+---------+---------+---------+---------+---------+---------+---------+
0CRN01
Crambin; Source: Abyssinian cabbage (Crambe abyssinica) seed
ID-1CRN; Deposition date 810430; Class:Plant Seed Protein; Data contributed by W
.A.Hendrickson,M.M.Teeter; Resolution 1.5A
M.M.Teeter
Proc.Nat.Acad.Sci.U.S.A., 81, 6014,1984
Chain 1, 46 residues, Test 4, Start 24 -GLU-ALA-ILE-CYS-ALA-
Chain 1, 46 residues, Test 4, Start 38 -GLY-ALA-THR-CYS-PRO-
---------+---------+---------+---------+---------+---------+---------+---------+
Type "K"(Keep), "R"(Reject) or "O"(for list of options)
Ex.5
T5 *PDBSEQ
PDEF ABC = ILE THR
PSEQ -ALA-ABC-CYS-
END
This search differs from Ex.4 in that the EXHA instruction is omitted. The hit display is now :
---------+---------+---------+---------+---------+---------+---------+---------+
0CRN01
Crambin; Source: Abyssinian cabbage (Crambe abyssinica) seed
ID-1CRN; Deposition date 810430; Class:Plant Seed Protein; Data contributed by W
.A.Hendrickson,M.M.Teeter; Resolution 1.5A
M.M.Teeter
Proc.Nat.Acad.Sci.U.S.A., 81, 6014,1984
Chain 1, 46 residues, Test 5, Start 24 -GLU-ALA-ILE-CYS-ALA-
---------+---------+---------+---------+---------+---------+---------+---------+
Type "K"(Keep), "R"(Reject) or "O"(for list of options)
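The difference between Ex.4 and Ex.5 can be sketched in Python. This is an illustration under stated assumptions rather than QUEST's algorithm: `find_starts` and its `exhaustive` flag are hypothetical names, and the leading and trailing - symbols of the PSEQ record are ignored. The sequence is crambin (1CRN).

```python
CRAMBIN = (
    "THR-THR-CYS-CYS-PRO-SER-ILE-VAL-ALA-ARG-SER-ASN-PHE-"
    "ASN-VAL-CYS-ARG-LEU-PRO-GLY-THR-PRO-GLU-ALA-ILE-CYS-ALA-THR-TYR-THR-GLY-"
    "CYS-ILE-ILE-ILE-PRO-GLY-ALA-THR-CYS-PRO-GLY-ASP-TYR-ALA-ASN"
).split("-")


def find_starts(residues, pattern, pdefs, exhaustive=False):
    """Return 1-based starts of matches: only the first unless exhaustive."""
    starts = []
    for i in range(len(residues) - len(pattern) + 1):
        window = residues[i:i + len(pattern)]
        # A PDEF name matches any residue in its set; other symbols match
        # only themselves.
        if all(r in pdefs.get(p, {p}) for p, r in zip(pattern, window)):
            starts.append(i + 1)
            if not exhaustive:
                break
    return starts


pdefs = {"ABC": {"ILE", "THR"}}
pattern = ["ALA", "ABC", "CYS"]
print(find_starts(CRAMBIN, pattern, pdefs, exhaustive=True))  # [24, 38]
print(find_starts(CRAMBIN, pattern, pdefs))                   # [24]
```

The two start positions agree with the Start 24 and Start 38 lines of the Ex.4 display, and the single position with the Ex.5 display.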
(d) SAME
If a search packet contains 2 or more search sequences then, by default, a hit is registered if the search sequences are located either in the same chain or in different chains.
If the instruction SAME is included in the search packet then a hit is registered only if the search sequences are located in the same chain.
Ex.6
T6 *PDBSEQ
SAME
PSEQ -VAL-VAL-SER-
PSEQ -LEU-ASN-SER-
END
The following hit is registered :
---------+---------+---------+---------+---------+---------+---------+---------+
0TAB01
Trypsin (E.C.3.4.21.4) complex with Bowman-Birk inhibitor (AB-I); Source: Bovine
(Bos taurus) pancreas and Adzuki beans (Phaseolus angularis)
ID-1TAB; Deposition date 901015; Class:Hydrolase (serine proteinase); Data contr
ibuted by Y.Tsunogae,I.Tanaka,T.Yamane,J.-I.Kikkawa,T.Ashida, C.Ishikawa,K
.Watanabe,S.Nakamura,K.Takahashi; Resolution 2.3A
Y.Tsunogae,I.Tanaka,T.Yamane,J.-I.Kikkawa,T.Ashida,C.Ishikawa, K.Watanabe,S.Naka
mura,K.Takahashi
J.Biochem., 100, 1637,1986
Chain E, 223 residues, Test 6, Start 18 -SER-LEU-ASN-SER-GLY-
Chain E, 223 residues, Test 6, Start 35 -TRP-VAL-VAL-SER-ALA-
Chain I, 82 residues, No Hits
---------+---------+---------+---------+---------+---------+---------+---------+
Type "K"(Keep), "R"(Reject) or "O"(for list of options)
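The effect of SAME on hit registration can be expressed as a short Python sketch. The names `contains` and `is_hit` are hypothetical, and the two small entries are invented test data rather than database records; they exist only to show the default and SAME behaviour side by side.

```python
def contains(residues, pattern):
    """True if `pattern` occurs as a contiguous run in `residues`."""
    n = len(pattern)
    return any(residues[i:i + n] == pattern
               for i in range(len(residues) - n + 1))


def is_hit(chains, patterns, same=False):
    """Decide whether an entry is a hit for several search sequences."""
    if same:
        # SAME: one chain must contain every search sequence.
        return any(all(contains(res, p) for p in patterns)
                   for res in chains.values())
    # Default: each search sequence may be found in any chain.
    return all(any(contains(res, p) for res in chains.values())
               for p in patterns)


patterns = [["VAL", "VAL", "SER"], ["LEU", "ASN", "SER"]]

# Invented data: both sequences in chain E, none in chain I.
together = {"E": ["SER", "LEU", "ASN", "SER", "GLY", "TRP", "VAL", "VAL", "SER"],
            "I": ["CYS", "ASP", "GLU"]}
# Invented data: one sequence in each chain.
apart = {"X": ["TRP", "VAL", "VAL", "SER"],
         "Y": ["SER", "LEU", "ASN", "SER"]}

print(is_hit(together, patterns, same=True))  # True
print(is_hit(apart, patterns))                # True
print(is_hit(apart, patterns, same=True))     # False
```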
(e) NCHA
The NCHAin command allows the user to restrict the search to entries containing a specified number of chains.
The command takes the form:
NCHA .EQ. 1 restricts the search to single-chain entries
NCHA 1 - 2 restricts the search to single- and double-chain entries.
Ex.7
T7 *PDBSEQ
NCHA .EQ. 1
PSEQ -ARG-TYR-SER-
END
The following hit is registered :
---------+---------+---------+---------+---------+---------+---------+---------+
0AIT02
Tendamistat; Source: (Streptomyces tendae)
ID-2AIT; Deposition date 890524; Class:alpha-Amylase Inhibitor; Experimental:NMR
; Data contributed by A.D.Kline,W.Braun,P.Guntert,M.Billeter, K.Wuthrich
A.D.Kline,W.Braun,K.Wuthrich
J.Mol.Biol., 204, 675,1988
Chain 1, 74 residues, Test 7, Start 19 -TRP-ARG-TYR-SER-GLN-
---------+---------+---------+---------+---------+---------+---------+---------+
Type "K"(Keep), "R"(Reject) or "O"(for list of options) M
*REFC=0AIT02
*COMP=Tendamistat; Source: (Streptomyces tendae) // *QUAL=ID-2AIT; Deposition da
te 890524; Class:alpha-Amylase Inhibitor; Experimental:NMR; Data contributed by
A.D.Kline,W.Braun,P.Guntert,M.Billeter, K.Wuthrich // *AUTH=A.D.Kline,W.Braun,K.
Wuthrich // *CODE=70(J.Mol.Biol.) // *VOLU= 204 // *PAGE= 675 // *YEAR=1988 // *
PREF=0AIT02 // *ADAT=930909 // *MDAT=930909 // *MSDB=0 // *CASN=0 // *NBSI=0 //
*CDRE=0 // *BATC=0 // *BCLA=99 // *TOLE=.40 // *COOR=0 // *SPGN=0 // *SPAC= // *
RFAC=0 // *TEMP=295 // *MAXA=0 // *ZVAL=0 // *DENM=0 // *DENX=0 // *DENC=0 // *C
ELA=0 // *CELB=0 // *CELC=0 // *ALPH=0 // *BETA=0 // *GAMM=0 // *MA27=0 // *MA28
=0 // *RCP1=0 // *RCP2=0 // *RCP3=0 // *RCP4=0 // *RCP5=0 // *RCP6=0 // *SIGF=0
// *MATF=0 // *INTF=0 // *CATF=4 // *METR=0 // *BRV2=0 // *BRV1=0 // *RCVO=0 //
*MA43=809584980 // *MA44=808591392 // *SCOR=0 // *MA46=0 // *MA47=0 // *MA48=0 /
/ *ZPRI=0 // *NRES=0 // *BITS=0 // *NW01=264 // *NW02=29 // *NW03=0 // *JRNL=J.M
ol.Biol. // *MVOL=0 // *CVOL=0 //
Chain 1, 74 residues, ASP-THR-THR-VAL-SER-GLU-PRO-ALA-PRO-SER-CYS-VAL-THR-
14 LEU-TYR-GLN-SER-TRP-ARG-TYR-SER-GLN-ALA-ASP-ASN-GLY-CYS-ALA-GLU-THR-VAL-
32 THR-VAL-LYS-VAL-VAL-TYR-GLU-ASP-ASP-THR-GLU-GLY-LEU-CYS-TYR-ALA-VAL-ALA-
50 PRO-GLY-GLN-ILE-THR-THR-VAL-GLY-ASP-GLY-TYR-ILE-GLY-SER-HIS-GLY-HIS-ALA-
68 ARG-TYR-LEU-ALA-ARG-CYS-LEU
---------+---------+---------+---------+---------+---------+---------+---------+
Type "K"(Keep), "R"(Reject) or "O"(for list of options)
(f) NRES
The NRESidues command allows the user to restrict the search to entries containing a specified number of residues in a chain.
The command takes the form:
NRES .LT. 100 restricts the search to chains containing fewer than 100 residues
NRES 300-500 restricts the search to chains containing 300-500 residues.
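The numeric conditions accepted by NCHA and NRES can be modelled with a small Python sketch. It is illustrative only: `parse_condition` is a hypothetical name, only the operators that appear in this chapter (.EQ., .LT., .LE.) are handled, and ranges are assumed to be inclusive at both ends, as the NCHA 1 - 2 example suggests.

```python
import re


def parse_condition(text):
    """Turn '.LT. 100' or '300-500' into a predicate on an integer count."""
    text = text.strip()
    m = re.fullmatch(r"\.(EQ|LT|LE)\.\s*(\d+)", text)
    if m:
        op, n = m.group(1), int(m.group(2))
        return {"EQ": lambda x: x == n,
                "LT": lambda x: x < n,
                "LE": lambda x: x <= n}[op]
    m = re.fullmatch(r"(\d+)\s*-\s*(\d+)", text)
    if m:
        low, high = int(m.group(1)), int(m.group(2))
        return lambda x: low <= x <= high   # inclusive range (assumption)
    raise ValueError(f"unrecognised condition: {text!r}")


short_chain = parse_condition(".LE. 20")
print(short_chain(6))                    # True  (the 6-residue chains of 1AGA)
print(parse_condition("300-500")(164))   # False
print(parse_condition("1 - 2")(2))       # True
```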
Ex.8
T8 *PDBSEQ
NRES .LE. 20
PSEQ -AGL-GAL-AGL-
END
The following hit is registered :
---------+---------+---------+---------+---------+---------+---------+---------+
0AGA01
Agarose (an alternating copolymer of 3-linked beta-D-galactopyranose and 4-linke
d 3,6-anhydro-alpha-L-galactopyranose); Source: Red seaweed (Rhodophycae)
from several sources were studied
ID-1AGA; Deposition date 780523; Class:Texture of Connective Tissue; Data contri
buted by S.Arnott; Resolution 3.0A
S.Arnott,A.Fulmer,W.E.Scott,I.C.M.Dea,R.Moorhouse,D.A.Rees
J.Mol.Biol., 90, 269,1974
Chain A, 6 residues, Test 8, Start 2 GAL-AGL-GAL-AGL-GAL-
Chain B, 6 residues, Test 8, Start 2 GAL-AGL-GAL-AGL-GAL-
---------+---------+---------+---------+---------+---------+---------+---------+
Type "K"(Keep), "R"(Reject) or "O"(for list of options) m
*REFC=0AGA01
*COMP=Agarose (an alternating copolymer of 3-linked beta-D-galactopyranose and 4
-linked 3,6-anhydro-alpha-L-galactopyranose); Source: Red seaweed (Rhodophycae)
from several sources were studied // *QUAL=ID-1AGA; Deposition date 780523; Clas
s:Texture of Connective Tissue; Data contributed by S.Arnott; Resolution 3.0A //
*AUTH=S.Arnott,A.Fulmer,W.E.Scott,I.C.M.Dea,R.Moorhouse,D.A.Rees // *CODE=70(J.
Mol.Biol.) // *VOLU= 90 // *PAGE= 269 // *YEAR=1974 // *PREF=0AGA01 // *ADAT=930
909 // *MDAT=930909 // *MSDB=0 // *CASN=0 // *NBSI=0 // *CDRE=0 // *BATC=0 // *B
CLA=99 // *TOLE=.40 // *COOR=0 // *SPGN=0 // *SPAC= // *RFAC=0 // *TEMP=295 // *
MAXA=0 // *ZVAL=0 // *DENM=0 // *DENX=0 // *DENC=0 // *CELA=0 // *CELB=0 // *CEL
C=0 // *ALPH=0 // *BETA=0 // *GAMM=0 // *MA27=0 // *MA28=0 // *RCP1=0 // *RCP2=0
// *RCP3=0 // *RCP4=0 // *RCP5=0 // *RCP6=0 // *SIGF=0 // *MATF=0 // *INTF=0 //
*CATF=4 // *METR=0 // *BRV2=0 // *BRV1=0 // *RCVO=0 // *MA43=809584449 // *MA44
=808525856 // *SCOR=0 // *MA46=0 // *MA47=0 // *MA48=0 // *ZPRI=0 // *NRES=0 //
*BITS=0 // *NW01=404 // *NW02=11 // *NW03=0 // *JRNL=J.Mol.Biol. // *MVOL=0 // *
CVOL=0 //
Chain A, 6 residues, GAL-AGL-GAL-AGL-GAL-AGL
Chain B, 6 residues, GAL-AGL-GAL-AGL-GAL-AGL
---------+---------+---------+---------+---------+---------+---------+---------+
Type "K"(Keep), "R"(Reject) or "O"(for list of options)
(g) FULL
The FULL command forces a full display of sequence information for each hit entry.
The command takes the form
FULL CHAIN
or
FULL ALL
FULL CHAIN displays the complete sequences of the chains in which the search sequences are located. This is the default if FULL is given with no qualifier.
FULL ALL displays, for each hit entry, the complete sequence information for the entry, i.e. all chains.
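The two display modes can be contrasted with a brief Python sketch. The function name `full_display` and the tiny entry are hypothetical, and listing the chains that contain hits first is an assumption based on the order seen in the example displays of this chapter.

```python
def full_display(chains, hit_chain_ids, mode="CHAIN"):
    """Return display lines for one hit entry under FULL CHAIN or FULL ALL."""
    # Chains with hits first; Python's sort is stable, so the original
    # order is kept within each group.
    order = sorted(chains, key=lambda cid: cid not in hit_chain_ids)
    lines = []
    for chain_id in order:
        residues = chains[chain_id]
        header = f"Chain {chain_id}, {len(residues)} residues,"
        if mode == "ALL" or chain_id in hit_chain_ids:
            lines.append(f"{header} {'-'.join(residues)}")
        else:
            lines.append(f"{header} No Hits")
    return lines


# Invented miniature entry: a nucleotide chain A and a peptide chain C.
entry = {"A": ["T", "T"], "C": ["SER", "SER", "GLU"]}

for line in full_display(entry, {"C"}, mode="CHAIN"):
    print(line)
# Chain C, 3 residues, SER-SER-GLU
# Chain A, 2 residues, No Hits
```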
Ex.9
T9 *PDBSEQ
FULL CHAIN
EXHA
PSEQ -SER-SER-GLU-
END
The following hit is registered :
---------+---------+---------+---------+---------+---------+---------+---------+
0HDD01
Engrailed homeodomain complex with DNA; Source: Fruit fly (Drosophila melanogast
er) expressed in (Escherichia coli)
ID-1HDD; Deposition date 910916; Class:DNA Binding; Data contributed by C.R.Kiss
inger,B.Liu,C.O.Pabo,E.Martin-Blanco,T.B.Kornberg; Resolution 2.8A
C.R.Kissinger,B.Liu,E.Martin-Blanco,T.B.Kornberg,C.O.Pabo
Cell (Cambridge,Mass.), 63, 579,1990
Chain C, 61 residues, MET-ASP-GLU-LYS-ARG-PRO-ARG-THR-ALA-PHE-SER-SER-GLU-
14 GLN-LEU-ALA-ARG-LEU-LYS-ARG-GLU-PHE-ASN-GLU-ASN-ARG-TYR-LEU-THR-GLU-ARG-
32 ARG-ARG-GLN-GLN-LEU-SER-SER-GLU-LEU-GLY-LEU-ASN-GLU-ALA-GLN-ILE-LYS-ILE-
50 TRP-PHE-GLN-ASN-LYS-ARG-ALA-LYS-ILE-LYS-LYS-SER
Chain D, 61 residues, MET-ASP-GLU-LYS-ARG-PRO-ARG-THR-ALA-PHE-SER-SER-GLU-
14 GLN-LEU-ALA-ARG-LEU-LYS-ARG-GLU-PHE-ASN-GLU-ASN-ARG-TYR-LEU-THR-GLU-ARG-
32 ARG-ARG-GLN-GLN-LEU-SER-SER-GLU-LEU-GLY-LEU-ASN-GLU-ALA-GLN-ILE-LYS-ILE-
50 TRP-PHE-GLN-ASN-LYS-ARG-ALA-LYS-ILE-LYS-LYS-SER
Chain A, 21 residues, No Hits
Chain B, 21 residues, No Hits
---------+---------+---------+---------+---------+---------+---------+---------+
Type "K"(Keep), "R"(Reject) or "O"(for list of options)
Note that M will give the short display in this case.
We will now repeat the search in Ex.9 but use the FULL ALL instruction rather than FULL CHAIN.
Ex.10
T10 *PDBSEQ
FULL ALL
EXHA
PSEQ -SER-SER-GLU-
END
The following hit is registered :
---------+---------+---------+---------+---------+---------+---------+---------+
0HDD01
Engrailed homeodomain complex with DNA; Source: Fruit fly (Drosophila melanogast
er) expressed in (Escherichia coli)
ID-1HDD; Deposition date 910916; Class:DNA Binding; Data contributed by C.R.Kiss
inger,B.Liu,C.O.Pabo,E.Martin-Blanco,T.B.Kornberg; Resolution 2.8A
C.R.Kissinger,B.Liu,E.Martin-Blanco,T.B.Kornberg,C.O.Pabo
Cell (Cambridge,Mass.), 63, 579,1990
Chain C, 61 residues, MET-ASP-GLU-LYS-ARG-PRO-ARG-THR-ALA-PHE-SER-SER-GLU-
14 GLN-LEU-ALA-ARG-LEU-LYS-ARG-GLU-PHE-ASN-GLU-ASN-ARG-TYR-LEU-THR-GLU-ARG-
32 ARG-ARG-GLN-GLN-LEU-SER-SER-GLU-LEU-GLY-LEU-ASN-GLU-ALA-GLN-ILE-LYS-ILE-
50 TRP-PHE-GLN-ASN-LYS-ARG-ALA-LYS-ILE-LYS-LYS-SER
Chain D, 61 residues, MET-ASP-GLU-LYS-ARG-PRO-ARG-THR-ALA-PHE-SER-SER-GLU-
14 GLN-LEU-ALA-ARG-LEU-LYS-ARG-GLU-PHE-ASN-GLU-ASN-ARG-TYR-LEU-THR-GLU-ARG-
32 ARG-ARG-GLN-GLN-LEU-SER-SER-GLU-LEU-GLY-LEU-ASN-GLU-ALA-GLN-ILE-LYS-ILE-
50 TRP-PHE-GLN-ASN-LYS-ARG-ALA-LYS-ILE-LYS-LYS-SER
Chain A, 21 residues, T-T-T-T-G-C-C-A-T-G-T-A-A-T-T-A-C-C-T-A-A
Chain B, 21 residues, A-T-T-A-G-G-T-A-A-T-T-A-C-A-T-G-G-C-A-A-A
---------+---------+---------+---------+---------+---------+---------+---------+
Type "K"(Keep), "R"(Reject) or "O"(for list of options)
Related Bit Screens
The relevant bit screen is 122.
If you search a database which includes both macromolecules and small molecules then you can confine the search to macromolecules using bit screen 122.
Bit 141 indicates that the SEQRES field is present.