Volume 1 Chapter 5 Peptide Sequence Searches in Graphics QUEST3D


5.2 BASIC QUEST

5.2.1 Text Searches

A text search is defined by a test of the form:
        Tn  *KEYWORD  string
or      Tn  *KEYWORD  `string'

where string is a string of alphanumeric characters.

For the test to succeed there must be an exact match of the question string with the characters present in the appropriate field of the database entry.

Only the first four characters of the keyword need be typed.

Full details of individual text search keywords are given in chapter 11 of Vol.2.

Ex.1

T1   *AUTH  F.A.COTTON
QUES  T1

This registers hits for all entries having F.A.Cotton as an author.

Under normal circumstances no distinction is made, in a text search, between upper- and lower-case.

This applies also to the typing of search and other keywords.

Thus an equivalent test would be:

t1  *Auth  f.A.CoTtOn

If the distinction between upper- and lower-case must be preserved then you must use the control keyword LOCASE (see chapter 8).

Ex.2

T9  *SPAC  `P21^'
where ^ is blank
QUES  T9

This registers hits for all entries having space group P21.

In this example the text string P21 is enclosed within ` marks and P21 is followed by a blank space.

*SPAC involves the search of a left-adjusted field and so it is necessary to introduce a trailing blank to avoid registering hits for space groups such as P212121, P21/c, etc.
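The effect of the trailing blank can be pictured with a short Python sketch. This is not QUEST code: the function name text_hit, the field width of 8 characters and the simple substring rule are assumptions made purely for illustration.

```python
FIELD_WIDTH = 8  # assumed width of the left-adjusted field (illustrative only)


def text_hit(field_value: str, query: str, case_sensitive: bool = False) -> bool:
    """Return True if query occurs in the blank-padded, left-adjusted field."""
    field = field_value.ljust(FIELD_WIDTH)  # pad on the right with blanks
    if not case_sensitive:
        # By default no distinction is made between upper- and lower-case.
        field, query = field.upper(), query.upper()
    return query in field


# Compare the string without and with the trailing blank.
for space_group in ("P21", "P212121", "P21/c"):
    print(space_group, text_hit(space_group, "P21"), text_hit(space_group, "P21 "))
```

Under these assumptions the string P21 without a blank matches all three space groups, whereas P21 followed by a blank matches only P21.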

Note that n in Tn can take any value in the range 1-199.

Ex.3

T1   *SURN  `O''HARA'
QUES  T1

This registers hits for all entries having O'Hara as an author.

This example illustrates the fact that, if the search string itself contains the ' character, then '' (ie. 2 prime characters) must be typed and the complete string enclosed by ` marks.

Ex.4

T1   *NAME  FERRO
QUES  T1  .AND.  T1

This will register a hit for the entry with compound name:

Ferrocenium hemi-ferrocene tricyanoethanolate

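This example illustrates that QUEST "remembers" text which has been hit by a text search test, so that the same text can be hit only once. The question T1 .AND. T1 therefore requires FERRO to occur twice, as it does in the compound name above.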
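The "hit only once" behaviour of Ex.4 can be mimicked by the following Python sketch. It is an illustration only and says nothing about how QUEST is implemented; the function name all_tests_hit and the character-level bookkeeping are assumptions.

```python
def all_tests_hit(text: str, queries: list[str]) -> bool:
    """Each query must match a stretch of text not claimed by an earlier query."""
    upper = text.upper()
    used = [False] * len(upper)  # characters already "remembered" as hit
    for query in queries:
        q = query.upper()
        start = upper.find(q)
        # Skip occurrences that overlap text hit by a previous test.
        while start != -1 and any(used[start:start + len(q)]):
            start = upper.find(q, start + 1)
        if start == -1:
            return False
        for i in range(start, start + len(q)):
            used[i] = True
    return True


name = "Ferrocenium hemi-ferrocene tricyanoethanolate"
print(all_tests_hit(name, ["FERRO", "FERRO"]))
print(all_tests_hit("Ferrocene", ["FERRO", "FERRO"]))
```

In this sketch the first name satisfies both tests because FERRO occurs twice, while a name containing FERRO only once fails the second test.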

5.2.2 Numeric Searches

A numeric search is defined by a test of the form:

Tn  *KEYWORD  .LO.  VALUE

where .LO. is one of the logical operators, for example .EQ. (is equal to), .GE. (is greater than or equal to) or .GT. (is greater than).

Specification of a range of values is also permitted:

Tn  *KEYWORD  VALUE1 - VALUE2

Full details of individual numeric search keywords are given in chapter 7 of Vol.2.

Ex.1

T2  *CLAS  .EQ.  56
QUES  T2

This will register hits for all entries which are classified as triterpenes (chemical class 56).

Note that n in Tn can take any value in the range 1-199.

Ex.2

T3  *SPGN  201
QUES  T3

This will register hits for all entries having space group number 201.

This numeric search illustrates the fact that the use of the logical operator .EQ. is not mandatory.

Ex.3

T4  *CLAS  52-57
QUES  T4

This will register hits for all terpenes (chemical classes 52-57).

This numeric search involves a range of values.
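The three forms of numeric test shown in Ex.1-3 (explicit operator, bare value, range) can be summarised by a small Python sketch. The function name numeric_hit is hypothetical, only the operators that appear in this chapter are included, and treating the range limits as inclusive is an assumption based on the terpene example.

```python
import operator

# Only the operators used in this chapter; others are deliberately omitted.
OPERATORS = {".EQ.": operator.eq, ".GE.": operator.ge, ".GT.": operator.gt}


def numeric_hit(value: float, condition: str) -> bool:
    """Evaluate a condition such as '.EQ. 56', '201' or '52-57' against value."""
    text = condition.strip().upper()
    for name, compare in OPERATORS.items():
        if text.startswith(name):               # explicit operator
            return compare(value, float(text[len(name):]))
    compact = text.replace(" ", "")
    if "-" in compact:                          # range (negative values not handled)
        low, high = (float(part) for part in compact.split("-"))
        return low <= value <= high             # assumed inclusive at both ends
    return value == float(compact)              # bare value behaves like .EQ.


print(numeric_hit(56, ".EQ. 56"), numeric_hit(201, "201"), numeric_hit(54, "52-57"))
```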

5.2.3 Screen Searches

Both database screens and query screens have been discussed in chapter 2.

Query screens are assigned by the user to screen out most of the database entries and thereby increase the speed of the search.

A complete list of the 682 1D and 2D bit screens is given in Appendix 1 of the printed documentation, but note that a user would normally assign bit screens only in the range 1-155.

The command SCREEN (or SCRE) and the keyword *BTEST (or *BTES) are available for screen searches:

(a) SCRE  n1  n2  n3  n4  etc.
(b) Tn  *BTES  n1  n2  n3  n4  etc.
(c) Tn  *BTES  n1  &  n2  &  n3  &  n4  etc.
(d) Tn  *BTES  n1,n2,n3,n4  etc.

For (a)-(c) the individual screens n1, n2, n3, n4 etc. are linked by the logical operator .AND.

Thus all must be satisfied for a hit to be registered.

For (d) the individual screens n1, n2, n3, n4 etc. are linked by the logical operator .OR.
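The difference between the .AND. linkage of forms (a)-(c) and the .OR. linkage of form (d) can be sketched in Python as follows. The representation of an entry as a set of bit numbers and the function name btest_hit are assumptions for illustration; negative screen numbers, as used in Ex.1 below, are not modelled.

```python
def btest_hit(entry_bits: set[int], screens: list[int], mode: str = "and") -> bool:
    """mode='and' mimics forms (a)-(c); mode='or' mimics form (d)."""
    checks = [bit in entry_bits for bit in screens]
    return all(checks) if mode == "and" else any(checks)


entry = {2, 40}  # hypothetical entry with bit screens 2 and 40 set
print(btest_hit(entry, [1, 2], mode="and"))  # both screens must be set
print(btest_hit(entry, [1, 2], mode="or"))   # either screen is sufficient
```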

Full details of SCREEN and *BTEST are given in chapter 9 of Vol.2.

Ex.1

SCRE  153  91  -34
T1  *CLAS  52-57
T2  *YEAR  .GE.  1988
SCRE  -91  90
QUES  T1  .AND.  T2

This will register hits for all terpenes (classes 52-57) published from 1988 onwards and for which:

This example illustrates a number of features of the use of SCREEN:

Ex.2

T1  *BTES  1,2
T2  *CLAS  38
QUES  T1  .AND.  T2

This will register hits for heterocyclic-oxygen compounds (class 38) containing group 1A elements (bit 1) or group 2A elements (bit 2).

5.2.4 Chemical Formula Searches

Four tests are available for searching chemical formulae. In respect of their syntax these are not strictly numeric or text searches.

The relevant keywords, each used in the examples below, are *ELEM, *RESI, *RESF and *SUMF.

Full details of these search keywords are given in chapter 10 of Vol.2.

Ex.1

T1  *ELEM  1A
QUES  T1

This will register hits for all compounds containing group 1A elements, ie. Li, Na, K, Rb, Cs, or Fr.

Ex.2

T2  *ELEM  H & N & O
QUES   T2

This will register hits for compounds containing the elements C, H, N, O; other elements may be present.

Ex.3

T3  *ELEM  H + N + O
QUES   T3

This will register hits for compounds containing only the elements C, H, N, O.
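The contrast between & (other elements may be present) and + (only the listed elements) in Ex.2 and Ex.3 can be pictured with the following Python sketch. The function name elem_hit is hypothetical, group symbols such as 1A are not handled, and carbon is treated as implied because the descriptions of both examples include C although it is not typed.

```python
def elem_hit(entry_elements: set[str], query: str) -> bool:
    """'&' requires at least the listed elements; '+' requires exactly them."""
    exact = "+" in query
    wanted = {symbol.strip().capitalize()
              for symbol in query.replace("+", "&").split("&")}
    wanted.add("C")  # assumption: carbon is implied, as in the example descriptions
    if exact:
        return entry_elements == wanted
    return wanted <= entry_elements  # subset test: extra elements are allowed


compound = {"C", "H", "N", "O", "S"}  # hypothetical entry
print(elem_hit(compound, "H & N & O"), elem_hit(compound, "H + N + O"))
```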

Ex.4

T4  *RESI  C24 & N3 & S1 & P2
QUES  T4

This will register hits for compounds where a single residue contains C24 and N3 and S1 and P2; other elements may be present.

Ex.5

T5  *RESI  C14  + H14  + SI1 + S1
QUES  T5

This will register hits for compounds where a single residue contains only C14 and H14 and Si1 and S1.

Ex.6

T6  *RESF  7A .GT. 2
QUES  T6

This will register hits for compounds where a single residue contains more than 2 atoms of group 7A, ie. halogen atoms.

Ex.7

T7  *SUMF  7A .GT. 2
QUES  T7

This will register hits for compounds where the sum formula contains more than 2 atoms of group 7A, ie. halogen atoms.
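Ex.6 and Ex.7 differ only in whether the count is made within a single residue or over the whole formula. A minimal Python sketch of that distinction is given below. It assumes that an entry can be represented as a list of per-residue element counts and that the sum formula is the total over all residues; the function names are hypothetical.

```python
HALOGENS = {"F", "Cl", "Br", "I", "At"}  # group 7A elements


def halogen_count(formula: dict[str, int]) -> int:
    """Number of group 7A atoms in one formula."""
    return sum(n for element, n in formula.items() if element in HALOGENS)


def resf_hit(residues: list[dict[str, int]], limit: int = 2) -> bool:
    """Like Ex.6: some single residue holds more than limit halogen atoms."""
    return any(halogen_count(residue) > limit for residue in residues)


def sumf_hit(residues: list[dict[str, int]], limit: int = 2) -> bool:
    """Like Ex.7: all residues together hold more than limit halogen atoms."""
    return sum(halogen_count(residue) for residue in residues) > limit


# Hypothetical entry: two residues, each with two halogen atoms.
entry = [{"C": 6, "H": 4, "Cl": 2}, {"C": 2, "H": 4, "Br": 2}]
print(resf_hit(entry), sumf_hit(entry))
```

For this hypothetical entry the per-residue test fails, since no residue has more than 2 halogen atoms, but the sum formula test succeeds because the total is 4.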

5.2.5 Unit Cell Matching Searches

The keywords *nCELL can be used to match a set of unit cell parameters against those in the database.

The input unit cell parameters are transformed to the Niggli reduced cell parameters and these are compared against the reduced unit cell parameters held in the database.

As for chemical formula searches, this type of search is not strictly numeric or text.

Full details of the *nCELL keywords are given in chapter 10 of Vol.2.

Ex.

T1 *PCEL MONO 11.43 6.59 11.39 90 95.68 90 0.05
QUES  T1

PCEL indicates that the input cell is primitive

MONO indicates the monoclinic system

The six unit cell parameters then follow

Finally a tolerance is declared, in this case 0.05Å.
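The role of the tolerance can be illustrated with a deliberately simplified Python sketch. It assumes that both cells are already reduced and compares only the three cell lengths; the Niggli reduction itself and any treatment of the angles are omitted, and how QUEST actually applies the tolerance is not stated here. The function name lengths_match and the database cell are hypothetical.

```python
def lengths_match(query_cell, entry_cell, tolerance):
    """Compare a, b, c (the first three values) of two cells within tolerance (Å)."""
    pairs = zip(query_cell[:3], entry_cell[:3])
    return all(abs(q - e) <= tolerance for q, e in pairs)


query = (11.43, 6.59, 11.39, 90.0, 95.68, 90.0)   # cell from the example above
stored = (11.45, 6.60, 11.40, 90.0, 95.70, 90.0)  # hypothetical database cell
print(lengths_match(query, stored, 0.05))
```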

5.2.6 Peptide Sequence Searches

The keyword *PEPTIDE can be used to search the SYNONYM fields of entries in basic class 48 for sequences of amino-acids, using standard 3-letter code symbols.

As for chemical formula searches, this type of search is not strictly numeric or text.

Full details of the *PEPTIDE keyword are given in chapter 10 of Vol.2.

Ex.1

T1  *PEPT
PSEQ  -PRO-AIB-
QUES  T1

The PSEQ record is used to define the search sequence.

This will register hits, for example, for entries whose SYNONYM fields contain the following sequence information:

In these sequences

Ex.2

T2  *PEPT
PDEF  ABC= ILE  LEU
PSEQ  -GLY-ABC-GLY-
QUES  T2

In this example the PSEQ record is preceded by a PDEF record which allows the user to define combinations of the 3-letter code symbols.

Here ABC is defined to be either ILE or LEU.

PDEF can be compared to ELDEF in a connectivity search test package.
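The combined effect of PDEF and PSEQ can be pictured with the following Python sketch. It is not the QUEST implementation: the function name peptide_hit, the hyphen-separated form assumed for the SYNONYM text and the sample strings are all hypothetical, and no special meaning is given to the leading and trailing hyphens of the PSEQ record.

```python
from typing import Optional


def peptide_hit(synonym: str, pseq: str,
                pdef: Optional[dict[str, set[str]]] = None) -> bool:
    """Return True if the residue sequence pseq occurs contiguously in synonym."""
    pdef = pdef or {}
    residues = [code.upper() for code in synonym.split("-") if code]
    # Each position accepts either one code or any code of a PDEF combination.
    pattern = [pdef.get(code, {code}) for code in pseq.upper().split("-") if code]
    width = len(pattern)
    return any(
        all(residue in allowed
            for residue, allowed in zip(residues[i:i + width], pattern))
        for i in range(len(residues) - width + 1)
    )


abc = {"ABC": {"ILE", "LEU"}}  # corresponds to: PDEF  ABC= ILE  LEU
print(peptide_hit("BOC-GLY-LEU-GLY-OME", "-GLY-ABC-GLY-", abc))
print(peptide_hit("BOC-GLY-VAL-GLY-OME", "-GLY-ABC-GLY-", abc))
```

Under these assumptions the first hypothetical synonym is a hit because LEU is one of the codes allowed for ABC, and the second is not because VAL is not.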

This will register hits, for example, for entries whose SYNONYM fields contain the following sequence information:



Volume 1 Chapter 5 Guide to Search Details in Volume 2.