Volume 1 Chapter 7 Commands in the 3D-CONSTRAIN Sub-menu

7.16 Bit Screens for 3D Searching

7.16.1 3D Screening in the CSD System

The concept of screens, as a set of contiguous bits set on (1) or off (0) to indicate the presence or absence of some item of information, was introduced in chapter 2 with respect to searches of 1D and 2D information fields. Bit maps of this sort are calculated and stored for each CSD entry, and are also calculated at search time for each query input to the system. A detailed search (e.g. a text comparison or a 2D substructure search) is only carried out if a target entry passes the screening process, i.e. all of the query screens must be present in the screen record for that entry.

The screening step is a search heuristic which saves a considerable amount of computing time by rejecting, at a very early stage, entries that cannot possibly be hits.
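The pass condition described above can be sketched as a subset test on bit maps. This is only an illustration: the function name and the 8-bit example values are hypothetical and are not taken from the CSD software.

```python
def passes_2d_screen(query_bits: int, entry_bits: int) -> bool:
    """Return True when every bit set in the query is also set in the entry."""
    return (query_bits & entry_bits) == query_bits


# Hypothetical 8-bit screen records, for illustration only.
entry = 0b10110110
print(passes_2d_screen(0b00100110, entry))  # True: all query bits are present
print(passes_2d_screen(0b01000110, entry))  # False: one query bit is missing
```

An entry that fails this cheap test cannot be a hit, so the detailed search is skipped for it.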

The development of similar heuristic bit settings to cover 3D geometric searches is currently an area of considerable research activity in chemical information science (see e.g. P.Willett, Three-Dimensional Chemical Structure Handling, Research Studies Press and John Wiley, Taunton and Chichester UK, 1991, and references therein).

In the 3D case, the problem is compounded by the very wide range of geometrical parameters that can be used in phrasing a 3D search. For this reason, no generalised screening mechanisms are available and screens are established to cover the most likely geometric search terms.

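In Version 5 of the CSD (October 1992) three types of 3D screens are encoded:

(i)
intramolecular distance screens (section 7.16.3)

(ii)
single torsion screens (section 7.16.4)

(iii)
adjacent torsion screens (section 7.16.5)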

This list will be extended in the future by further sets of 3D screens as they become available.

7.16.2 Differences between 2D and 3D screening

3D screens operate differently from 2D screens in that a set of screens is determined for each SELECT command where screening is possible. This means that more than one set of screens can be produced for a single connectivity instruction packet. These screens are then compared consecutively with the MASK records to determine those structures that pass the screening.

In addition, the 3D screens differ from the 2D screens in that only one of the bits in a particular type of screen needs to overlap with the corresponding bits in the MASK record for that type to be considered a hit.
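The overlap rule can be sketched as follows. The code rests on two assumptions that are not stated in the text: a screen type with no query bits set is not tested, and an entry must pass the screen set of every SELECT command. All names and values are hypothetical.

```python
def passes_3d_screen(query_words, mask_words):
    """One screen set against one MASK record, one word per screen type.

    A single overlapping bit per type is enough for that type to pass.
    Assumption: a type with no query bits set is not tested at all.
    """
    for query, mask in zip(query_words, mask_words):
        if query != 0 and (query & mask) == 0:
            return False
    return True


def passes_all_selects(screen_sets, mask_words):
    """Assumption: the screen set of every SELECT command must pass."""
    return all(passes_3d_screen(s, mask_words) for s in screen_sets)


# Hypothetical MASK record with two screen types (4-bit words).
mask = [0b0110, 0b1000]
print(passes_3d_screen([0b0011, 0b0000], mask))  # True: bit 1 overlaps
print(passes_3d_screen([0b0001, 0b1000], mask))  # False: first type has no overlap
```

Compare this with 2D screening, where every query bit must be present in the entry record.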

Finally, the 3D screens use explicitly defined ranges to determine which screening bits are set. These ranges are calculated to produce the optimum screen-out with respect to the current contents of the database. As the average size of entries in the database is increasing, the ranges may need to be redetermined at some point in the future.
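As a rough illustration of range-based bit setting, the sketch below assigns one bit per range. The boundaries in BOUNDS are invented for the example, since the real ranges are not documented, and the function names are hypothetical.

```python
from bisect import bisect_right

# Hypothetical range boundaries; the real ranges are not documented.
# Six ranges: <2, 2-3, 3-4, 4-5, 5-7, >=7
BOUNDS = [2.0, 3.0, 4.0, 5.0, 7.0]


def range_index(value):
    """Index of the range that contains value."""
    return bisect_right(BOUNDS, value)


def entry_bits(values):
    """Set one bit for each range occupied by a value found in the entry."""
    bits = 0
    for value in values:
        bits |= 1 << range_index(value)
    return bits


def query_bits(low, high):
    """Set the bit of every range that the query interval low..high touches."""
    bits = 0
    for index in range(range_index(low), range_index(high) + 1):
        bits |= 1 << index
    return bits


query = query_bits(2.5, 3.5)                          # touches ranges 1 and 2
print((query & entry_bits([1.4, 4.2, 6.0])) != 0)     # False: screened out
print((query & entry_bits([3.1])) != 0)               # True: goes to detailed search
```

If the boundaries were redetermined, the stored bits for every entry would have to be recalculated with the same boundaries that are used for the query.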

Because of these differences the following conditions apply to the use of 3D screens:

(i)
unlike the 2D screens, the 3D screens set for a connectivity instruction packet are not displayed when the structure is defined.

(ii)
the exact definition of each bit is not given in the documentation and their use is restricted to the processing of SELECT commands, i.e. it is not possible to apply SCREEN or *BTEST to these screens.

Thus the following sections provide a general guide to the types of screens set, rather than a definitive specification of the conditions for the setting of each bit.

7.16.3 Intramolecular Distance Screens

20 words in the MASK record are allocated to intramolecular distance screens. Each word relates to a different combination of atom types as follows:

   1 AA - AA               2 AA - X
   3 AA - S,P,Hal          4 C - C
   5 C - N                 6 C - O
   7 Csp2 - C              8 Csp2 - N
   9 Csp2 - O             10 Carom - C
  11 Carom - N            12 Carom - O
  13 N - N,S,P,Hal        14 N - O,S,P,Hal
  15 O - O,S,P,Hal        16 Osp3 - N,O,S,P,Hal
  17 Osp2 - N,O,S,P,Hal   18 Ar Ring - N,O
  19 Ar Ring - S,P,Hal    20 Ar Ring - Cnonsat

where

7.16.4 Single Torsion Screens

There are 7 words in the MASK record that contain screens for single torsion angles. Unlike the distance screens, which use a full word for each combination, the torsion screens contain several combinations in the same word. The torsion angles that are used for screening are as follows:

word 1    Z - C ~ C - Z                      C - C ~ C - C
          C - C ~ C - N,O,X                  C - C ~ C - H,D
        H,D - C ~ C - H,D
 
word 2    Z - C ~ N - Z                      Z - C ~ N - H,D
        H,D - C ~ N - H,D
 
word 3    Z - C - C - Z acyclic or cyclic    Z - C - C - Z cyclic
          Z - C - C - Z acyclic              C - C - C - C acyclic or cyclic
          C - C - C - C cyclic               C - C - C - C acyclic
 
word 4    C - C - C - O acyclic or cyclic    C - C - C - O cyclic
          C - C - C - O acyclic              C - C - C - N acyclic or cyclic
          C - C - C - N cyclic               C - C - C - N acyclic
 
word 5  N,O - C - C - N,O                    Z - C - C - X
          C - C - C - X                      Z - C - C - H,D
        H,D - C - C - H,D                    Z - C - X - Z
 
word 6    Z - C - O - Z                      Z - C - O - C
      X,N,O - C - O - C                      C - C - O - C
          Z - C - O - H,D
 
word 7    Z - C - N - Z                      Z - C - N - C
          X,N,O - C - N - C                  C - C - N - C
          Z - C - N - H,D

where:

All torsion angle screens use absolute values between 0 and 180deg. so that SELECT commands for a specified angle of -10deg. to 0deg. or 0deg. to 10deg. would use the same screens.

All torsion angle ranges proceed from negative to positive, e.g. from -10deg. to +10deg. If the limits are specified in the opposite direction, e.g. from +10deg. to -10deg., then the range will be a sector of 340deg. rather than the expected sector of 20deg.
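The two rules above can be checked with a little arithmetic. The sketch below is illustrative only; the function names are hypothetical, and absolute_range assumes that the limits are given in the normal order (low before high) within -180 to 180deg.

```python
def sector_width(start_deg, end_deg):
    """Width of the sector swept from start to end in the positive direction."""
    return (end_deg - start_deg) % 360


def absolute_range(low_deg, high_deg):
    """Range of absolute torsion values covered by low..high (low <= high)."""
    if low_deg <= 0 <= high_deg:
        return (0, max(-low_deg, high_deg))
    a, b = abs(low_deg), abs(high_deg)
    return (min(a, b), max(a, b))


print(sector_width(-10, 10))   # 20
print(sector_width(10, -10))   # 340
print(absolute_range(-10, 0))  # (0, 10)
print(absolute_range(0, 10))   # (0, 10)
```

The last two lines show why a range of -10deg. to 0deg. and a range of 0deg. to 10deg. lead to the same screens.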

Where one of the valence angles in a localised/delocalised double bond is greater than 170deg. bit screen 70 will be set; if both are greater than 170deg. then bit screen 71 is set. In both these cases no torsion angle is calculated as the large valence angles make the result unreliable.
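The rule for near-linear valence angles might be sketched as below. Bit numbers 70 and 71 and the 170deg. limit come from the text; the function name and the use of None for the normal case are assumptions made for the example.

```python
def linear_flag_bit(valence_angle_1, valence_angle_2, limit_deg=170.0):
    """Return 70 or 71 for near-linear geometry, or None for the normal case.

    None means that neither valence angle exceeds the limit, so the
    torsion angle would be calculated and screened in the usual way.
    """
    count = 0
    if valence_angle_1 > limit_deg:
        count += 1
    if valence_angle_2 > limit_deg:
        count += 1
    if count == 1:
        return 70
    if count == 2:
        return 71
    return None


print(linear_flag_bit(120.0, 118.5))  # None
print(linear_flag_bit(178.0, 118.5))  # 70
print(linear_flag_bit(178.0, 175.2))  # 71
```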

7.16.5 Adjacent Torsion Screens

There are two words in the MASK record that relate to adjacent torsion screens. "Adjacent" refers to a situation such as A-B-C-D-E where A-B-C-D is one torsion angle and B-C-D-E is an adjacent torsion angle. Bits are then set depending on the ranges for both angles. The first word relates to cases where both central bonds are cyclic and the second word to those cases where one or both of the central bonds are acyclic.
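The idea of adjacent torsions and the choice between the two words can be sketched as follows. The atom labels, function names and the numbering of the two words as 1 and 2 are assumptions for illustration; the bit assignments within each word are not documented and are not modelled here.

```python
def adjacent_torsions(chain):
    """Yield pairs of adjacent torsions for every 5-atom window of a chain.

    For A-B-C-D-E the pair is (A, B, C, D) and (B, C, D, E).
    """
    for i in range(len(chain) - 4):
        a, b, c, d, e = chain[i:i + 5]
        yield (a, b, c, d), (b, c, d, e)


def adjacent_screen_word(first_bond_cyclic, second_bond_cyclic):
    """1 when both central bonds (B-C and C-D) are cyclic, otherwise 2."""
    return 1 if (first_bond_cyclic and second_bond_cyclic) else 2


pairs = list(adjacent_torsions(["A", "B", "C", "D", "E"]))
print(pairs)  # [(('A', 'B', 'C', 'D'), ('B', 'C', 'D', 'E'))]
print(adjacent_screen_word(True, True))   # 1
print(adjacent_screen_word(True, False))  # 2
```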

7.16.6 Making the Best Use of 3D Screens

7.16.7 Possible Problems with 3D Screens

If you experience problems with 3D searches, e.g. not hitting an expected structure, then it is possible to turn off the 3D screens in two ways:

(i)
Enter the command NOSCREENS which turns off all screening activity.

This is best done within the connectivity instruction packet involved as this will result in the screens being deactivated only for that test.

(ii)
Switch off only the 3D screening.

In the VAX implementation this can be done by commenting out the following declaration in the CSD$COM:QUEST_STARTUP.COM file, i.e. by inserting a ! after the $ in column one:

$ assign /user_mode use_3dscreens CSD3DSCR

For other implementations it is necessary to edit routine TSTLNM in the B section of the code.

Here, when testing for 'CSD3DSCR', the line 'TSTLNM=1' should be changed to 'TSTLNM=0'.

After editing, the code must be recompiled and relinked before the change will have any effect.
