1.4 The Basic Software System

The operation of the basic software system is illustrated by the flowchart below, and also by the alphanumeric QUEST instruction set and example search results.

CSDS : Basic Software System

Basic QUEST query:

 
T1 *YEAR .GE. 1970
T2 *CONNSER
AT1 C 3
AT2 C 2
AT3 C 2
AT4 C 2
AT5 O 1 E
BO 1 2 1
BO 2 3 1
BO 3 1 1
BO 1 4 1
BO 4 5 2
END
QUEST T1.AND.T2

Fragment:

Output from Basic QUEST:

---------+---------+---------+---------+---------+---------+---------+---------+
*REFC=ALCHRB10 // *COMP=(4S)-2-(Prop-2-enyl)rethron-4-yl (1R,3R)chrysanthemate 6
-bromo-2,4-dinitrophenylhydrazone // *QUAL=absolute configuration // *FORM=C25 H
29 Br1 N4 O6 // *AUTH=M.J.Begley,L.Crombie,D.J.Simmonds,D.A.Whiting // *CODE=207
(J.Chem.Soc.,Perkin Trans.1) // *VOLU= // *PAGE= 1230 // *YEAR=1974 //

The basic software comprises the following programs:

1.4.1 QUEST

This program permits the interrogation of the 1D and 2D information fields of the CSD. The program is operated using an alphanumeric query language which permits the user to TEST one or more individual information fields (see T1, T2, etc. in the example). The complete query is a logical combination of the individual test results, given on the final QUEST command (QUEST T1.AND.T2 in the example).
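
The way individual tests combine into a complete query can be pictured with a short Python sketch. This is only an illustration of the logic, not of how QUEST is implemented: the entry dictionaries, the `has_fragment` flag and the two HYPOTH refcodes are hypothetical stand-ins, and only ALCHRB10 and its year come from the example output above.

```python
# Hypothetical in-memory stand-ins for CSD entries. QUEST itself reads the
# database; these dictionaries exist only to illustrate the query logic.
entries = [
    {"REFC": "ALCHRB10", "YEAR": 1974, "has_fragment": True},   # from the example output
    {"REFC": "HYPOTH01", "YEAR": 1965, "has_fragment": True},   # invented entry
    {"REFC": "HYPOTH02", "YEAR": 1980, "has_fragment": False},  # invented entry
]

def t1(entry):
    """Mirror of 'T1 *YEAR .GE. 1970': publication year test."""
    return entry["YEAR"] >= 1970

def t2(entry):
    """Stand-in for 'T2 *CONNSER': assumes the substructure match is precomputed."""
    return entry["has_fragment"]

# Mirror of 'QUEST T1.AND.T2': an entry is a hit only if both tests pass.
hits = [e["REFC"] for e in entries if t1(e) and t2(e)]
print(hits)
```

Keeping each test as a separate predicate reflects the structure of the query language, where tests are defined first and combined only on the last line.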

The program can interrogate 38 numerical fields, 20 text fields and the 2D connection tables for substructure search. Special search constructs permit searching of molecular formulae and crystallographic unit cell information.

2D Substructure search queries must be input via the alphanumeric specification of AToms and BOnds and their required chemical properties (see example). Substructure search capabilities are very flexible and include: specification of hybridization states and coordination numbers, variable point of attachment (simple generic) searches, variable specification of atom and bond properties, cyclic/acyclic bond descriptors, eight chemical bond types, pre-set and user-defined element groups, etc. Facilities for controlling the chemical environment of the fragment are also provided.
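
As a rough illustration of the AToms and BOnds specification, the Python sketch below reads the AT and BO lines of the example query into a small graph. The interpretation of the tokens is an assumption made for this sketch: the third token on an AT line is treated as a connectivity count, any further tokens (such as E) are kept as uninterpreted flags, and the last token on a BO line is treated as a bond type. The function names are hypothetical and are not part of the CSD software.

```python
QUERY = """\
AT1 C 3
AT2 C 2
AT3 C 2
AT4 C 2
AT5 O 1 E
BO 1 2 1
BO 2 3 1
BO 3 1 1
BO 1 4 1
BO 4 5 2
"""

def parse_fragment(text):
    """Split AT/BO lines into atoms and bonds (token meanings are assumed)."""
    atoms = {}   # atom number -> (element, count token, remaining flags)
    bonds = []   # (atom i, atom j, bond type token)
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        keyword = tokens[0].upper()
        if keyword == "BO":
            i, j, bond_type = (int(t) for t in tokens[1:4])
            bonds.append((i, j, bond_type))
        elif keyword.startswith("AT"):
            number = int(keyword[2:])
            atoms[number] = (tokens[1], int(tokens[2]), tokens[3:])
    return atoms, bonds

def neighbours(bonds):
    """Build an adjacency map from the bond list."""
    adjacency = {}
    for i, j, _ in bonds:
        adjacency.setdefault(i, set()).add(j)
        adjacency.setdefault(j, set()).add(i)
    return adjacency

atoms, bonds = parse_fragment(QUERY)
adjacency = neighbours(bonds)
for number in sorted(atoms):
    element, count, flags = atoms[number]
    print(number, element, count, flags, sorted(adjacency[number]))
```

In this particular query the count token on every AT line equals the number of bonds the atom makes within the fragment, which is consistent with, but does not prove, the connectivity reading assumed here.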

Output from QUEST consists of a print file containing alphanumeric information for CSD entries that are classed as hits. The program will also generate a number of subfiles of CSD information for the hits that can be accessed by other programs.
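
The record in the example output above can be split back into named fields with a few lines of Python. This is a minimal sketch based only on what that one record shows: fields appear as `*NAME=value`, are separated by `//`, and the text is hard-wrapped at 80 columns so that consecutive lines can simply be concatenated. Those three points are assumptions drawn from the example, and `parse_quest_record` is a hypothetical helper, not part of the CSD System.

```python
RECORD = [
    "*REFC=ALCHRB10 // *COMP=(4S)-2-(Prop-2-enyl)rethron-4-yl (1R,3R)chrysanthemate 6",
    "-bromo-2,4-dinitrophenylhydrazone // *QUAL=absolute configuration // *FORM=C25 H",
    "29 Br1 N4 O6 // *AUTH=M.J.Begley,L.Crombie,D.J.Simmonds,D.A.Whiting // *CODE=207",
    "(J.Chem.Soc.,Perkin Trans.1) // *VOLU= // *PAGE= 1230 // *YEAR=1974 //",
]

def parse_quest_record(lines):
    """Rejoin hard-wrapped lines and split them into a {name: value} dictionary."""
    # Assumption: lines were cut at a fixed width, so no separator is re-inserted.
    text = "".join(line.rstrip("\n") for line in lines)
    fields = {}
    for chunk in text.split("//"):
        chunk = chunk.strip()
        if not chunk.startswith("*") or "=" not in chunk:
            continue  # skips the empty piece after the final '//'
        name, _, value = chunk[1:].partition("=")
        fields[name.strip()] = value.strip()
    return fields

fields = parse_quest_record(RECORD)
for name, value in fields.items():
    print(f"{name:5s} {value!r}")
```

If a print file had trailing blanks stripped from wrapped lines, a space falling exactly on the wrap column would be lost by this simple concatenation, so the sketch should be checked against real output before being relied on.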

The most important subfile is the FDAT file. This simple ASCII file contains the 3D structural data for each hit entry and is used by the programs GSTAT and PLUTO within the CSD System. The FDAT file can also be read by many popular molecular modelling packages.


1.4.2 GSTAT

GSTAT is a multi-functional 3D search and molecular geometry program. The program is operated through a set of alphanumeric instructions (see example below). The program will perform:

GSTAT instruction file:

 
FRAG Cyclo-propyl carbonyls
AT1 C 3
AT2 C 2
AT3 C 2
AT4 C 2
AT5 O 1
BO 1 2
BO 2 3
BO 3 1
BO 1 4
BO 4 5
C Ensure 4 5 is C=O
TEST DIST 4 5 1.15 1.25
END
C Get mid-point of C2-C3 bond
SETup X1 2 3
DEFine D1 1 2
DEFine D2 1 3
DEFine D3 2 3
TRA ?ADD1 = D1 + D2
TRA ?ADD2 = ?ADD1 + D3
TRA DMEAN = ?ADD2 / 3.0
DEF ?TAU 5 4 1 X1
TRA TAU = ABS ?TAU
SELect DMEAN 1.47 1.54
OUTPUT COORD ORIG 1 XM 1 4 YM 1 X1
HIST D3
SCAT D3 TAU
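
The geometric parameters requested in the instruction file above can be reproduced outside GSTAT with elementary vector arithmetic. The Python sketch below is an illustration under stated assumptions: the coordinates are invented orthogonal Cartesian values in Angstroms, DEF with two atoms is read as a distance, DEF with four atoms is read as a torsion angle, and all function names are hypothetical.

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def distance(a, b):
    d = sub(a, b)
    return math.sqrt(dot(d, d))

def midpoint(a, b):
    return tuple((x + y) / 2.0 for x, y in zip(a, b))

def torsion(p1, p2, p3, p4):
    """Signed torsion angle p1-p2-p3-p4 in degrees."""
    b1, b2, b3 = sub(p2, p1), sub(p3, p2), sub(p4, p3)
    n1, n2 = cross(b1, b2), cross(b2, b3)
    y = math.sqrt(dot(b2, b2)) * dot(b1, n2)
    x = dot(n1, n2)
    return math.degrees(math.atan2(y, x))

# Invented coordinates for atoms 1-5 of the fragment (not taken from the CSD).
xyz = {
    1: (0.00, 0.00, 0.00),
    2: (-0.75, -1.30, 0.00),
    3: (0.75, -1.30, 0.00),
    4: (0.00, 0.85, 1.20),
    5: (0.00, 0.85, 2.40),
}

x1 = midpoint(xyz[2], xyz[3])                      # SETup X1 2 3
d1 = distance(xyz[1], xyz[2])                      # DEFine D1 1 2
d2 = distance(xyz[1], xyz[3])                      # DEFine D2 1 3
d3 = distance(xyz[2], xyz[3])                      # DEFine D3 2 3
dmean = (d1 + d2 + d3) / 3.0                       # the three TRA lines for DMEAN
tau = abs(torsion(xyz[5], xyz[4], xyz[1], x1))     # DEF ?TAU 5 4 1 X1, then ABS

carbonyl_ok = 1.15 <= distance(xyz[4], xyz[5]) <= 1.25   # TEST DIST 4 5 1.15 1.25
selected = carbonyl_ok and 1.47 <= dmean <= 1.54         # SELect DMEAN 1.47 1.54
print(round(d1, 3), round(d2, 3), round(d3, 3), round(dmean, 3), round(tau, 3), selected)
```

Taking the absolute value of the torsion folds the range onto 0 to 180 degrees, so a fragment and its mirror image give the same TAU.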

Fragment:

GSTAT tabulation output:

 Nfrag  Refcod          D1      D2      D3   DMEAN     TAU
     1  ACMEPT       1.528   1.530   1.416   1.491 176.188
     2  ACMEPT       1.522   1.529   1.416   1.489 176.205
     3  ACMEPT10     1.528   1.530   1.416   1.491 176.188
     4  ACMEPT10     1.522   1.529   1.416   1.489 176.20
     5  ACOHKT       1.497   1.525   1.524   1.515  75.168
     6  ACOHKT       1.525   1.524   1.497   1.515  21.613
     7  AIMCTY       1.497   1.517   1.512   1.509 146.695
     8  ARITOL       1.506   1.535   1.500   1.514 150.295
     9  BARTUS10     1.536   1.516   1.477   1.510  10.475
 
        Mean         1.520   1.520   1.496   1.512  74.011
        S.D.Sample   0.019   0.019   0.024   0.011  67.20
        S.D.Mean     0.001   0.001   0.001   0.001   3.027
        Minimum      1.453   1.440   1.416   1.473   0.000
        Maximum      1.582   1.578   1.574   1.539 179.980
        Nobs           493     493     493     493     493
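
The summary rows of the tabulation can be recomputed for any column with the sketch below. It assumes that S.D.Sample is the usual sample standard deviation (n - 1 denominator) and that S.D.Mean is S.D.Sample divided by the square root of Nobs. The second assumption is consistent with the TAU column above, where 67.20 / sqrt(493) is approximately 3.027, but neither is confirmed by the text. The function name `summarise` is hypothetical.

```python
import math

def summarise(values):
    """Return the summary rows printed under a GSTAT-style column."""
    n = len(values)
    mean = sum(values) / n
    # Assumption: sample variance with an (n - 1) denominator.
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    sd_sample = math.sqrt(variance)
    return {
        "Mean": mean,
        "S.D.Sample": sd_sample,
        "S.D.Mean": sd_sample / math.sqrt(n),
        "Minimum": min(values),
        "Maximum": max(values),
        "Nobs": n,
    }

# The nine D3 values listed in the tabulation. The printed summary rows cover
# all 493 fragments, so these results are not expected to match them.
d3_listed = [1.416, 1.416, 1.416, 1.416, 1.524, 1.497, 1.512, 1.500, 1.477]
for label, value in summarise(d3_listed).items():
    print(f"{label:12s} {value}")
```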

Histogram of D3

D3         10   20   30   40   50   60   70   80   90  100        D3
       .....I....I....I....I....I....I....I....I....I.....
  1.39 -                                                 -
       .                                                 .
       .                                                 .
       .**                                               . (  4)
       .**                                               . (  4) MEAN:
  1.44 -****                                             - (  8)      1.496
       .*********                                        . ( 19)
       .******                                           . ( 13) SDEV:
       .**************                                   . ( 29)      0.024
       .***********************************              . ( 71)
  1.49 -*****************************************        - ( 82) MINIMUM:
       .***************************************          . ( 78)      1.416
       .*************************************            . ( 74)
       .**************************************           . ( 77) MAXIMUM:
       .***********                                      . ( 23)      1.574
  1.54 -****                                             - (  8)
       .*                                                . (  3)   498 VALS
       .**                                               . (  4) PLOTTED:
       .*                                                . (  1)
       .                                                 .
  1.59 -                                                 -
       .....I....I....I....I....I....I....I....I....I.....
D3         10   20   30   40   50   60   70   80   90  100        D3

Scattergram of D3 vs. TAU

    1.370    1.415    1.460    1.505    1.550    1.595
       +I--------I--------I--------I--------I--------I+  D3
  180.0-         4    2  2  3122812  52               - ACROSS      DOWN
       I            1    1  11 15623541 1             ID3          TAU
       I             1     1 4444471252121            I        MEAN:
       I                    13312156141421 21         I   1.496    73.611
  140.0-                      222 13 141  1  1        -        SDEV:
       I                       1111  111              I   0.024    67.216
       I                        1 2    1              I      MINIMUM:
       I                            211 2             I   1.416     0.000
  100.0-                         1                    -      MAXIMUM:
       I                  1      1 1 11 1  1          I   1.574   179.980
       I                         111 1 3211 1         I
       I                          2 1 1   11 1   1    I  498 VALS PLOTTED
   60.0-                     12  1 24231              -    0 VALS OMITTED
       I                     1 112276C32              I
       I                   1 1211 3  11      11       I  CORREL. COEFF.
       I                     13323122231              I       0.150
   20.0-             2 1   11238752332431 1           -
       I             21 83422A357A4432313  1          I
       I             2 1623269837843242               I
       I                                              I
  -20.0-                                              -
       +I--------I--------I--------I--------I--------I+
TAU

Assuming an FDAT file has already been prepared by the QUEST run shown above, the GSTAT example will:


1.4.3 PLUTO

PLUTO is a well-known stand-alone crystallographic plotting program written at the CCDC in the early 1970s. It remains a valuable tool for visualizing crystal structures, either at the molecular level or as a crystal structure (packing) diagram that shows intermolecular interactions.

PLUTO will generate mono or stereo illustrations of molecular or crystal structures in three basic styles:

Complete control over view direction, sizes of bonds and atoms, elements to be excluded or included, etc. is provided to the user.

The program will operate from an FDAT file generated by QUEST, or from crystallographic data encoded in a very simple free format. Operations are controlled by an alphanumeric instruction set generated by the user.

The examples below show a ball and spoke drawing of one of the molecules from the cyclopropyl-carbonyl dataset, together with two superimposed plots of all fragments. The superimposed plots are obtained by using the 'molecular axes' coordinates output by GSTAT in response to the instructions in the GSTAT example above.

Ball and Spoke Drawing

Superimposed Fragments Viewed Along C1-C4 Bond

Superimposed Fragments Viewed Perpendicular to C1-C4 Bond