Back to PreQuest User Guide

1. Getting Started

In order to show how PreQuest can be used to create a small private database we have provided some example files with the distribution program (see $CSDHOME/examples/prequest). The most common formats presented by users are CIF (example.[1-4]) and SHELX (example.[5-8]). This section will illustrate what you will need to do to process these examples into a useful CSD searchable database.
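As a quick orientation, the example files can be listed and grouped by format before starting PreQuest. The following is a minimal sketch, not part of PreQuest itself: the function name classify_examples is hypothetical, and the grouping simply follows the numbering given above (1-4 for CIF, 5-8 for SHELX).

```python
import os
from pathlib import Path


def classify_examples(directory):
    """Group example.N files by the format this guide assigns to them."""
    groups = {"CIF": [], "SHELX": []}
    for path in sorted(Path(directory).glob("example.[1-8]")):
        number = int(path.suffix.lstrip("."))  # "example.3" -> 3
        groups["CIF" if number <= 4 else "SHELX"].append(path.name)
    return groups


if __name__ == "__main__":
    # CSDHOME is the installation root referred to in the text.
    csdhome = os.environ.get("CSDHOME")
    if csdhome:
        print(classify_examples(Path(csdhome) / "examples" / "prequest"))
```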


1.1 A Typical Example


1.2 More about Editing

PreQuest performs a large number of checks on the data fields of every entry. For convenience these can be divided into:

The individual data fields in each of these distinct classes can be easily edited from within the PreQuest program. This can be done using the Editing Function buttons available in the Main Menu.

1D Edit

This allows the user to edit the actual text of the record (in BCCAB format - see section 4.5). However, many entries can be satisfactorily changed by direct overtyping in the Data Boxes provided, e.g. Author, Compound etc.

When editing text fields in PreQuest:

Note that on starting PreQuest the default editor is set to "textedit", which is appropriate for Suns. If your computer is not a Sun you must select another editor from the list in the Preferences menu (see section 2.19). We recommend that Silicon Graphics (SG) users choose "jot", while other Unix users choose "xedit".

2D Edit

Allows modification of the chemical diagram. Specific editing of the diagram may be necessary to ensure that chemical bond types and atomic charges are correctly assigned. Users also have the option of completely redrawing the chemical diagram. This may be required, for example, if certain chemical conventions are to be adhered to. See section 2.16.

3D Check

The user can change the coordinates to correct any errors in the structure, and can suppress any unwanted atomic sites in disordered structures. See section 2.18.

When to Edit

As a general guide, if there is a "red" error status then some editorial action should be taken. PreQuest will allow erroneous entries to be archived, for example when they are missing fields or contain mistakes. However, it should be remembered that unchecked data greatly reduces the value of your database, and may result in search failures when using Quest. It is, therefore, well worth the effort at this stage to create records of as high a standard as possible.

There is an option to control the levels of error checking in the Preferences menu (see section 2.19). It is normally sufficient to use the Relaxed (minimal rules) option.


1.3 Treating Disorder

It is essential to edit the atomic coordinates for disordered sites. This is because the current Version 5 CSD software does not have storage facilities for disorder groups, site numbers, and occupancy factors. If a structure contains disorder (as occurs for ~12% of all CSD entries) then the minor occupancy sites must be suppressed to leave a single representative atomic position, which will then be matched against the chemical diagram. Suppressed atoms are not deleted; they simply take no further part in the establishment of the crystallographic connectivity, and therefore will not be used in Quest 3D searches.
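The rule described above (keep one representative position, suppress the rest, delete nothing) can be sketched in a few lines. This is only an illustration of the idea under stated assumptions, not PreQuest's own procedure: the Site record, its field names and the function suppress_minor_sites are hypothetical, and the highest occupancy is assumed to identify the representative site.

```python
from dataclasses import dataclass


@dataclass
class Site:
    label: str
    group: str          # hypothetical id shared by alternative positions; "" if ordered
    occupancy: float
    suppressed: bool = False


def suppress_minor_sites(sites):
    """Flag all but the highest-occupancy site among each set of alternatives."""
    best = {}
    for site in sites:
        if site.group and (site.group not in best
                           or site.occupancy > best[site.group].occupancy):
            best[site.group] = site
    for site in sites:
        # Ordered sites (empty group) are never suppressed.
        site.suppressed = bool(site.group) and best[site.group] is not site
    return sites  # same length as the input: nothing is deleted
```

In practice the choice of which sites to suppress is made interactively by the user; the sketch only shows the bookkeeping involved.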

How to deal with a typical example of a disordered structure is now illustrated using another example:


1.4 Control of Refcodes

The assignment of reference codes (refcodes) for your private database is controlled by an auxiliary file called prequest.refcodes. This file should be present in your root directory. If it is not present PreQuest will assign a 6-digit number in sequence starting at 000001.

prequest.refcodes consists of 8 lines, each of which defines the characters allowed at the corresponding position of the refcode assigned to each new data entry read. In this example the 8 lines are:

   S
   0123456789
   0123456789
   0123456789
   0123456789
   0123456789
   <blank>
   <blank>

This has the effect of always keeping character 1 as "S" and leaving characters 7 and 8 blank. The sequence of the codes will then be S00001, S00002, S00003 etc.

Between sessions PreQuest records the last refcode it generated in the 9th line of this file. A filename other than prequest.refcodes can be specified by using the environment variable CCDCNEWREFCODES.
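To make the numbering scheme concrete, here is a minimal sketch of how the next refcode could be derived from such a file. It is an illustration only and makes several assumptions that the guide does not confirm: blank positions are stored as empty lines and occur only at the end, the rightmost position varies fastest, and the function names (refcode_file_name, read_definition, next_refcode) are hypothetical. How PreQuest chooses the very first code is not covered here.

```python
import os


def refcode_file_name():
    """Use CCDCNEWREFCODES if set, otherwise the default file name."""
    return os.environ.get("CCDCNEWREFCODES", "prequest.refcodes")


def read_definition(text):
    """Return (character sets for the used positions, last refcode or None)."""
    lines = text.splitlines()
    sets = [line.strip() for line in lines[:8]]
    sets = [s for s in sets if s]  # assumption: blank lines mark unused positions
    last = lines[8].strip() if len(lines) > 8 else None
    return sets, last


def next_refcode(position_sets, last):
    """Advance `last` by one step, rightmost position varying fastest."""
    chars = list(last)
    if len(chars) != len(position_sets):
        raise ValueError("refcode length does not match the definition")
    for i in reversed(range(len(chars))):
        allowed = position_sets[i]
        index = allowed.index(chars[i])
        if index + 1 < len(allowed):
            chars[i] = allowed[index + 1]
            return "".join(chars)
        chars[i] = allowed[0]  # roll over and carry to the left
    raise OverflowError("refcode sequence exhausted")
```

With the 8-line example shown above and a 9th line of S00001, this sketch would produce S00002 as the next code.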

It should also be noted that the CIF field _database_code_CSD can be used from within a CIF to specify a refcode for the structure as it is read into PreQuest.
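Before reading a batch of CIFs it can be useful to see which of them already carry a refcode. The following is a rough sketch, not a full CIF parser: the function name csd_refcode_from_cif is hypothetical, and it assumes the value sits on the same line as the _database_code_CSD data name. The CIF placeholders "?" and "." are treated as "no refcode given".

```python
import re


def csd_refcode_from_cif(cif_text):
    """Return the value of _database_code_CSD, or None if absent or unset."""
    match = re.search(r"^[ \t]*_database_code_CSD[ \t]+(\S+)",
                      cif_text, flags=re.MULTILINE)
    if not match:
        return None
    value = match.group(1).strip("'\"")
    return None if value in {"?", "."} else value
```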


1.5 Levels of Checking

At the CCDC, production of the main database requires that all entries pass a series of detailed and elaborate checks. This ensures that the data is of the highest standard and integrity. PreQuest incorporates two levels of checking that reflect this: Strict (full CCDC rules) and Relaxed (minimal rules). Most users will find that the Relaxed setting is adequate for private database creation, and this is the default setting in the Preferences menu.

However, some words of warning are needed. It is, for example, not necessary to give an author or compound name, but records without them can never be retrieved by Quest using these search parameters. The absolute minimum for an entry is to have a journal reference. If you switch on the Strict setting you will find that the author, compound name, formula, cell and class are also required, as they are at the CCDC.
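The difference between the two levels, as far as required fields go, can be summarised in a short sketch. This is not PreQuest's checking code: the field keys and the function missing_fields are hypothetical names chosen for illustration, and only the field requirements stated above are modelled (the real checks cover much more than presence of a value).

```python
# Required fields per checking level, as described in the text above.
REQUIRED = {
    "relaxed": {"journal"},
    "strict": {"journal", "author", "compound", "formula", "cell", "class"},
}


def missing_fields(entry, level="relaxed"):
    """List the required fields that are absent or empty in `entry`."""
    return sorted(name for name in REQUIRED[level] if not entry.get(name))
```

An entry holding only a journal reference would pass this test at the Relaxed level but report five missing fields at the Strict level.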

The Check menu presents options to switch checking on or off at the 1D, 2D and 3D levels. We advise keeping all of these checks ON. Switching them off can result in important errors being missed, which may make the entry unsearchable by Quest.


1.6 Recommended Minimum Data Fields

The recommended minimum data for private databases are:

Note that for space groups which require exact cell angle values (e.g. for P212121 the cell angles alpha = beta = gamma = 90 degrees) the alpha, beta and gamma data fields for the entry may show no values. This is because they are exact and are therefore considered as "redundant" or "assumed" for that system. (For more information see #CELL - section 4.5.)
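The idea of "assumed" angles can be illustrated with a small lookup that restores the values fixed by symmetry. This is a sketch under stated assumptions rather than a description of how PreQuest stores cells: the table FIXED_ANGLES, its keys and the function complete_angles are hypothetical, only a few crystal systems are listed, and the monoclinic entry assumes the b axis is unique.

```python
# Angles (in degrees) fixed by symmetry for some crystal systems.
FIXED_ANGLES = {
    "triclinic": {},
    "monoclinic_b": {"alpha": 90.0, "gamma": 90.0},
    "orthorhombic": {"alpha": 90.0, "beta": 90.0, "gamma": 90.0},
    "tetragonal": {"alpha": 90.0, "beta": 90.0, "gamma": 90.0},
    "cubic": {"alpha": 90.0, "beta": 90.0, "gamma": 90.0},
    "hexagonal": {"alpha": 90.0, "beta": 90.0, "gamma": 120.0},
}


def complete_angles(system, stored):
    """Combine stored angles with those implied by the crystal system."""
    angles = dict(FIXED_ANGLES[system])
    angles.update({k: v for k, v in stored.items() if v is not None})
    missing = {"alpha", "beta", "gamma"} - angles.keys()
    if missing:
        raise ValueError(f"angles not determined: {sorted(missing)}")
    return angles
```

For an orthorhombic space group such as P212121 no angles need to be stored at all, which is why the corresponding data fields may appear empty.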

We also recommend that any textual information useful for retrieving items within your local context be added. For example:

   Synonym     Compound 2317P     Lab number 567894
   Qualifier   antibiotic activity
   Remarks     Refinement incomplete - see file wxyz.res
   Properties  Phi 56.7

Most of the data fields that are presented to the user have obvious meanings. They relate to the data fields of the BCCAB format specification (see section 4.5). Of the slightly less obvious 1D data fields:
