- Geometrical calculations for complete CSD entries.
- Location of 3D substructural fragments (intramolecular or intermolecular)
within the crystallographic connectivity records of the CSD.
- Geometrical calculations for substructural fragments.
- Statistical and numerical analyses of fragment geometry.
- Output of atomic coordinate data, in a variety of forms, for complete CSD
entries or for substructural fragments.

- Intramolecular: based on the connectivity of bonded units (residues) in the
crystal chemical unit of each CSD entry.
- Intermolecular: based on the extended connectivity of the complete crystal
structure, constructed using van der Waals radii supplied by the user.

- A subfile of CSD entries in FDAT format. This subfile is generated by the
program QUEST in response to a particular search query.
- A file containing alphanumeric instructions to control the operation of the
GSTAT program.

- A file of atomic coordinates for use by external modelling software.
- A file of fragment geometry, as defined by the user, for use by external
statistical, numerical or visualization software.
- Files of information generated by one (or more) of the statistical
methodologies contained within GSTAT itself.

Within the graphics version, much of the functionality of GSTAT (except the ability to perform statistical analyses) has been transferred to the upgraded QUEST3D program. In this version, the processes of statistical analysis and data visualization are being transferred to a new menu-driven, interactive program called VISTA.

All releases of GSTAT issued after October 1st 1992 are interfaced to the
graphics Version QUEST3D program *via* the "Fragment" file generated by
that program. This interface permits GSTAT to take advantage of the improved
precision of 2D/3D substructure searches that is afforded by QUEST3D.

Volume 4 (Chapters 1 - 9) of the CSD System Documentation describes the functions of GSTAT and the structure of the alphanumeric "Instruction File" through which these functions are accessed.

A central part of Volume 4 is Chapter 9, which presents a definitive glossary of GSTAT keywords and qualifiers, arranged in keyword order.

The earlier Chapters 2 through 4 of Volume 4 describe the operation of individual sections of the program, together with the associated keyword subset and illustrations of program input and output. Chapters 5 through 8 deal with miscellaneous background information essential to a full understanding of the GSTAT program.

The remainder of this introduction presents brief summaries of the functionality of GSTAT and its mode of operation. The introduction is ordered according to the Chapter titles of Volume 4. Items in bold upper case font in Sections 14.2 - 14.4 below are the relevant major GSTAT keywords.

- **CALC**ulation of standard **INTRA**molecular geometry: bond lengths, valence angles and torsion angles.
- **CALC**ulation of **COORD**ination sphere geometry: distances and angles about a selected element within some specified radius.
- **CALC**ulation of **INTER**molecular distances involving specified elements within specified distance limits.
- **OUTPUT** of atomic **COORD**inates for the complete crystal chemical unit in the following forms:
  - Fractional coordinates referred to crystallographic axes.
  - Cartesian coordinates based on crystallographic axes.
  - Cartesian coordinates referred to inertial axes.
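
As an illustration of the relationship between the first two coordinate forms, the following minimal Python sketch converts fractional coordinates to Cartesian coordinates. It is not GSTAT code: the function name `frac_to_cart` and the axis convention (x along a, y in the ab plane) are assumptions made for this example, and GSTAT's own orthogonalization convention may differ.

```python
import math

def frac_to_cart(frac, a, b, c, alpha, beta, gamma):
    """Convert fractional coordinates to Cartesian coordinates.

    Cell lengths a, b, c share the units of the result; angles are in degrees.
    Assumed convention: x lies along a, y lies in the ab plane.
    """
    ca, cb, cg = (math.cos(math.radians(t)) for t in (alpha, beta, gamma))
    sg = math.sin(math.radians(gamma))
    # Cell volume divided by a*b*c
    v = math.sqrt(1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg)
    x, y, z = frac
    return (
        a * x + b * cg * y + c * cb * z,
        b * sg * y + c * (ca - cb * cg) / sg * z,
        c * v / sg * z,
    )

# Example call: an atom at the centre of an orthorhombic cell
centre = frac_to_cart((0.5, 0.5, 0.5), 10.0, 20.0, 30.0, 90.0, 90.0, 90.0)
```

Coordinates referred to inertial axes would require a further rotation onto the principal axes of the atom set, which is not shown here.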

- **FRAG**: specification of the substructural fragment.
- **SETUP**: specification of geometrical objects such as centroids, vectors and planes.
- **DEF**ine: definition and naming of numerical parameters to be generated for each occurrence of the specified chemical fragment. These parameters may be:
  - Data-entry parameters, *e.g.* R-factor, space group number, *etc.*
  - Simple geometrical parameters calculated from the atomic positions and/or the geometrical objects established *via* the SETUP command.
  - Special geometrical parameters, such as ring puckering parameters, generated *via* special functions within GSTAT.
- **TRANS**: formation of linear combinations of DEFINE'd parameters *via* simple FORTRAN-like statements, *e.g.* addition of two parameters, absolute value of a given parameter, *etc.*
- **SEL**ect: selection of located fragments on the basis of their 3D geometrical characteristics, *e.g.* a specified torsion angle must fall within a specified numerical range, *etc.*
- **KILL**/**KEEP**: elimination/retention of specific fragments by use of their reference number(s).
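
The idea of defining a geometrical parameter and then selecting fragments on its value can be sketched in Python as follows. This is only an illustration under stated assumptions, not GSTAT syntax: the names `torsion` and `select`, and the dictionary layout of a fragment record, are hypothetical.

```python
import math

def _sub(p, q):
    return tuple(pi - qi for pi, qi in zip(p, q))

def _dot(p, q):
    return sum(pi * qi for pi, qi in zip(p, q))

def _cross(p, q):
    return (p[1] * q[2] - p[2] * q[1],
            p[2] * q[0] - p[0] * q[2],
            p[0] * q[1] - p[1] * q[0])

def torsion(p1, p2, p3, p4):
    """Signed torsion angle p1-p2-p3-p4 in degrees, from Cartesian points."""
    b1, b2, b3 = _sub(p2, p1), _sub(p3, p2), _sub(p4, p3)
    n1 = _cross(b1, b2)   # normal to the plane p1, p2, p3
    n2 = _cross(b2, b3)   # normal to the plane p2, p3, p4
    y = _dot(b2, _cross(n1, n2)) / math.sqrt(_dot(b2, b2))
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

def select(rows, name, low, high):
    """Keep fragment records whose parameter `name` lies in [low, high]."""
    return [row for row in rows if low <= row[name] <= high]

# Hypothetical fragment records: a reference number plus one named parameter
fragments = [
    {"id": 1, "tor1": 62.0},
    {"id": 2, "tor1": -175.0},
    {"id": 3, "tor1": 58.5},
]
gauche = select(fragments, "tor1", 50.0, 70.0)
```

A transformation such as the absolute value of a parameter would be applied to each record before the selection step.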

Special and more detailed sections discuss the facilities available for (a) treating fragments that may exist in 'chiral' and 'achiral' chemical environments, and (b) handling the effects of topological symmetry that may occur in some simple chemical fragments.

The processes described in Chapter 3 of Volume 4 result in the generation of a data matrix of Nf rows and Np columns, where Nf is the number of fragments that are located and which pass the 3D selection procedures, and Np is the (fixed) number of numerical parameters specified for each fragment.

- Simple Descriptive Statistics: provides a listing of the data matrix and,
for each column (variable) calculates the mean, the maximum and minimum values,
the sample standard deviation, the standard deviation of the mean, and the
number of observations.
- Visual Display of Geometrical Data: generation of
  - **HIST**ogram(s) for individual variable(s).
  - **SCAT**tergram(s) for pair(s) of variable(s).
- Visual Display of 3D Structures:
  - **OUTPUT** of atomic **COORD**inates for each occurrence of the fragment in the styles described at 14.2 above, for use by external plotting packages.
  - **SUP**erposition of retrieved fragments using inertial axis coordinates and least-squares fitting, for use by external plotting packages.
- **CHI**-squared analyses of distribution(s) of individual variable(s).
- Generation of **COR**relations, covariances and individual parameter variances in matrix form.
- Simple linear **REGR**ession of one parameter on another.
- **P**rincipal **C**omponent (**FAC**tor) **A**nalysis of a number of variables selected from the complete data matrix. Scattergrams of principal component scores often provide a valuable visual overview of the multivariate data matrix, and can indicate the presence of groups (clusters) of fragments having similar geometry.
- **CLUST**er Analysis based on a number of variables selected from the complete data matrix: numerical dissection of the dataset into groups (clusters) of fragments having similar geometry.
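
The simple descriptive statistics listed above can be reproduced for any exported data matrix with a few lines of Python. The sketch below is illustrative only and is not taken from GSTAT: the function names `describe` and `describe_matrix` and the column names are hypothetical, and the matrix is assumed to be a list of Nf rows, each holding Np numbers.

```python
import math
import statistics

def describe(column):
    """Summary statistics for one variable (one column of the data matrix)."""
    n = len(column)
    sd = statistics.stdev(column)        # sample standard deviation (n - 1)
    return {
        "n": n,
        "mean": statistics.mean(column),
        "min": min(column),
        "max": max(column),
        "sd": sd,
        "sd_mean": sd / math.sqrt(n),    # standard deviation of the mean
    }

def describe_matrix(matrix, names):
    """Apply describe() to every column of an Nf x Np data matrix."""
    columns = zip(*matrix)               # transpose rows into columns
    return {name: describe(list(col)) for name, col in zip(names, columns)}

# Made-up example: three fragments, two parameters each
matrix = [[1.50, 110.0], [1.52, 112.0], [1.54, 114.0]]
summary = describe_matrix(matrix, ["dist1", "ang1"])
```

Each column needs at least two observations, since the sample standard deviation is undefined for a single value.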

GSTAT has the ability to override this default so as to construct a distance-based intramolecular connectivity table which uses covalent radii that are input by the user.

GSTAT also has the ability to construct an intermolecular connectivity representation using van der Waals radii for specific elements that are input by the user. This intermolecular connectivity can then be used to locate fragments that involve hydrogen-bonded and non-bonded interactions between specified elements.
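
A distance-based connectivity table of the kind described above can be sketched in Python as follows. This is a hedged illustration rather than GSTAT's algorithm: the function name `build_connectivity`, the additive `tolerance` term and the radii shown are assumptions, and the atoms are assumed to be supplied in Cartesian coordinates.

```python
import math
from itertools import combinations

def build_connectivity(atoms, radii, tolerance=0.4):
    """Return index pairs (i, j) of atoms treated as connected.

    atoms: list of (element, (x, y, z)) tuples in Cartesian angstroms.
    radii: user-supplied radius per element symbol.
    Two atoms are connected when their separation does not exceed the sum
    of their radii plus `tolerance` (the tolerance is an assumption here).
    """
    bonds = []
    for (i, (el_i, xyz_i)), (j, (el_j, xyz_j)) in combinations(enumerate(atoms), 2):
        limit = radii[el_i] + radii[el_j] + tolerance
        if math.dist(xyz_i, xyz_j) <= limit:
            bonds.append((i, j))
    return bonds

# Illustrative radii and atoms only; real values are supplied by the user
radii = {"C": 0.68, "O": 0.68, "H": 0.23}
atoms = [("C", (0.0, 0.0, 0.0)), ("O", (1.2, 0.0, 0.0)), ("H", (3.0, 0.0, 0.0))]
bonds = build_connectivity(atoms, radii)
```

The same test with van der Waals radii in place of covalent radii gives the flavour of the intermolecular case, although a full treatment would also have to generate symmetry-related neighbouring molecules, which this sketch omits.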

Chapter 5 of Volume 4 describes how to modify the connectivity representations that are used by GSTAT in the location of chemical fragments. It also describes the differences that exist between the crystallographic connectivity tables employed in GSTAT, and the chemical connectivity tables that are used by the CSD programs QUEST (basic) and QUEST3D (graphics).

The QUEST3D-GSTAT link described in Chapter 7 of Volume 4 provides a mechanism through which GSTAT may take advantage of the integrated fragment location algorithms of QUEST3D. By this means, the fragment search mechanisms of GSTAT are circumvented, and the program will operate on the atoms identified in the more precise QUEST3D search.

These files consist of (a) files of atomic coordinates, (b) files of fragment geometry (the data matrix referred to in Sections 14.3 and 14.4), and (c) various files that are specific to one or more of the statistical and numerical techniques summarized in Section 14.4.

The glossary is ordered alphabetically by keyword for ease of reference.
