Volume 1 Chapter 7 Choosing and Naming Geometric Parameters


7.8 Selecting Suitable 3D Search Constraints

This is the most important problem facing the user of any 3D structural search system. Since we can only search a database for what we know about, the answer depends crucially on our existing knowledge of 3D structural chemistry in geometric terms. Unless we have a considerable body of geometrical knowledge available to us, we are going to have difficulty in phrasing an adequate 3D search question.

This problem does not affect searches of text or of 2D connection tables. Here, as we have already seen, we are searching discrete, explicit and well-known data items. A 2D search is conducted in terms of the language of 2D structural chemistry: a human invention and a formalism that underpins all teaching and communication in the subject.

The 3D case is very different. First, as described above, the choice of geometrical parameters that are to be made explicit as search terms is entirely in the hands of the user. Secondly, and most importantly, the chosen geometrical parameters are not fixed quantities, but vary from molecule to molecule and from crystal fragment to crystal fragment. A double bond in the 2D chemical representation is invariant throughout the CSD (viz. bond type 2). The same double bond in the 3D representation has a length that varies over a range of real values from one crystal fragment to another. The geometry of a molecule is thus peculiar to the molecule alone; it is not a human formalism invented for the purposes of communication or comparison.

All of this preamble may seem very obvious, but these rather basic considerations are vitally important in pointing out both the benefits and the pitfalls of the 3D search process.

A 3D search based on a single item of geometrical information, e.g. the nitrogen to ring-centroid distance (DIST) in the morphine example, simply retrieves a portion of the complete distribution of the DIST parameter. The selection of the upper and lower bounds of this slice has been made using prior knowledge of the value of DIST in parent morphine. In this case, the real search question is: "How many fragments in the CSD have a geometry that is similar to parent morphine?" The assumption has been made that DIST is a suitable parameter that characterises that similarity.
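The idea that a single-parameter search simply cuts a slice out of a distribution can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the fragment labels, the DIST values, the reference value and the tolerance are all invented for the example and are not taken from morphine or from the CSD, and `select_slice` is a hypothetical helper, not part of any CSD software.

```python
def select_slice(dist_values, lower, upper):
    """Return the fragments whose DIST value lies within [lower, upper]."""
    return {ref: d for ref, d in dist_values.items() if lower <= d <= upper}


# Hypothetical DIST values (in angstroms) for five made-up fragments.
fragments = {
    "FRAG01": 4.21,
    "FRAG02": 4.38,
    "FRAG03": 5.10,
    "FRAG04": 3.62,
    "FRAG05": 4.45,
}

# Assumed prior knowledge: a reference DIST and a tolerance chosen by the user.
reference_dist = 4.40  # hypothetical value, not the real morphine distance
tolerance = 0.20       # hypothetical half-width of the search window

hits = select_slice(fragments, reference_dist - tolerance, reference_dist + tolerance)
print(sorted(hits))
```

The search returns only those fragments that fall inside the chosen window. Everything outside the window is invisible to the user, which is why the choice of bounds, and of DIST itself as the similarity measure, matters so much.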

So, we return to the question of 'prior knowledge' as the crucial factor in phrasing a 3D search. In practice, our prior knowledge of 3D geometric structure is rather sparse. At a coarse level it has been derived from a consideration of hybridisation models. At a more detailed level, knowledge comes from microwave spectroscopy, gas-phase electron diffraction and, primarily, from surveys of crystal structure data.

Because the results of these surveys have been published, we do have adequate knowledge in some specific areas, e.g. standard bond lengths involving a wide variety of element pairs*, limiting covalent and van der Waals radii, valence angles that define a variety of coordination geometries, torsion angles that describe a variety of conformations for rings and chains, non-bonded distances that describe certain specific pharmacophores, etc.

All of these data sources represent vital knowledge of 3D structure. However, by their very nature, they are generalisations over a very broad range of chemical fragments. They may not be directly applicable to a highly specific fragment that is of interest to a specific user of the CSD.

For all of these reasons, the process of 3D search is intimately bound up with the process of 3D research, i.e. the process of surveying all of the available geometrical information for the fragment under study before deciding:

to obtain the desired results in a 3D search.
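One possible way of turning such a survey into search limits is sketched below in Python. It is an assumption made for illustration, not a procedure prescribed by this manual: the observed values are invented, `survey_bounds` is a hypothetical helper, and the choice of mean plus or minus two sample standard deviations is just one convention among many.

```python
import statistics


def survey_bounds(values, k=2.0):
    """Return (lower, upper) limits as mean -/+ k sample standard deviations."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1)
    return mean - k * sd, mean + k * sd


# Hypothetical surveyed values of one geometrical parameter (in angstroms).
observed = [1.31, 1.32, 1.33, 1.34, 1.35]

lower, upper = survey_bounds(observed)
print(f"search window: {lower:.3f} to {upper:.3f}")
```

A window derived this way is only as good as the survey behind it. A skewed or multimodal distribution would call for inspecting the histogram directly rather than relying on a mean and a standard deviation.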

* See, for example,


Volume 1 Chapter 7 3D Geometrical Surveys as Precursors to 3D Searching.