
3.6 Similarity Searching


A similarity search compares the chemical attributes of a query structure with those of each database entry (Vol.1, p.6-39 to 6-52).

The technique is useful since:

Similarity Coefficients (Willett et al., J. Chem. Inf. Comput. Sci. 26, 36, 1986)

Two coefficients are available: Tanimoto (TANI) and Dice (DICE).

Both yield values between 0.0 (maximum dissimilarity) and 1.0 (maximum similarity, i.e. identity).
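
As an illustration of how the two coefficients behave, the following Python sketch computes them with their standard set-based definitions: Tanimoto = c / (a + b - c) and Dice = 2c / (a + b), where a and b are the numbers of attributes present in the query and in the database entry, and c is the number they have in common. The attribute labels, the function names and the choice to return 1.0 for two empty attribute sets are assumptions made for this example; the way the search program itself encodes chemical attributes is not described here.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto coefficient: c / (|a| + |b| - c), c = shared attributes."""
    common = len(a & b)
    denom = len(a) + len(b) - common
    # Assumed convention: two empty attribute sets count as identical.
    return common / denom if denom else 1.0


def dice(a: set, b: set) -> float:
    """Dice coefficient: 2c / (|a| + |b|), c = shared attributes."""
    common = len(a & b)
    denom = len(a) + len(b)
    # Same assumed convention for two empty sets.
    return 2 * common / denom if denom else 1.0


# Hypothetical attribute labels, invented for illustration only.
query = {"C-N", "C-O", "C=O", "NH2"}
entry = {"C-N", "C-O", "OH", "NH2", "C-C"}

print(tanimoto(query, entry))  # 3 shared / (4 + 5 - 3) = 0.5
print(dice(query, entry))      # 2 * 3 / (4 + 5), about 0.667
```

For the same pair of attribute sets the Dice value is never lower than the Tanimoto value, so the scores produced by the two coefficients are not directly comparable with each other.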

2D Similarity Search Procedure

  1. Draw the structure in the BUILD menu and define it as precisely as possible by specifying exact hydrogen counts, bond types, etc. using the commands in the 2D-CONSTRAIN sub-menu.

  2. Select the similarity coefficient to be used (TANI or DICE) in the 2D-CONSTRAIN sub-menu.

  3. Select SIMIL in the 2D-CONSTRAIN sub-menu to define the fragment for the similarity search.

  4. Select STOP-LIMIT in the SEARCH menu to specify the number of hits, e.g. 30.

  5. Start the search by selecting the HITALL and START commands.

  6. At the end of the search, the top n hits are ranked by decreasing value of the similarity coefficient, and a sub-database of these n hits is saved for subsequent searching.
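
The ranking in step 6 can be pictured with a short Python sketch: every entry is scored against the query, the scores are sorted in decreasing order, and only the first n hits (the STOP-LIMIT value) are kept. The entry names, attribute labels and the `top_hits` helper are hypothetical and exist only for this example; this is not the search program's own implementation.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto coefficient: c / (|a| + |b| - c), c = shared attributes."""
    common = len(a & b)
    denom = len(a) + len(b) - common
    return common / denom if denom else 1.0


def top_hits(query, database, limit, coefficient=tanimoto):
    """Return the `limit` best (entry name, score) pairs, best first."""
    scored = [(name, coefficient(query, attrs)) for name, attrs in database.items()]
    # Sort by decreasing similarity; ties keep their database order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


# Hypothetical query and database, invented for illustration only.
query = {"C-N", "C-O", "NH2", "OH"}
database = {
    "ENTRY-A": {"C-N", "C-O", "NH2", "OH"},
    "ENTRY-B": {"C-N", "C-O", "NH2"},
    "ENTRY-C": {"C-C", "C=C"},
    "ENTRY-D": {"C-N", "NH2", "C-C", "C=O"},
}

# A stop limit of 3 keeps the three most similar entries.
for name, score in top_hits(query, database, limit=3):
    print(f"{name}  {score:.3f}")
```

Here ENTRY-A is identical to the query and ranks first with a score of 1.0, while ENTRY-C shares no attributes with the query and falls outside the limit of three hits.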

2D Similarity Example

Menu Commands Required

1. Draw fragment in BUILD menu:

2. Constrain N and O to be NH2 and OH:

3. Specify fragment as precisely as possible:

4. Fragment definition is now complete:

Initiate CSD Similarity Search: follow steps 4 to 6 of the search procedure described above.

Similarity Search Results
