
APPENDIX 13 - NOTES FOR UNIX USERS


This Appendix provides information for users of the Cambridge Structural Database System Unix implementation for Silicon Graphics 4D series and Sun (Sparc) workstations.

13.1. Introduction - Features of the Unix Implementation

This section covers some of the features within the QUEST environment under Unix which are different from the Machine Independent Package (MIP).

You do not need to know how these features work in order to use them, but an idea of how they operate can help in understanding how the program behaves in given situations.

13.1.1. SPLIT DATABASE

In the MIP package the information items for the entries are kept together in a single sequential file. This results in two major drawbacks:


* To position the file at a given entry it is necessary to read all the entries between your current location and the entry to which you wish to go. Thus a search starting at a refcode towards the end of the database can take a considerable time to reach the required location in the file before even starting to search.


* When the program is searching and an entry fails a test on the MASK record then all the rest of the data for that entry must be read before the next entry can be checked.

The Unix version of QUEST solves these problems by separating the MASK records from the Text + Connectivity (TEXT/CONN) and the Structural Data (DATA) records.

The files used in these cases have the suffixes .scr and .tcd. The main data files are:

	aser<nn>.scr
	aser<nn>.tcd

where <nn> is the 2-digit product identification code, given by the CSD2DID environment variable. If you create a subset of the database then QUEST will produce a matching set of files based on the problem name, for example, test1.scr and test1.tcd.

When an entry has passed the screening tests on its MASK record it is then necessary to obtain the next section of the information for that entry. This information is stored in the aser<nn>.tcd file. A pointer in the MASK record is used to locate the corresponding data in the .tcd file.

The information for an entry is divided in the .tcd file into TEXT/CONN and DATA records. Once an entry has passed the MASK tests, its TEXT/CONN record is read, followed by the DATA record only if required. This means, for example, that a search on an author's name can be decided as soon as the MASK tests have been passed and the TEXT/CONN record has been read, so the usually larger DATA record does not have to be read.
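The lookup described above can be illustrated with a minimal Python sketch. The record layouts are invented for illustration only (this Appendix does not describe the real .scr and .tcd formats): each index record holds a refcode, a screening bitmask and a byte offset into a separate data file, and each data record carries a length prefix.

```python
import io
import struct

# Hypothetical index record: 8-byte refcode, 4-byte screen bitmask, 8-byte offset.
INDEX_RECORD = struct.Struct("<8sIQ")
# Hypothetical data record framing: a 4-byte length before each payload.
LENGTH_PREFIX = struct.Struct("<I")


def build_files(entries):
    """Build in-memory stand-ins for the index (.scr) and data (.tcd) files.

    entries: iterable of (refcode, mask, text_conn, data) tuples.
    """
    scr, tcd = io.BytesIO(), io.BytesIO()
    for refcode, mask, text_conn, data in entries:
        offset = tcd.tell()
        for payload in (text_conn, data):  # TEXT/CONN first, then DATA
            tcd.write(LENGTH_PREFIX.pack(len(payload)))
            tcd.write(payload)
        scr.write(INDEX_RECORD.pack(refcode.encode("ascii").ljust(8), mask, offset))
    return scr, tcd


def read_record(stream):
    """Read one length-prefixed record at the current position."""
    (length,) = LENGTH_PREFIX.unpack(stream.read(LENGTH_PREFIX.size))
    return stream.read(length)


def search(scr, tcd, required_bits, text_test):
    """Return refcodes that pass the bitmask screen and the TEXT/CONN test.

    Only the small index records are scanned sequentially. The data file is
    touched only for entries that pass the screen, and the DATA record is
    never read by this search.
    """
    hits = []
    scr.seek(0)
    while True:
        raw = scr.read(INDEX_RECORD.size)
        if len(raw) < INDEX_RECORD.size:
            break
        refcode, mask, offset = INDEX_RECORD.unpack(raw)
        if mask & required_bits != required_bits:
            continue  # failed the screen: nothing more is read for this entry
        tcd.seek(offset)  # jump straight to the entry through its pointer
        if text_test(read_record(tcd)):
            hits.append(refcode.decode("ascii").strip())
    return hits


scr, tcd = build_files([
    ("AAAAAA", 0b01, b"author=Smith", b"coords-1"),
    ("BBBBBB", 0b11, b"author=Jones", b"coords-2"),
    ("CCCCCC", 0b11, b"author=Smith", b"coords-3"),
])
hits = search(scr, tcd, 0b10, lambda text: b"Smith" in text)
print(hits)
```

In this toy data set only the third entry both passes the screen and matches the author test; the first entry is rejected from its index record alone.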

13.1.2. FOREGROUND/BACKGROUND PROCESSES

This feature of the Unix implementation is similar to the VAX/VMS implementation. In the MIP implementation, searching of the database stops when an entry is hit and does not resume until the user has decided whether to keep or reject the entry.

In the foreground/background approach the background process continues to search for the next hit while the user is studying the last hit in the foreground process. This reduces the time for a typical interactive search as the user does not spend as much time waiting between hits.
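The idea can be sketched in Python as a producer and a consumer. This is an illustration of the general technique only: it uses a thread and a queue, whereas QUEST's own mechanism is not described here, and the function names are hypothetical.

```python
import queue
import threading


def background_search(entries, is_hit, hits):
    """Producer: keep scanning and queue every hit without waiting for the user."""
    for entry in entries:
        if is_hit(entry):
            hits.put(entry)
    hits.put(None)  # sentinel: the search has finished


def review_hits(entries, is_hit, decide):
    """Consumer: present hits one at a time while the search runs ahead."""
    hits = queue.Queue()
    worker = threading.Thread(target=background_search, args=(entries, is_hit, hits))
    worker.start()
    kept = []
    while True:
        entry = hits.get()  # blocks only if the next hit has not been found yet
        if entry is None:
            break
        if decide(entry):  # stands in for the user's keep/reject decision
            kept.append(entry)
    worker.join()
    return kept


kept = review_hits(range(1, 21), lambda n: n % 5 == 0, lambda n: n != 10)
print(kept)
```

Because the producer never waits for the keep/reject decision, the next hit is usually already queued by the time the user has finished with the current one.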

13.1.3. 3D SCREENS

The use of the 3D screens applies to all implementations of Graphics QUEST3D and information on 3D screens is given in chapter 7 of Volume 1.

13.2. QUEST

This section describes ways in which QUEST may be used under Unix.

13.2.1. USING THE QUEST SHELL SCRIPT

To carry out a QUEST search under Unix, the command line may contain three pieces of information in addition to the quest command:

root_filename A name of your choice used by the QUEST program to produce output files of the generic name root_filename.xxx where xxx indicates the contents of the file (listed below in section 13.2.4 "QUEST Files").

It is good practice to choose a root_filename that describes the search you are doing, for example sugars. This will make files easier to locate after the search.

input_option If you wish to use QUEST commands already set up in a question file prior to invoking QUEST, you should enter the filename of the question file here.

If you wish to carry out an interactive search of the database, you may use the /dev/tty device file.

Note: The Unix implementations of QUEST are, by default, interactive, unlike the VAX/VMS implementation. Thus, the /dev/tty argument is optional under Unix, providing that you are searching the Master database.

database The CSD master database is the default database; therefore this piece of information may be omitted if the search is to be made on the master database.

If, however, the search is to be made on a subset of the database, the filename of the subset should be entered.

Note that if a search is to be carried out on a subset of the master database, all three pieces of information must be present.

(Subsets of the master database are created by using the SAVE ASER command within QUEST, or the CCDC RETFIL command)

A QUEST search is initiated using this information by typing:

	<PROMPT>	quest root_filename input_option database

The following table summarises the various search commands that can be used to initiate QUEST.

| Command line example | root_filename | input_option | database |
| --- | --- | --- | --- |
| quest a /dev/tty | a | Typed interactively | CSD Master |
| quest a | a | Typed interactively | CSD Master |
| quest a b | a | file b.que | CSD Master |
| quest a /dev/tty c | a | Typed interactively | file c |
| quest a b c | a | file b.que | file c |
| quest a a c | a | file a.que | file c |
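The positional rules summarised above can be written out as a short Python sketch. The function and type names are hypothetical and this is not the quest shell script itself; it only shows how the words after the quest command map onto the three slots and which defaults apply.

```python
from typing import NamedTuple, Sequence


class QuestArgs(NamedTuple):
    root_filename: str
    input_option: str  # question file, or /dev/tty for interactive input
    database: str      # subset name, or the MASTER label below


MASTER = "CSD Master"  # label used in this sketch only; not a real file name


def interpret(args: Sequence[str]) -> QuestArgs:
    """Map the words that follow 'quest' onto the three positional slots."""
    if not 1 <= len(args) <= 3:
        raise ValueError("expected: root_filename [input_option [database]]")
    root = args[0]
    input_option = args[1] if len(args) >= 2 else "/dev/tty"
    database = args[2] if len(args) == 3 else MASTER
    return QuestArgs(root, input_option, database)


print(interpret(["a"]))
print(interpret(["a", "/dev/tty", "c"]))
```

Because the slots are positional, a subset can only be named when the input option is also present, which is why all three items are required for a subset search. The sketch passes the input option through exactly as typed and does not model how the script locates the question file.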

Example 1 - Interactive Searches

An interactive search of the CSD master database may be initiated by typing:

	<PROMPT>	quest output /dev/tty

	or simply
	
	<PROMPT>	quest output

This will allow you to enter the question commands immediately.

Note that the output files from this search will be named output.xxx.

Example 2 - Question Files

If your QUEST search commands are in a file called myfile.que, QUEST is initiated by typing:

	<PROMPT>	quest output myfile.que

Note that the output files from this search will be named output.xxx.

Example 3 - Searching Database Subsets

In the examples above, the searches will be carried out on the default database, the CSD master database. However, it is also possible to carry out a search of a subset of the master database.

The SAVE ASER command allows you to create a database containing only those entries which are hits for a question.

For example, in Example 2 above, if SAVE ASER had been one of the command lines included in the myfile.que question file, then the database files output.scr and output.tcd would have been created by QUEST for all those entries that satisfied the rest of the command lines in the question file.

To carry out an interactive search on this subset, the following command should be issued:

	<PROMPT>	quest    newfile    /dev/tty    output

This will create output files of the generic name newfile.xxx.

If you wanted to carry out a search of the database subset using a new question file, e.g. specific.que, and saving the output to files prefixed with newfile, you would issue the command:

	<PROMPT>	quest    newfile    specific.que   output

However, if you wanted to carry out a search of the database subset using a question file newfile.que, you would have to issue the command:


	<PROMPT>	quest    newfile    newfile.que    output

The command line

	<PROMPT>	quest    newfile    newoutput.que

would be interpreted as meaning a search of the CSD master database using a query file called newoutput.que.

Note that the Unix implementations do not add the extension ".que" to the query file you specify, unlike the VAX/VMS implementation, where the default extension is ".que".

13.2.2. MAINTAINING MULTIPLE QUEST INITIALISATION AND CONFIGURATION FILES

Chapter 13 in Volume 1 ("Customising Your QUEST/QUEST3D Implementation") explains how the Unix implementations of QUEST search for QUEST Initialisation files in the order:

If the environment variable QUESTINIT is assigned a value then this file is used. For example, you can type:

	<PROMPT>	setenv   QUESTINIT    my_init.que    

and QUEST will use the instructions in my_init.que every session.

To remove a previous assignment, type:

	<PROMPT>	unsetenv    QUESTINIT

If there is a file in the current directory called quest_init.que then this file is used.

If there is a file in the user's home directory called quest_init.que then this file is used.

Otherwise, the file $CSDEXEC/quest_init.que will be used.

The release copy of this file contains only comments but your local system manager can always add instructions to this file if everyone at your site uses a common set of instructions.

Similarly, QUEST3D Configuration files are searched for in the order:

If the environment variable QUESTFIG is assigned a value then this file is used.

For example, you can type:

	<PROMPT>	setenv    QUESTFIG    my_config.fig    

and QUEST will use the colours in my_config.fig every session.

To remove a previous assignment, type:

	<PROMPT>	unsetenv    QUESTFIG

If there is a file in the current directory called quest.fig then this file is used.

If there is a file in the user's home directory called quest.fig then this file is used.

Otherwise, the file $CSDEXEC/quest.fig will be used.
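Both search orders follow the same four-step pattern, which the following Python sketch makes explicit. The function name and its optional parameters are hypothetical and exist only so that the logic can be tried out; the sketch treats an empty environment variable as unassigned, and it assumes CSDEXEC is set, as in a normal CSD installation.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve(env_var: str, filename: str,
            env: Optional[Mapping[str, str]] = None,
            cwd: Optional[Path] = None,
            home: Optional[Path] = None) -> Path:
    """Return the file that would be chosen under the documented search order."""
    env = os.environ if env is None else env
    cwd = Path.cwd() if cwd is None else cwd
    home = Path.home() if home is None else home
    if env.get(env_var):            # 1. the environment variable, if assigned
        return Path(env[env_var])
    for directory in (cwd, home):   # 2. current directory, 3. home directory
        candidate = directory / filename
        if candidate.is_file():
            return candidate
    return Path(env["CSDEXEC"]) / filename  # 4. the site-wide copy


# Which initialisation and configuration files would be picked up here?
if "CSDEXEC" in os.environ:
    print(resolve("QUESTINIT", "quest_init.que"))
    print(resolve("QUESTFIG", "quest.fig"))
```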

Some users may find that they wish to maintain, for example, multiple QUEST Initialisation files.

This section shows one method of doing so safely and efficiently.

Suppose that sometimes you use QUEST on a Silicon Graphics console and wish to use the graphics. On other occasions you only have access to a 4100 Series terminal. On still other occasions you do not wish to use the graphics at all.

In the C-shell, you could create the following files in your login directory (~), and insert the following text:

	<PROMPT>	vi      ~/quest_init.iris
			COMMENT   Silicon Graphics console initialisation file
			TERMINAL    IRIS
			MENU

	<PROMPT>	vi      ~/quest_init.t410
			COMMENT   TEK 4100 Series initialisation file
			TERMINAL    T410X
			MENU

	<PROMPT>	vi      ~/quest_init.blank
			COMMENT Blank Initialisation File - no functional commands

Then, in your .login, you could insert the following lines:

	<PROMPT>	vi      ~/.login
			alias	iris	'setenv	QUESTINIT	~/quest_init.iris'
			alias	t410	'setenv	QUESTINIT	~/quest_init.t410'
			alias	blank	'setenv	QUESTINIT	~/quest_init.blank'
			alias	noinit	'unsetenv	QUESTINIT'

Now, every time you log on, you just type the instruction corresponding to the terminal you are using.

For example, if you are using the console, just type:

	<PROMPT>	iris

and every subsequent QUEST session, until you log out or change the assignment of $QUESTINIT, will automatically enter full menu mode for an SG console.

Similarly, typing blank will ensure that QUEST just uses the initialisation file containing only your one-line comment (you could equally use an empty file).

Typing noinit will deassign the environment variable QUESTINIT so that QUEST will search in the current and home directories, and CSDEXEC for a file called quest_init.que in the usual way.

Of course, the above discussion can be applied equally well to QUEST Configuration files and the environment variable QUESTFIG.

13.2.3. MAINTAINING DATABASE SUBSETS

To maintain a database subset over successive releases of the CSD System, it is recommended that the original query file be retained and used with each release of the CSD. This is advisable for several reasons:


* New entries that satisfy your question may have been added to the database.
* Errors in entries may have been corrected. (Sometimes authors take several months to respond to CCDC queries about original papers.)
* Very occasionally, refcodes are changed by the CCDC. (The "previous refcode" test - using the Quest keyword *PREFCODE - may be used to search for old refcodes.)
* The format of the ASER file may change from time to time, as new fields are added to support enhanced searching facilities.

13.2.4. QUEST FILES

The following table lists the file types and file names used by QUEST.

<prob> is the problem name used in starting the QUEST search.

The 2-, 3- or 4-character filename suffix indicates the contents or purpose of the file.

For example, if you typed

	<PROMPT>	quest glucose
			...
		> SAVE FDAT
			...
		> QUEST ...

a file called glucose.dat would be created, containing an FDAT entry for every 'hit'.

Input Files

| Unix File Name | Function | Basic QUEST Command | Graphics QUEST3D Procedure | In or Out? |
| --- | --- | --- | --- | --- |
| 1. QUESTFIG 2. ./quest.fig 3. ~/quest.fig 4. CSDEXEC/quest.fig | QUEST3D graphics configuration file. Allows users to change the colours used by Graphics Version QUEST3D. On machine-specific implementations, the file is searched for in one of four locations. | Not available | The file is read automatically, if it exists. | Input |
| 1. QUESTINIT 2. ./quest_init.que 3. ~/quest_init.que 4. CSDEXEC/quest_init.que | QUEST initialisation file. Allows users to specify instructions that will be executed automatically every time QUEST starts. On machine-specific implementations, the file is searched for in one of four locations. | The file is read automatically, if it exists. | The file is read automatically, if it exists. | Input |
| <prob>.rbr | Restrict by refcode. Only entries whose refcodes are in the file on unit 30 will be searched. Note that for VAX/VMS users, the preferred way of doing this is to use a virtual database subset. | INIT 7 | INIT 7 | Input |
| <prob>.rco | RECOVER. This enables you to set up a QUEST instruction document, which will be read by QUEST when you issue the RECOVER instruction. See also QUEST Initialisation files. | RECOVER | If RECOVER is to be used, it should be issued before entering graphics mode (i.e. before typing MENU). | Input |
| <prob>.run | Reserved for Graphical Input demonstration purposes. | Not applicable | | Input |
| Default: /dev/tty | QUEST instruction document, i.e. the commands that will define the QUEST search. | | | Input |
| $CSDEXEC/helptext.txt | The QUEST HELP file. | HELP | Select HELP in any menu. | Input |
| $CSDEXEC/menu.new | Used in Development versions of the QUEST input menu package. Not used in release versions. | Not applicable | - | Input |
| $CSDEXEC/template.dat | Graphical Input Template File | Not applicable | Select TEMPLATE in the BUILD menu. | Input |

Output Files

| Unix File Name | Function | Basic QUEST Command | Graphics QUEST3D Procedure | In or Out? |
| --- | --- | --- | --- | --- |
| <prob>.sve | STORE-FRAGMENT file. | Not applicable | STORE-/FETCH-/DELETE-FRAGMENT commands in the FILES menu. | Input/Output |
| <prob>.bib | FBIB file - Bibliographic information for 'hit' database entries. | SAVE FBIB | Select SAVE FBIB in the SEARCH menu. | Output |
| <prob>.con | FCON file - chemical connectivity information for 'hit' database entries. | SAVE FCON | Select SAVE FCON in the SEARCH menu. | Output |
| <prob>.dat | FDAT file, containing one entry for each database 'hit' kept by the user. | SAVE FDAT | Select SAVE FDAT in the SEARCH menu. | Output |
| <prob>.dbg | Reserved for Graphical Input debugging. | | | Output |
| <prob>.fgd | QUEST3D search fragment definition | Not available | | Output |
| <prob>.fgn | QUEST3D search fragment numbers | Not available | | Output |
| <prob>.fser | FSER file - formatted database subset for 'hit' database entries. The ADAPT program may be used to convert this file to the ASER file format. Usually only used by CCDC staff. | SAVE FSER | Select COMMAND in the BUILD or SEARCH menus, and type SAVE FSER. | Output |
| <prob>.gcd | REFCODE file, containing the refcodes of every database 'hit' kept by the user. | SAVE REFCODE | Select SAVE REFC in the SEARCH menu. | Output |
| <prob>.gls | GLIST | | | Output |
| <prob>.ins | Used by Graphical Input to communicate with QUEST program. It is of no relevance to the user. | Not applicable | - | Output |
| <prob>.jnl | QUEST Journal file - automatically provides record of a QUEST session's query, and any 'hit' entries. | No instruction switches this on, but storing of 'hit' information may be turned off with the NOJOURNAL command. | No instruction switches this on, but storing of 'hit' information may be turned off by selecting NOJOURNAL in the SEARCH menu. | Output |
| <prob>.mdl | MODEL file. (Interface to molecular modelling packages) | Not available | Select SAVE MODEL in the SEARCH menu. | Output |
| <prob>.ps | POSTSCRIPT file, containing 2D chemical diagrams for all 'hit' entries. | Not available | Select POSTSCRIPT in the SEARCH menu. | Output |
| <prob>.scr | For machine-specific implementations, this is one of a pair which constitute a searchable database subset. | SAVE ASER | Select SAVE ASER in the SEARCH menu. | Output |
| <prob>.sts | QUEST3D/STATS binary interface file. | Not available | | Output |
| <prob>.sum | QUEST3D search scratch/summary list | Not available | | Output |
| <prob>.tab | QUEST3D search cumulative table | Not available | | Output |
| <prob>.tcd | ASER file (database subset), containing one entry for each database 'hit' kept by the user. | SAVE ASER | Select SAVE ASER in the SEARCH menu. | Output |
| <prob>tmp.csd | Reserved for Silicon Graphics and SUN versions. | | | Output |
| Default: /dev/tty | QUEST output | | | Output |
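As a quick cross-reference, the SAVE commands listed under "Output Files" can be mapped to the file names they produce. The Python sketch below is only a lookup built from that list; the function name is hypothetical and QUEST itself is not involved.

```python
from typing import List

# Suffixes copied from the "Output Files" list; SAVE ASER writes a pair of files.
SAVE_SUFFIXES = {
    "FBIB": ("bib",),
    "FCON": ("con",),
    "FDAT": ("dat",),
    "FSER": ("fser",),
    "REFCODE": ("gcd",),
    "ASER": ("scr", "tcd"),
}


def saved_files(prob: str, save_keyword: str) -> List[str]:
    """Return the file name(s) written by SAVE <save_keyword> for problem <prob>."""
    suffixes = SAVE_SUFFIXES[save_keyword.upper()]
    return [f"{prob}.{suffix}" for suffix in suffixes]


print(saved_files("glucose", "FDAT"))
print(saved_files("output", "ASER"))
```

The second call reproduces the output.scr and output.tcd pair used in the subset example of section 13.2.1.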

13.3. GSTAT

13.3.1. USING THE GSTAT PROCEDURE

To activate the GSTAT data analysis program under Unix, the command line may take one of two forms, depending on whether a common root filename is to be used for all input/output files:

	<PROMPT>	gstat  root_filename

or

	<PROMPT>	gstat

In both cases, the shell script that initiates GSTAT will enquire whether you wish to display the output on the screen, or re-direct it to a file. It is convenient to inspect output on the screen for a test run, and to use re-direction to a file when a permanent record is desired.

Example 1 - Using A Root Filename.

To use a root filename, invoke GSTAT with a command of the form:

	<PROMPT>	gstat   sugars

The shell script which starts GSTAT will then assume that the FDAT data input file is called sugars.dat, the query file is called sugars.geo, and the output files will have names of the form sugars.XXX.
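Since the root-filename form assumes fixed input names, it can be useful to confirm that both files exist before starting GSTAT. The following Python sketch is a hypothetical helper, not part of the CSD System; it only checks for the two input names stated above.

```python
from pathlib import Path
from typing import List


def missing_gstat_inputs(root: str, directory: str = ".") -> List[str]:
    """List the input files 'gstat <root>' would expect but that are absent."""
    expected = [f"{root}.dat", f"{root}.geo"]  # FDAT input file and query file
    return [name for name in expected if not (Path(directory) / name).is_file()]


print(missing_gstat_inputs("sugars"))
```

An empty list means both sugars.dat and sugars.geo were found in the given directory.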

Example 2 - Supplying File Names Explicitly.

If GSTAT is initiated by simply typing:

	<PROMPT>	gstat

the user will be prompted to supply the names of all input and output files explicitly. Note that in this case, file extensions are NOT added automatically, and filenames are used exactly as typed by the user.

13.3.2. GSTAT FILES

The following table lists the file types and file names used by GSTAT.

<prob> is the problem name used in starting the GSTAT session. The 3- or 5-character filename suffix indicates the contents or purpose of the file.

Input Files

| Unix File Name | Function | GSTAT Instruction | In or Out? |
| --- | --- | --- | --- |
| <prob>.dat | FDAT file, containing one entry for each database 'hit' kept by the user. | | Input |
| <prob>.fgn | Binary file containing fragment numbers, written during a QUEST3D search to communicate information to GSTAT. | | Input |
| <prob>.geo | Instructions to be read in when the input is not typed in via a terminal. | | Input |

Output Files

| Unix File Name | Function | GSTAT Instruction | In or Out? |
| --- | --- | --- | --- |
| <prob>.cls | Coordinates of the "Most representative fragment(s)" (MRF) deduced by the CLUSTER analysis package. | OUTPUT COORD | Output |
| <prob>.cor | List of atom labels and atomic coordinates (orthogonal or fractional). | OUTPUT COORD | Output |
| <prob>.fac | Output file containing Principal Component Analysis (PCA) scores for each fragment. | PCA/FAC | Output |
| <prob>.gcd | 8-character reference codes for database entries which survive all selection procedures. | COD | Output |
| <prob>.lis | Output when not directed to a terminal. | | Output |
| <prob>.sup | List of unit cell parameters and atomic coordinates. | SUPERpose | Output |

13.4. PLUTO

13.4.1. USING THE PLUTO PROCEDURE

To carry out a PLUTO session under Unix, the command line may contain two pieces of information in addition to the PLUTO command:

root_filename A name of your choice used by the PLUTO program to produce output files of the generic name root_filename.xxx where xxx indicates the contents of the file (listed below in section 13.4.2 "PLUTO Files").

The input FDAT file must also be called root_filename.dat.

It is good practice to choose a root_filename that describes the search you are doing, for example SUGARS. This will make files easier to locate after the session.

input_option If you wish to enter your instructions interactively, you may use the device file /dev/tty. /dev/tty is the default input_option.

If you wish to use PLUTO commands already set up in a command file prior to invoking PLUTO, you should enter the filename of the PLUTO instruction file here.

The CCDC-recommended filetype extension for PLUTO instruction documents is .cmd.

A PLUTO session is initiated using this information by typing:

	<PROMPT>	pluto    root_filename     <     input_option

Example 1 - Interactive PLUTO Session

An interactive PLUTO session on the saved data file sugars.dat may be initiated by typing:

	<PROMPT>	pluto    sugars

This will allow you to enter PLUTO instructions immediately.

Note that the output files from this run will be named sugars.xxx.

Example 2 - Instructions in a File

If your PLUTO instructions are in a file called myfile.cmd, PLUTO is initiated by typing:

	<PROMPT>	pluto    myfile   <   myfile.cmd

Note that the output files from this run will be named myfile.xxx.

If your PLUTO instructions are in a file called myfile.cmd but you want your output files to be called output.xxx, PLUTO is initiated by typing:

	<PROMPT>	pluto    output    <   myfile.cmd

Example 3 - Supplying File Names Explicitly.

Typing:

	<PROMPT>	pluto

will give an interactive guide with defaults.

You may use any name at any prompt but output files should not have exactly the same name as input files.

13.4.2. PLUTO FILES

The following table lists the file types and file names used by PLUTO.

<prob> is the problem name used in starting the PLUTO session. The 3-character filename suffix indicates the contents or purpose of the file.

For example, if you typed

	<PROMPT>	pluto    glucose    

then a file called glucose.dat would be used for input.

| Unix File Name | Function | In or Out? |
| --- | --- | --- |
| <prob>.cmd | PLUTO instruction document, if not interactive. | Input |
| <prob>.dat | FDAT file, containing one entry for each database 'hit' kept by the user. | Input |
| /dev/tty | The file to which Tektronix escape sequences are sent (usually the terminal). | Output |
| <prob>.lis | PLUTO listing file. | Output |

13.5. CSDCONVERT

The October 1992 release of the database contains new information which has resulted in the need to change slightly the format of the stored database.

Some users will have made subsets of the database and will wish to convert these.

To perform the conversion the CSDCONVERT command is provided.

Note that the command procedure invoked to initiate a QUEST search tests the specified database files and will halt if an old format database file is found.

13.5.1. EXAMPLE

Suppose an old-format database subset consists of the files thiazol.scr and thiazol.tcd.

To convert the database, specify the problem name (thiazol) as the first parameter, and the new subset name as the second parameter (e.g. newthiazol).

	<PROMPT>	csdconvert      thiazol     newthiazol
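If several subsets need converting, the command lines can be generated rather than typed one by one. The Python sketch below is a hypothetical helper that only builds the argument lists and does not run anything; the "new" prefix simply mirrors the newthiazol example, and any other naming scheme would do.

```python
from pathlib import Path
from typing import List


def conversion_commands(directory, prefix: str = "new") -> List[List[str]]:
    """Build one csdconvert command per subset that has both .scr and .tcd files."""
    directory = Path(directory)
    commands = []
    for scr in sorted(directory.glob("*.scr")):
        if scr.with_suffix(".tcd").is_file():  # a subset is a .scr/.tcd pair
            commands.append(["csdconvert", scr.stem, prefix + scr.stem])
    return commands


for command in conversion_commands("."):
    print(" ".join(command))
```

The sketch cannot tell old-format subsets from ones that have already been converted, so the printed list should be reviewed before any of the commands are run.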

