CSD System Version 5.1.6
Unix Installation Notes




Overview


The UNIX distribution is now supplied on two separate CD-ROMs: Software & Database.

1.1 Basic Requirements


To install the UNIX version of the Cambridge Structural Database (CSD) System, you require:

The distribution comes with precompiled versions of the CSD System software for a select number of target machines on the Software CD-ROM (see Section 1.3 below). If you do not have one of these platforms you also require:

To run the X-Windows interface, you require:

In addition, you should be familiar with UNIX utilities such as tar and cp.

If disk space is limited, Section 2.7 gives instructions for installing the system with some of the database files left on the Database CD-ROM.

1.2 Target Audience


This document is targeted at installers of the Cambridge Structural Database System. Please note that some UNIX systems still require root privileges to mount and unmount CD-ROMs. It is hoped that the CSD System can be introduced into your system with minimum disruption and maximum access.

1.3 Distributed Executables


Executables are provided compiled on the following platforms:

For the platforms listed above, the Software CD-ROM contains the executables. All the database files are contained on a second CD-ROM, the Database CD-ROM. If you have 565Mb of disk space available then the CSD System simply needs to be copied from the CD-ROMs to the hard disk. Users with these platforms need only read Section 2.

See Section 3 for details on accessing Brookhaven Protein Data Bank files through the CSD System.

1.4 Recompiling the CSD System


For platforms other than those listed in Section 1.3 above, it will be necessary to compile the system from the source code. Users should follow the instructions in Section 2 and then go to Section 5.

1.5 Document Contents


This document is divided as follows:

Section 1 Overview

Section 2 Installation

Section 3 The Protein Data Bank

Section 4 CSD System Documentation files

Section 5 Recompiling the CSD System

Section 6 Appendices

1.6 Document Conventions


Italic is used for names of files, directories, host names and to emphasise new terms when introduced.

Bold is used for command names.

Typewriter is used for what you see on the screen, and for the contents of files.

Bold typewriter represents text you type exactly as shown.

Italic Typewriter is used in examples to show variables for which a context-specific substitution should be made such as the full path name of a directory.

UPPERCASE names that begin with a '$' such as $CSDHOME represent the value of a shell environment variable.

1.7 Content of CD-ROMs


The CD-ROM labelled "Software" contains the CSD System (executables and the PDB & DBUSE database files) for Sun (Solaris 2.x), Silicon Graphics, IBM RS/6000, DEC Alpha AXP (running OSF-1), Hewlett Packard 9000/[78]00 series and Linux in a format that can be run directly from the Software CD-ROM. The Software CD-ROM also includes the source code so that the CSD System can be recompiled for other target machines.

Note: The full CSD database & software cannot be run totally from the Software CD-ROM.

The CD-ROM labelled "Database V5.16" contains the database files for the three searchable databases compiled by the CCDC: the full CSD, the CSD-PDB, and the DBUSE.

The CD-ROM labelled "Protein Databank" contains all 'layer 2' entries released by Brookhaven up to, and including, July 30 1998 but excluding entries determined by NMR and theoretical modelling studies.

1.8 User Support


If you encounter any difficulties with the installation or use of the CSD System then please contact the CCDC User Support service.

User Support

Cambridge Crystallographic Data Centre
12 Union Road
Cambridge CB2 1EZ
UK

Fax: +44 1223 336033
Tel: +44 1223 336022
Email: support@ccdc.cam.ac.uk

If telephoning the Centre please have access to a workstation.

To help us to help you, please supply the following information when requesting user support:

  1. The name and model of the workstation you are using.

  2. The version of the operating system currently installed.

  3. Any error messages output by the system, or by the CSD software.

  4. If you have a problem with the installation, the step at which the problem occurred.


Installation


The following steps are necessary to install the CSD System from the CD-ROMs:

  1. Mount the Software CD-ROM.

  2. Check the contents of the CD-ROM.

  3. Check whether the CSD System will run directly from the Software CD-ROM.

  4. Decide where you want to put the CSD System files.

  5. Copy the CSD System files to the desired locations.

  6. Set up the environment for running the CSD System.

2.1 Step 1: Mount the Software CD-ROM


For guidelines on mounting the CD-ROM see Appendix 6.1. Alternatively, consult your manual pages for the UNIX mount command (type man mount and look for the keywords hsfs, cdfs, High Sierra and iso9660).

These installation instructions refer to the directory under which the CD-ROM is mounted, the mount-point, as CDROM. The examples assume that the CD-ROM is mounted under /cdrom.

The directory CDROM is referred to as the top directory of the CD-ROM.

Example: A typical mount command will have a form similar to the one for the DEC Alpha AXP under OSF/1:

mount -r -t cdfs -o noversion /dev/rz12c /cdrom

Remember that you must be super-user to execute the mount command, that device names (here rz12c) vary from system to system, and that the CD-ROM must be in the drive before you type any mount command.

2.2 Step 2: Check the contents of the Software CD-ROM


List the contents of the Software CD-ROM.

If all directory names appear in lower case and without version numbers (e.g. csds) then you can proceed to Step 3 (Section 2.3).

If file names appear in upper case (e.g. CSDS) then you can also proceed to Step 3 (Section 2.3), but remember that all your directory, file and command names will have to be typed in upper case. Upper case names appear, for example, if you forget the '-o noversion' directive in the example in Section 2.1 above.

If file names appear in any other form (e.g. with version numbers: CONFIG.SH;1) then you should consider transferring the CSD System from the CD-ROM as soon as possible using the CSD System utility cdcopy. Pay particular attention to Section 2.3.1.

2.3 Step 3: Check if the CSD System will run directly from the Software CD-ROM


WARNING

If you have a previous release of the CSD System currently installed on your machine, please take care that environment variables and aliases pertaining to this previous release such as CSDEXEC, CSD_TCD and the aliases quest, pluto etc. have been removed from your current working environment. This will avoid any confusion of script names.

Please Note: We recommend that you back up and delete your previous installation of the CSD System.

Before copying any files to the hard disk you should check that the CSD System will run directly from the Software CD-ROM. The full CSD database will not run totally from the Software CD-ROM but the CSD-PDB & DBUSE databases will.

1. Set the environment variable CSDHOME to be the mount point of the CD-ROM.

For the C-shell (csh), type:

setenv CSDHOME CDROM

Example:

setenv CSDHOME /cdrom/csds_v516

For the Bourne shell (sh) or Korn shell (ksh) type:

CSDHOME=CDROM; export CSDHOME

Example:

CSDHOME=/cdrom/csds_v516; export CSDHOME

2. Add $CSDHOME/bin (or $CSDHOME/BIN if your file names are appearing in upper case) to your PATH.

For C-shell (csh) type:

setenv PATH $CSDHOME/bin:$PATH; rehash

For the Bourne shell (sh) or Korn shell (ksh), type:

PATH=$CSDHOME/bin:$PATH; export PATH

Please Note: In all of the examples and directions that follow it will be assumed that the environment variable CSDHOME is set. If you encounter any problems with running the CSD System you should first check that this variable is set correctly.
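
The check suggested in the note above can be scripted. The following is a minimal sketch for sh or ksh; the function name csd_check_env is hypothetical and is not part of the CSD System distribution. It only inspects CSDHOME and PATH and does not start any CSD System program.

```shell
# Report whether CSDHOME is set and whether its bin directory is in PATH.
# Prints one diagnostic per problem and returns non-zero if one is found.
# (csd_check_env is an illustrative name, not a CSD System command.)
csd_check_env() {
    if [ -z "${CSDHOME-}" ]; then
        echo "CSDHOME is not set"
        return 1
    fi
    # Accept bin or BIN, since CD-ROM names may appear in upper case.
    case ":$PATH:" in
        *":$CSDHOME/bin:"*|*":$CSDHOME/BIN:"*)
            return 0 ;;
        *)
            echo "$CSDHOME/bin is not in PATH"
            return 1 ;;
    esac
}
```

Typing csd_check_env before starting QUEST prints nothing when both settings are in place.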

3. Move to a directory in which you wish to start the CSD System (the directory should be writable by you).

4. Start QUEST by typing one of:

quest -j test1 -db PDB

quest -j test2 -db DBUSE

quest -j test3 -db $CSDHOME/examples/y2k

or, if your CD-ROM file names appear in upper case:

QUEST -j test1 -db PDB

etc...

and satisfy yourself that the CSD System is running correctly.

If you get the message "Executable for machine type machine not found" then you may have to recompile the CSD System. Instructions for recompiling the CSD System are in Section 5. For the moment, however, you should continue to read the steps in this section for instructions on how to transfer files and set up the environment.

If you get any other error message then contact the CCDC.

Once you are satisfied that the CSD System appears to run correctly continue to Step 4 below.

2.3.1 Strange file names on the CD-ROM

It is possible that the file names appearing on the CD-ROM will have some or all of the following "attributes":

u) They are upper case.

v) They have a version number appended to them. e.g. CONFIG.SH;1

d) Files that have no extension appear with a period '.' at the end. e.g. bin/quest.

If one, two or all of these cases occur when you mount the CD-ROM you can still run the CSD System directly, but you will have to set the environment variable CSD_CDROM_FILENAMES to one of the trans_*.sh scripts stored in the CDROM/bin directory. Selecting which one will depend on which of the "attributes" (u), (v) or (d) you find. For example if you see that the makefile file in the CDROM directory appears as MAKEFILE.;1, then you should set CSD_CDROM_FILENAMES to the trans_uvd.sh script.

For the Bourne shell (sh) or Korn shell (ksh) type:

CSD_CDROM_FILENAMES=CDROM/BIN/TRANS_UVD.SH\;1 ;export CSD_CDROM_FILENAMES

For the C-shell (csh), type:

setenv CSD_CDROM_FILENAMES CDROM/BIN/TRANS_UVD.SH\;1

Then you can run quest by typing, e.g.:

CDROM/BIN/QUEST.\;1 -j junk -db PDB

(Note: If you can't find trans_uvd.sh then use trans_uv.000.)
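
Deciding which of the attributes (u), (v) and (d) apply can be done by inspecting a single file name from a listing of the CD-ROM. The sketch below is an illustration for sh or ksh; cd_name_attrs is a hypothetical helper, not a CSD System utility. It only prints the attribute letters. This document names trans_uvd.sh and trans_uv.000 explicitly, so for any other combination check the listing of CDROM/bin for the matching trans_* script rather than assuming its name.

```shell
# Print the attribute letters (u, v, d) that apply to one file name
# as it appears in a listing of the mounted CD-ROM.
# (cd_name_attrs is an illustrative name, not a CSD System command.)
cd_name_attrs() {
    name=$1
    attrs=
    # (u) at least one upper-case letter and no lower-case letters
    case $name in
        *[[:lower:]]*) ;;
        *[[:upper:]]*) attrs=${attrs}u ;;
    esac
    # (v) a version number such as ";1" appended to the name
    case $name in
        *\;[0-9]*) attrs=${attrs}v; name=${name%\;*} ;;
    esac
    # (d) a period left at the end of a name that has no extension
    case $name in
        *.) attrs=${attrs}d ;;
    esac
    printf '%s\n' "$attrs"
}
```

For the name MAKEFILE.;1 used in the example above, the function prints uvd, which corresponds to the trans_uvd.sh script.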

2.4 Step 4: Decide where to put the CSD System Files


You can no longer run the entire CSD System directly from the Software CD-ROM so you will have to copy some or all of the CSD System files to a hard disk. The CD-ROM is generally much slower than a magnetic hard-disk so the more of the CSD System you can transfer off the CD-ROM the more responsive the system will be.

If you intend to copy all of the necessary files from both CD-ROMs to one disk then go to Section 2.5.

If you wish to copy only some of the necessary files to disk or copy the files to more than one disk then go to Section 2.7.

2.5 Step 5: Getting the CSD System off the CD-ROMs and onto a hard disk


This Section gives instructions for copying all of the necessary files from the Software & Database CD-ROMs to the hard disk. This requires 565Mb. If you do not have this space available on one disk then go to Section 2.7.

The CSD System provides you with a utility cdcopy to ensure that all necessary files are copied and access permissions are set. Use the executable script cdcopy described in this section. Should cdcopy fail for any reason, the necessary files can always be copied manually from the CD-ROMs. Instructions for doing this are given at the end of this section.

In the following instructions:

The mount point of the CD-ROMs is CDROM.

The directory to which the files will be copied is CAMBRIDGE.

Before continuing, create the directory CAMBRIDGE to which you wish to copy the CSD System files. It is recommended that you copy files to an empty directory. We recommend that you back up and delete your previous installation of the CSD System before commencing this installation.

The bin directory of the Software CD-ROM contains an executable shell script cdcopy. You must provide cdcopy with the following options:

-level 1

-level 1 tells cdcopy to copy the following files & directories from the Software CD-ROM:

  README      file
  bin         executables directory              no more than 27Mb per machine type
  config.sh   configuration script file
  csd         database directory & some files    14Mb
  csds        configuration files directory      ~1Mb
  examples    examples directory
  makefile    top-level makefile
  man         man pages directory
  rc          auxiliary scripts for config.sh (directory)

  Total size (approx): 42Mb

-from CDROM

-from takes as an argument the directory from which the CSD System files are to be copied. You should give the mount point of your CD-ROM drive.

-to CAMBRIDGE

-to takes as an argument the directory to which you wish to copy the CSD System files.

To run cdcopy make sure that CSDHOME is set to CDROM and type:

CDROM/bin/cdcopy -level 1 -from CDROM -to CAMBRIDGE

For example:

/cdrom/bin/cdcopy -level 1 -from /cdrom -to /usr/cambridge

If you have file names appearing with version numbers (see Section 2.3.1) then you should run cdcopy thus:

/cdrom/BIN/CDCOPY\;1 -uv -level 1 -from /cdrom -to /usr/cambridge

('-uv' standing for the fact that the CD-ROM file names appear in upper case and with version numbers)

To complete the full installation, load the Database CD-ROM and copy the two remaining database files to the CAMBRIDGE/csd directory. Following the names used in the example above, this is done manually as follows:

cp /cdrom/as516be.msk /usr/cambridge/csd

cp /cdrom/as516be.tcd /usr/cambridge/csd

The second command copies the main database text-conn-data file and may take some time. The copying of these two files is also explained in Section 2.7.

In the event that cdcopy fails, you can copy the necessary files from the CD-ROMs manually as follows. The command

sh /cdrom/bin/csdmach.sh

will output a string, MACHINE, (e.g. sunv5, sgiv5, rs6000 etc.) for use in copying the machine executables from the Software CD-ROM. Ensure the Software CD-ROM is loaded & perform the following copy commands.

cp -r /cdrom/csds /cdrom/rc /cdrom/config.sh CAMBRIDGE

mkdir CAMBRIDGE/bin

cp /cdrom/bin/* CAMBRIDGE/bin (This shouldn't copy directories)

mkdir CAMBRIDGE/bin/d_MACHINE

cp /cdrom/bin/d_MACHINE/* CAMBRIDGE/bin/d_MACHINE

Now mount the Database V5.16 CD-ROM and copy the database files

mkdir CAMBRIDGE/csd

cp /cdrom/* CAMBRIDGE/csd (Copies the database files - may take some time)
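
The manual copy commands given above for the Software CD-ROM can be collected into one function. This is a sketch only, for sh or ksh: csd_manual_copy is a hypothetical name, and its three arguments (the mount point, the CAMBRIDGE directory and the MACHINE string printed by csdmach.sh) must be supplied by the installer. It copies exactly the items named in the commands above and nothing else.

```shell
# Copy the Software CD-ROM files listed in the manual procedure.
#   $1 = mount point of the Software CD-ROM (CDROM)
#   $2 = destination directory (CAMBRIDGE), which must already exist
#   $3 = machine string as printed by csdmach.sh (MACHINE)
# (csd_manual_copy is an illustrative name, not a CSD System command.)
csd_manual_copy() {
    src=$1
    dest=$2
    machine=$3
    mkdir -p "$dest/bin/d_$machine" || return 1
    cp -r "$src/csds" "$src/rc" "$src/config.sh" "$dest" || return 1
    # Copy plain files only, so that the d_* directories are skipped.
    for f in "$src"/bin/*; do
        if [ -f "$f" ]; then
            cp "$f" "$dest/bin" || return 1
        fi
    done
    cp "$src/bin/d_$machine"/* "$dest/bin/d_$machine"
}

# Possible use, with the names from the examples in this section:
#   csd_manual_copy /cdrom /usr/cambridge "`sh /cdrom/bin/csdmach.sh`"
```

The database files still have to be copied from the Database CD-ROM afterwards, as described above.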

2.6 Step 6: Setting up the environment


You must now (re)set the environment variable CSDHOME to be the directory in which the CSD System is now installed (CAMBRIDGE) and add the new $CSDHOME/bin to your PATH.

1. Set the environment variable CSDHOME.

For Bourne shell (sh) and Korn shell (ksh) type:

CSDHOME=CAMBRIDGE; export CSDHOME

Example :

CSDHOME=/usr/cambridge; export CSDHOME

For C-shell (csh) type:

setenv CSDHOME CAMBRIDGE

Example:

setenv CSDHOME /usr/cambridge

2. Add $CSDHOME/bin to your PATH.

For Bourne shell (sh) and Korn shell (ksh) type:

PATH=$CSDHOME/bin:$PATH; export PATH

For C-shell (csh) type:

setenv PATH $CSDHOME/bin:$PATH; rehash

If you copied the two remaining database files from the Database CD-ROM as described above, the CSD System is now ready to run.

To make these changes permanent, add the commands executed in steps (1) and (2) to the user's .login (csh) or .profile (sh, ksh) file in the HOME (login) directory. Do this for each user who will be accessing this particular installation of the CSD System, or place them in a system-wide login or profile script such as /etc/profile. Suitable commands to do this may be found in the files CSDHOME/bin/csdsetup.(csh,ksh,sh).
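
For sh and ksh users, adding the two commands to a profile file can be sketched as follows. The function name csd_add_to_profile is hypothetical, and the check for an existing CSDHOME line is a simplifying assumption: it looks only for a line that begins with "CSDHOME=", so settings written in another style will not be detected.

```shell
# Append the CSDHOME and PATH settings to a Bourne/Korn shell profile,
# unless the file already contains a line starting with "CSDHOME=".
#   $1 = profile file (for example $HOME/.profile)
#   $2 = installation directory (CAMBRIDGE)
# (csd_add_to_profile is an illustrative name, not a CSD System command.)
csd_add_to_profile() {
    profile=$1
    home=$2
    if [ -f "$profile" ] && grep -q '^CSDHOME=' "$profile"; then
        echo "CSDHOME already set in $profile; file left unchanged"
        return 0
    fi
    {
        echo "CSDHOME=$home; export CSDHOME"
        echo 'PATH=$CSDHOME/bin:$PATH; export PATH'
    } >> "$profile"
}
```

Running the function a second time on the same file leaves it unchanged, so the settings are not duplicated.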

Go to Section 4 for details about copying other files (documentation and source code).

2.7 If you cannot fit all of the CSD System files on one disk


If space limitations prevent you from copying all of the CSD System to a single filesystem, then you need to answer the following three questions:

QUESTION 1. Do I have 42Mb of free space on a hard disk?

YES: You should do a level 1 cdcopy using a directory CAMBRIDGE on this disk as the new home for the CSD System. e.g. type:

CDROM/bin/cdcopy -level 1 -from CDROM -to CAMBRIDGE

Go to question 2.

NO: You cannot run the complete CSD System entirely from the Software CD-ROM. Only the PDB and DBUSE databases are accessible by running entirely from the Software CD-ROM. In performing the preliminary steps in Sections 2.1 - 2.3, you have already set up to enable the PDB and DBUSE databases to run in this way.

Go to Section 2.10.

QUESTION 2. Can I find a further 80Mb on any hard disk for the CSD mask file?

YES: The CSD mask file can be copied from the Software CD-ROM. The default mask file location is CAMBRIDGE/csd/. If CSD_MSK is a writable directory on that disk then copy the mask file. E.g. type:

cp CDROM/csd/as516be.msk CSD_MSK/as516be.msk

or if your file names appear in upper case, type:

cp CDROM/CSD/AS516BE.MSK CSD_MSK/as516be.msk

If you chose a location other than CAMBRIDGE/csd/ then you must do one of the three things listed under 'NO' below but replacing 'CDROM/csd' by 'CSD_MSK'. See the Notes below on how to make the best choice.

Go to question 3.

NO: You must do one of three things:

1) set the environment variable CSD_MSK to point to the top CDROM directory of the Database CD-ROM e.g. (for the C shell) type:

setenv CSD_MSK CDROM

2) edit the CAMBRIDGE/csds/db.dbl file. Details of what to do are given in Section 2.8.

3) create a soft link to the default mask file location, type:

ln -s CDROM/as516be.msk CAMBRIDGE/csd/as516be.msk

In all cases the Database CD-ROM must be loaded & remain mounted for the full CSD System to be usable. See the Notes below on how to make the best choice.

Go to question 3.

QUESTION 3. Can I find a further 440Mb on any hard disk for the CSD text-conn-data file?

YES: The CSD text-conn-data file can only be copied from the Database CD-ROM. The default text-conn-data file location is CAMBRIDGE/csd/. If CSD_TCD is a writable directory on that disk then copy the text-conn-data file. E.g. type:

cp CDROM/as516be.tcd CSD_TCD/as516be.tcd

or, if your file names appear in upper case, type:

cp CDROM/AS516BE.TCD CSD_TCD/as516be.tcd

If you chose a location other than CAMBRIDGE/csd/ then you must do one of the three things listed under 'NO' below but replacing 'CDROM/csd' by 'CSD_TCD'. See the Notes below on how to make the best choice.

NO: In this case the Database CD-ROM must remain mounted for the CSD System to be usable. You must also do one of three things:

1) set the environment variable CSD_TCD to point to the top CDROM directory of the Database CD-ROM e.g. (for the C shell) type:

setenv CSD_TCD CDROM

2) edit the CAMBRIDGE/csds/db.dbl file. Details of what to do are given in Section 2.8.

3) create a soft link to the default text-conn-data file location, type:

ln -s CDROM/as516be.tcd CAMBRIDGE/csd/as516be.tcd

Notes:

The choice you make will depend on how your system is set up. For example, if you have a network with directory CSD_MSK mounted under different names depending on machine, then choice 1 will be most useful since different users can set the environment variable CSD_MSK to reflect the directory name as seen by their local machine. On the other hand, choice 3 is the simplest because there is no need to remember to set any environment variable: the CSD System picks up the correct location via the symbolic link. If you want the most flexibility then you can combine aspects of choices 1 and 3 by editing the database list file $CSDHOME/csds/db.dbl.
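
The three questions above amount to comparing the free space with running totals of the sizes quoted in this section: 42Mb for the level 1 copy, a further 80Mb for the mask file, and 565Mb for the full installation. The sketch below makes that arithmetic explicit under one simplifying assumption that the text does not require, namely that all files go to a single disk. csd_space_plan is a hypothetical name.

```shell
# Given the free space on one disk in Mb, print which files fit on it.
# Sizes are the figures quoted in Sections 2.5 and 2.7.
# (csd_space_plan is an illustrative name, not a CSD System command.)
csd_space_plan() {
    free=$1
    level1=42      # level 1 cdcopy
    mask=80        # CSD mask file
    full=565       # everything, including the text-conn-data file
    if [ "$free" -lt "$level1" ]; then
        echo "run PDB and DBUSE from the Software CD-ROM"
    elif [ "$free" -lt $((level1 + mask)) ]; then
        echo "level 1 copy; mask and text-conn-data files stay on CD-ROM"
    elif [ "$free" -lt "$full" ]; then
        echo "level 1 copy and mask file; text-conn-data file stays on CD-ROM"
    else
        echo "full installation on disk"
    fi
}
```

If the mask or text-conn-data file is placed on a different disk, answer each question separately for that disk instead.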

2.8 Editing the database list file csds/db.dbl.


The file $CSDHOME/csds/db.dbl is the file that all CSD System programs read to determine where the Cambridge Structural Database files are currently located. As supplied, it contains three lines

${CSD_IND=csd}/as516be.ind

${CSD_MSK=csd}/as516be.msk

${CSD_TCD=csd}/as516be.tcd

representing the location of the three files that constitute the CSD database. The logic that the CSD System uses to interpret this file is as follows:

1) Each line represents a file name.

2) The files must appear in index, mask, text-conn-data file order.

3) Shell variables (names beginning with a $ and enclosed in braces {}) are replaced by their values in the current environment using Bourne shell (sh) interpretation.

4) A resulting name that is relative (i.e. doesn't begin with a '/') is translated by CSD_CDROM_FILENAMES, if that is set (e.g. converted to upper case), and then the value of the environment variable CSDHOME (which is always set) is prepended. A name that begins with a '/' is passed on untouched.

For example, if you had to start quest with the command QUEST because the CD-ROM names appear to you in upper case, and CSDHOME is /cdrom, then csd/as516be.ind would be converted to CSD/AS516BE.IND and the file searched for would be /cdrom/CSD/AS516BE.IND.

If at any point you move any of the database files, you must edit csds/db.dbl to reflect their new location.
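
Rules 3 and 4 above can be tried out on a single line of the file before editing it. The following is a sketch for sh or ksh, not the code used by the CSD System: csd_resolve_dbl_line is a hypothetical name, and the CSD_CDROM_FILENAMES translation step is left out. Because it uses eval, it runs any command contained in the line, so use it only on a db.dbl file you trust.

```shell
# Resolve one line of db.dbl to a full path name.
# The function body runs in a subshell, so a default such as
# ${CSD_IND=csd} does not leak into the calling shell.
# (csd_resolve_dbl_line is an illustrative name, not a CSD System command.)
csd_resolve_dbl_line() (
    # Rule 3: let sh expand ${VAR=default} and back-quoted commands.
    eval "name=\"$1\""
    # Rule 4: prepend CSDHOME to relative names only.
    case $name in
        /*) printf '%s\n' "$name" ;;
        *)  printf '%s\n' "$CSDHOME/$name" ;;
    esac
)
```

With CSDHOME set to /usr/cambridge and CSD_TCD set to /cdrom, the supplied third line resolves to /cdrom/as516be.tcd, while the first line resolves to /usr/cambridge/csd/as516be.ind when CSD_IND is not set.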

2.8.1 Simple examples:

If the index file is in CAMBRIDGE/csd and the mask and main database file are in the directory /disk2/database/csd (on another disk), then the database list file should be changed to contain the lines

csd/as516be.ind

/disk2/database/csd/as516be.msk

/disk2/database/csd/as516be.tcd

If the largest or 'tcd' database file had been left on the Database CD-ROM mounted under /cdrom, then the database list file would contain the lines

csd/as516be.ind

/disk2/database/csd/as516be.msk

/cdrom/as516be.tcd

If the files on your CD-ROM appear in upper case and you are leaving some of the CSD System files on the Database CD-ROM then type the full path names (beginning with a '/') exactly as they appear to you. e.g.

csd/as516be.ind

/disk2/database/csd/as516be.msk

/cdrom/AS516BE.TCD

2.8.2 A more complicated example:

Suppose you have placed the main database file as516be.tcd in the directory /disk2/export/ccdc/ on a machine named server, but your other machine, client1, has remote mounted this disk so that the directory name appears as /net/server/export/ccdc. To permit users on both client1 and server access to the CSD System without setting any extra environment variables, change the CAMBRIDGE/csds/db.dbl file thus:

csd/as516be.ind

csd/as516be.msk

csd/`uname -n`.tcd

(note that the quotes are back-quotes.) Then, in the CAMBRIDGE/csd directory make the following soft links, type:

ln -s /disk2/export/ccdc/as516be.tcd CAMBRIDGE/csd/server.tcd

ln -s /net/server/export/ccdc/as516be.tcd CAMBRIDGE/csd/client1.tcd

This works because "uname -n" is a Unix command that returns the name of the machine executing it (here either server or client1). Because the CSD System interprets this file using sh, it will be looking for a text-conn-data file called csd/server.tcd or csd/client1.tcd, depending on which machine the CSD System command is running on.

2.9 Setting CSDHOME and adding CSDHOME/bin to your PATH


1. Set the environment variable CSDHOME.

For Bourne shell (sh) or Korn shell (ksh), type:

CSDHOME=CAMBRIDGE; export CSDHOME

Example:

CSDHOME=/usr/cambridge; export CSDHOME

For the C-shell (csh), type:

setenv CSDHOME CAMBRIDGE

Example:

setenv CSDHOME /usr/cambridge

2. Add $CSDHOME/bin to your PATH.

For the Bourne shell (sh) or Korn shell (ksh), type:

PATH=$CSDHOME/bin:$PATH; export PATH

For the C-shell (csh) type:

setenv PATH $CSDHOME/bin:$PATH; rehash

The CSD System should now be ready to run.

To make these changes permanent, add the commands executed in steps (1) and (2) to the user's .login (csh) or .profile (sh, ksh) file in the HOME (login) directory. Do this for each user who will be accessing this particular installation of the CSD System, or place them in a system-wide login or profile script such as /etc/profile.

2.10 Running PDB and DBUSE databases completely from the CD-ROM


It is no longer possible to search the full CSD database entirely from the Software CD-ROM.

In checking the Software CD-ROM, you will have set up the PDB and DBUSE databases to run completely off the Software CD-ROM (see Section 2.3).

To make the setup permanent, add the commands executed in steps (1) and (2) of Section 2.3 above to your .login (csh) or .profile (sh, ksh) files. Do this for each user who will be accessing this particular installation of the CSD System or place them in a system-wide login or profile script such as /etc/profile.


The Protein Data Bank


3.1 Overview


quest -j jobname -db PDB

pdbget -l -f jobname.pid

This will copy the files containing the relevant structures from the CD-ROM to the current directory. (See the man page pdbget(l))

3.2 Protein Data Bank Files


The Brookhaven Protein Data Bank is supplied by the CCDC as a single CD-ROM and contains 6626 compressed (actually gzip-ed) files. Each file contains information for a separate protein structure and is named after the corresponding PDB 4 character code. The files have been arranged to match the Brookhaven (4 CD-ROM) release: structures on the Brookhaven CD-ROM n are found in the subdirectories of distrn/.

In order to accommodate the compressed PDB files on one CD-ROM the CCDC no longer distributes entries determined from NMR experiments or theoretical modelling studies.

The gzip source-code and executables for many target machines are supplied. You should make sure that gzip is in your PATH (see Section 2.6).

To make the CSD System aware of these files, you should set the environment variable CSDPDBHOME to point to the root directory of the mounted CD-ROM. If you choose to copy the files to disk or already have them (or a subset) on disk from another source then the CSD System needs more detailed information and so you should set the environment variable CSDPDBFILESEARCHPATH. This is explained in Section 3.5.

3.3 Protein Graphic Display


While searching the PDB ASER file using QUEST you can display your hits if you have a RasMol session running. QUEST will not attempt to start a RasMol session; only a RasMol session already running on the same screen as QUEST will be used. QUEST will not instruct RasMol to display a structure unless QUEST itself can find the PDB file that contains the atomic coordinates. You should therefore make sure that either CSDPDBHOME or CSDPDBFILESEARCHPATH is correctly set so that QUEST can find the data bank file to display.

RasMol is able to automatically display compressed PDB files: decompressing them is not necessary.

3.4 Retrieving PDB files


During a search of the PDB ASER file QUEST will write out a set of PDB 4 character codes in file jobname.pid. The corresponding PDB files can be "retrieved" from the CCDC PDB CD-ROM using pdbget. Normally you need only type the command

pdbget -l -f jobname.pid

to copy all the PDB files specified in the pid file from the CD-ROM to the current directory. (The -l flag instructs pdbget to convert all input 4 character codes to lower case, since QUEST writes them out in upper case and the file names are all lower case.) pdbget uses the same environment variables as QUEST to locate the PDB files.

3.5 How the CSDS resolves file names from 4 character PDB codes


Given a 4 character PDB code, pdbget and QUEST search for file names as found on the CCDC PDB CD-ROM. If you have moved or renamed the files or if your CD-ROM controller doesn't translate file names using the Rock-Ridge extensions, the CSD System will have to be informed.

The CSD System uses the environment variable CSDPDBFILESEARCHPATH to convert 4 character PDB codes to file name paths. This enables existing installations of the PDB to be used with the CSD System.

If no CSDPDBFILESEARCHPATH value is specified then it defaults to the following string:

%h/distr1/%s/pdb%n.%e.gz:%h/distr2/%s/pdb%n.%e.gz:%h/distr3/pdb%n.%e.gz:%h/distr4/pdb%n.%e.gz

The CSD System interprets CSDPDBFILESEARCHPATH as a colon separated line of filename templates. Each %character pair in the string is substituted in the following manner:

  %character   Replaced with...

  %h   The value of the environment variable CSDPDBHOME.
  %H   Upper case of the value of the environment variable CSDPDBHOME.
  %n   The 4 character PDB code.
  %N   Upper case of the 4 character PDB code.
  %s   Middle two characters of the 4 character PDB code.
  %S   Upper case of the middle two characters of the 4 character PDB code.
  %e   The string "ent" if the 4 character PDB code begins with 1 or more and "noc" if it begins with 0.
  %E   The string "ENT" if the 4 character PDB code begins with 1 or more and "NOC" if it begins with 0.
  %v   The first character of the 4 character PDB code.
  %V   Upper case of the first character of the 4 character PDB code.
  %a   The second character of the 4 character PDB code.
  %A   Upper case of the second character of the 4 character PDB code.
  %b   The third character of the 4 character PDB code.
  %B   Upper case of the third character of the 4 character PDB code.
  %c   The fourth character of the 4 character PDB code.
  %C   Upper case of the fourth character of the 4 character PDB code.
  %%   A single % character.

Once the substitutions have been made, programs in the CSD System will search the colon separated list from left to right using the strings as filenames and querying the filesystem until a match is found.
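
To check a CSDPDBFILESEARCHPATH value before relying on it, the substitution rules above can be imitated in sh. The sketch below is an illustration only and is not taken from the CSD System source: the names pdb_upper, pdb_expand and pdb_candidates are hypothetical, and leaving an unrecognised %character pair unchanged is an assumption, since this document does not say what the real programs do in that case.

```shell
# Helper: print the argument in upper case.
pdb_upper() { printf '%s' "$1" | tr '[:lower:]' '[:upper:]'; }

# Expand one file name template ($1) for a 4 character PDB code ($2).
# The body runs in a subshell so that no variables leak out.
pdb_expand() (
    tmpl=$1
    code=$2
    v=$(printf '%s' "$code" | cut -c1)
    a=$(printf '%s' "$code" | cut -c2)
    b=$(printf '%s' "$code" | cut -c3)
    c=$(printf '%s' "$code" | cut -c4)
    case $v in 0) e=noc ;; *) e=ent ;; esac
    out=
    while [ -n "$tmpl" ]; do
        # Take the first character off the front of the template.
        rest=${tmpl#?}; ch=${tmpl%"$rest"}; tmpl=$rest
        if [ "$ch" != % ]; then out=$out$ch; continue; fi
        # A % was seen: the next character selects the substitution.
        rest=${tmpl#?}; key=${tmpl%"$rest"}; tmpl=$rest
        case $key in
            h) out=$out${CSDPDBHOME-} ;;
            H) out=$out$(pdb_upper "${CSDPDBHOME-}") ;;
            n) out=$out$code ;;
            N) out=$out$(pdb_upper "$code") ;;
            s) out=$out$a$b ;;
            S) out=$out$(pdb_upper "$a$b") ;;
            e) out=$out$e ;;
            E) out=$out$(pdb_upper "$e") ;;
            v) out=$out$v ;;
            V) out=$out$(pdb_upper "$v") ;;
            a) out=$out$a ;;
            A) out=$out$(pdb_upper "$a") ;;
            b) out=$out$b ;;
            B) out=$out$(pdb_upper "$b") ;;
            c) out=$out$c ;;
            C) out=$out$(pdb_upper "$c") ;;
            %) out=$out% ;;
            *) out=$out%$key ;;   # assumption: unknown pairs are kept as-is
        esac
    done
    printf '%s\n' "$out"
)

# Print every candidate file name for a colon separated template list ($1)
# and a PDB code ($2), in the left to right order in which they are searched.
pdb_candidates() (
    list=$1
    code=$2
    while [ -n "$list" ]; do
        tmpl=${list%%:*}
        case $list in
            *:*) list=${list#*:} ;;
            *)   list= ;;
        esac
        if [ -n "$tmpl" ]; then pdb_expand "$tmpl" "$code"; fi
    done
)
```

With CSDPDBHOME set to /cdrom, the template %h/distr1/%s/pdb%n.%e.gz and the code 1ycc give /cdrom/distr1/yc/pdb1ycc.ent.gz. Each name printed by pdb_candidates can then be tested with ls to see which one exists.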

3.5.1 Example 1:

If CSDPDBHOME is set to /cdrom and CSDPDBFILESEARCHPATH is not set then the 4 character PDB code 1ycc will translate to:

/cdrom/distr1/yc/pdb1ycc.ent.gz:/cdrom/distr2/yc/pdb1ycc.ent.gz:/cdrom/distr3/yc/pdb1ycc.ent.gz:/cdrom/distr4/yc/pdb1ycc.ent.gz

The first of these file names that exists is taken to contain the data for 1ycc.

3.5.2 Example 2:

If your CD-ROM driver does not interpret the Rock-Ridge extensions then the PDB filenames may appear as:

/cdrom/DISTR1/PDB1YCC.ENT;1

In this case CSDPDBFILESEARCHPATH should be set to

%h/DISTR1/PDB%N.%E;1:%h/DISTR2/PDB%N.%E;1:%h/DISTR3/PDB%N.%E;1:%h/DISTR4/PDB%N.%E;1

Of course, if you have your own versions of the Brookhaven Protein Data Bank files already on disk you can use those. You will probably have to change the CSDPDBFILESEARCHPATH accordingly.

3.5.3 Example 3 :

Suppose you already have a number of PDB files uncompressed and all contained in the directory /local/PDB. Each file has the form nnnn.pdb. Then you can direct the CSDS to use these files by setting CSDPDBFILESEARCHPATH to be

/local/PDB/%n.pdb

(CSDPDBHOME need not be set.)

3.6 General Examples


The following examples assume that you have the CSDS executables in your PATH (i.e. $CSDHOME/bin is in your PATH) and that you are using the csh.

3.6.1 Copying a subset of PDB files from CD-ROM to disk:

Mount the CSDS PDB CD-ROM, e.g. for a DEC Alpha AXP under OSF/1:

mount -t cdfs -r -o noversion /dev/rz12c /cdrom/PDB

(See Appendix 6.1 for more machine-specific examples.)

setenv CSDPDBHOME /cdrom/PDB

Move to the directory in which you wish the PDB files to reside. Now run pdbget:

pdbget

Type in the 4 character PDB codes you are interested in and end with a Control-D. e.g.

1ycc

1bct 0ace

1ae2

^D

You should now have the files pdb1ycc.ent.gz etc. in your current directory.

3.6.2 Displaying Protein Structure hits:

Mount the CSD System PDB CD-ROM and set the CSDPDBHOME environment variable as in Section 3.6.1 above. Now start RasMol and QUEST in separate windows:

rasmol &

quest -j proteins -db PDB

Then type (to QUEST):

term x

menu full

Go to the SEARCH menu and press HITALL and START buttons. As you progress through the PDB ASER database file you should see RasMol displaying the corresponding atomic structures (if any).

3.7 Troubleshooting


3.7.1 Problem: pdbget prints the message:

/usr/local/bin/pdbget.x: Warning both CSDPDBHOME and CSDPDBFILESEARCHPATH environment variables and no "-p path" was given on the command line.

pdbget will be unable to find the PDB files. Set CSDPDBHOME to the root directory containing the files (the mount point of the CD-ROM). Type (csh)

setenv CSDPDBHOME PDB_CDROM

or (sh , ksh)

CSDPDBHOME=PDB_CDROM ; export CSDPDBHOME

and retry.

3.7.2 Problem: pdbget can't find the files and prints the following message:

pdbget: can't find file for code nnnn

This can be due to a number of conditions:

CSDPDBHOME not set correctly (see problem above).

Files are not in standard directories or don't have standard names. Section 3.5 describes how pdbget finds a file given a 4 character PDB code.

3.7.3 Problem: quest doesn't seem to be communicating with RasMol

RasMol and QUEST communicate using the Tk inter-process registry. If a previous RasMol process has exited without removing itself from the registry (for example by crashing) then QUEST will be trying to communicate with a non-existent process. There are currently three ways to recover:

1) type the CSDS command ipcpurge.

(ipcpurge will remove any process that is not responding from the registry.)

2) before running QUEST set the environment variable RASMOL to "rasmol #2"

3) exit the X-windows system and then start it up again.

3.7.4 Problem: RasMol can't find the PDB files.

If QUEST and RasMol do not share the same filesystems (for example if QUEST is being run remotely), then the PDB filenames that QUEST finds might not exist for RasMol. There is no work-around for this; both QUEST and RasMol must have access to the PDB files.


CSD System Documentation files


The documentation available on the Software CD-ROM includes UNIX man pages, release notes and a major part of the main CSD System documentation in Hyper-Text Markup Language (HTML) format.

4.1 man Pages


UNIX man pages for QUEST, PLUTO, VISTA, GSTAT, PREQUEST, cdcopy, csdupdat, pdbget, cnvrtdb can be found in the directory CSDHOME/man. You can access the pages by typing (for example):

quest -help

or

man -M ${CSDHOME}/man quest

PostScript and plain text forms of these man pages exist in the CDROM/doc directory as *.ps and *.l files respectively.
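To avoid typing the -M option each time, the directory can be added to MANPATH. This is a sketch for sh or ksh that assumes CSDHOME is already set; the function name add_csd_manpath is hypothetical. Note that on some systems setting MANPATH replaces the default search path instead of extending it, so check your own man command.

```sh
#!/bin/sh
# Hypothetical helper: prepend $CSDHOME/man to MANPATH (sh, ksh).
add_csd_manpath() {
    if [ -z "${CSDHOME:-}" ]; then
        echo "CSDHOME is not set" >&2
        return 1
    fi
    # Keep any existing MANPATH entries after the CSD System directory.
    MANPATH="${CSDHOME}/man${MANPATH:+:$MANPATH}"
    export MANPATH
}

# Example (after setting CSDHOME):
#   add_csd_manpath && man quest
```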

4.2 Release Notes


The following release notes are included in the directory CDROM/doc or CAMBRIDGE/doc in PostScript (*.ps) and plain text (*.txt) format:

The PostScript files are provided in both A4 (rnv5xa4.ps) and US letter (rnv5xusl.ps) format.

4.3 HTML Documentation


The files included cover the following sections of the CSDS Documentation:

All HTML files are present in the directory html on the Software CD-ROM.

To simplify the copying of the HTML files from the Software CD-ROM to disk, the html directory has been tar'ed and compressed in the file thtml.Z. This will be found in the top directory of the Software CD-ROM. In some cases this name may appear as thtml.z.

When copied to hard disk these files will occupy approximately 5.4Mb of disk space.

To install the files on your hard disk:

1. Copy thtml.Z to the directory in which you wish the html directory to reside.

e.g., cp CDROM/thtml.Z /usr/cambridge/thtml.Z

2. Uncompress the file using either the UNIX uncompress command or gunzip; gunzip is available from various anonymous ftp sites around the world.

e.g., uncompress thtml.Z

This will replace file thtml.Z with thtml

3. Extract the files by typing

tar xvf thtml

A directory called html will be created in the directory to which you copied thtml.Z. You can now delete thtml. You can do all these steps at once using e.g.:

zcat CDROM/thtml.Z | (cd /usr/cambridge ; tar xBf - )
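In the one-line form above, tar will still run in the current directory if the cd fails. The sketch below guards against that. It is an illustration, not part of the distribution: the function name unpack_archive is hypothetical, and it uses "gzip -dc" in place of zcat, relying on gzip being able to read compress (.Z) files.

```sh
#!/bin/sh
# Hypothetical helper: unpack a compressed tar archive into a directory.
# Uses "gzip -dc", which reads both compress (.Z) and gzip formats.
unpack_archive() {
    archive=$1
    destdir=$2
    [ -r "$archive" ] || { echo "cannot read $archive" >&2; return 1; }
    [ -d "$destdir" ] || { echo "no such directory: $destdir" >&2; return 1; }
    # "&&" (not ";") so tar never runs in the wrong directory if cd fails.
    gzip -dc "$archive" | (cd "$destdir" && tar xf -)
}

# Example, with CDROM and /usr/cambridge standing in for your own paths:
#   unpack_archive CDROM/thtml.Z /usr/cambridge
```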

The root document is zdocmain.html. To access this file through Mosaic/Netscape either open it through the Open Local option of the File menu or start Mosaic/Netscape with the absolute path of this file given on the command line. If you want to access newly installed html files type, e.g.:

netscape CAMBRIDGE/html/zdocmain.html


Recompiling the CSD System


The C source code to the CSD System may be found in CDROM/source/*/cc subdirectories. More information about the source code can be found in the file README contained in the top directory of the Software CD-ROM.

5.1 Setting up the Directory


The following files must be present in CAMBRIDGE:

config.sh configuration script (file)

rc auxiliary configuration scripts (directory)

csds auxiliary ASCII files (directory)

If you have done a 'level 1' copy using the CSD System command cdcopy (described in Section 2.5) then you already have these files and directories. Otherwise type:

CDROM/bin/cdcopy -from CDROM -to CAMBRIDGE -level 0

In addition you need the source code itself. For convenience, the source directory is tar'ed and compressed in the file tsource.Z which can be found in the top directory of the Software CD-ROM. In some cases this name may appear as tsource.z. If you wish to copy the source code to disk then tsource.Z should be copied to the directory CAMBRIDGE. When uncompressed and untar'ed, the source directory will occupy approximately 32Mb of disk space.

To install the source code on your hard disk:

1. If tsource.Z does not exist in the directory CAMBRIDGE, copy tsource.Z to CAMBRIDGE, e.g.:

cp CDROM/tsource.Z CAMBRIDGE/tsource.Z

2. Uncompress the file using either the UNIX uncompress command or gunzip; gunzip is available from various anonymous ftp sites around the world. Type:

uncompress tsource.Z

This will replace tsource.Z with tsource

3. Extract the files by typing

tar xvf tsource

A directory called source will be created in the directory to which you copied tsource.Z. You can now delete tsource. You can do all these steps at once with:

zcat CDROM/tsource.Z | (cd CAMBRIDGE ; tar xBf - )

5.2 To recompile


1. Ensure that you have access to a C compiler (cc/gcc)

2. Move to the directory CAMBRIDGE.

3. If there are changes to be made to the source code as described on the README sheet accompanying these installation notes (if any) or as a result of a notified bug fix then edit the relevant file.

4. Run config.sh. Type:

sh config.sh

Type y when asked whether you want to compile the executables from source code and answer the subsequent questions. If you are unsure as to what the answer is, accept the default suggested answer.

You can run config.sh as many times as you like until you are satisfied.

You will be compiling C source code alone, so you should answer none when prompted for the name of your FORTRAN compiler!

5. When config.sh has finished, type make software from CAMBRIDGE.

(NB: we have found it useful to set the environment variable TMPDIR to point to a directory situated in a larger disk partition than that of /tmp. e.g. setenv TMPDIR . This is especially true of Silicon Graphics machines.)

If the make utility fails then it will terminate with words similar to *** Error code n where n is greater than 0. See Section 5.3 below.

To restart the make process at approximately the point where it failed type:

make restart

6. Make sure you are still in the CAMBRIDGE directory and type make install.
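The command sequence in the steps above can be previewed with a small wrapper before anything is run. This is a sketch only: the names run, rebuild_csds and DRY_RUN are hypothetical, and config.sh remains interactive when it is actually executed.

```sh
#!/bin/sh
# Hypothetical wrapper around the steps above. With DRY_RUN=1 (the default
# here) each command is only printed, so the sequence can be reviewed first.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# $1 is the CAMBRIDGE directory (the placeholder used throughout these notes).
# The body runs in a subshell so the caller's working directory is unchanged.
# The chain stops at the first failing step; after fixing the cause of a
# failed "make software", continue by hand with "make restart".
rebuild_csds() (
    cd "$1" &&
    run sh config.sh &&
    run make software &&
    run make install
)

# Example:
#   rebuild_csds CAMBRIDGE              # print the commands only
#   DRY_RUN=0 rebuild_csds CAMBRIDGE    # run them
```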

5.3 If "make software" fails.


The code supplied has been successfully compiled on different UNIX platforms using both K&R and ANSI-C compilers. On machines for which we have supplied executables the make software command should run cleanly to the end.

This section describes problems in four areas which may prevent you (re)creating the CSD System:

1) System Problems: the environment in which you make the executables.

2) Compilation Problems: F77 non-conformance.

3) Linking Problems: Assumptions about libraries available. Fortran-C interface.

4) Execution Problems: Fortran-C argument mismatches and database byteorder & recordlength problems.

Once you have found and corrected the problem you can restart the compilations by typing make restart. If you want to start at the very beginning again, retype make software.

5.3.1 System Problems:

5.3.1.1 Not Enough room:

Most common system problems are associated with lack of file space. While compiling and linking, the make software command will increase the size of the source directory to almost 60Mb. Check that a "filesystem full" line hasn't appeared on your console.
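Free space can be checked before starting the build. The sketch below assumes a df that accepts the POSIX -P and -k options (older systems may need a different invocation); the function names free_kb and has_room are hypothetical, and 60 is used as a conservative figure taken from the size quoted above.

```sh
#!/bin/sh
# Hypothetical helpers for checking free space before "make software".

# Print the free space, in kilobytes, of the filesystem holding directory $1.
# "df -P" requests the POSIX output format: one header line, then one data
# line whose fourth field is the available space.
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeed if directory $1 has at least $2 megabytes free.
has_room() {
    avail=$(free_kb "$1")
    [ -n "$avail" ] && [ "$avail" -ge $(( $2 * 1024 )) ]
}

# Example: check the current directory against the 60Mb quoted above.
if has_room . 60; then
    echo "at least 60Mb free here"
else
    echo "less than 60Mb free here" >&2
fi
```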

You can compile the CSD System incrementally, deleting the object '*.o' files after each executable or library has been built. Each directory source/*/cc/d_MACHINE has a makefile generated by config.sh such that simply typing "make" while in these directories should produce the executables specified by the directory name (e.g. typing make in the source/quest/cc/d_MACHINE directory will recompile the quest executables).

Start in the src/cc/d_MACHINE directory and build the library first and then move to each of the directories clib, quest/cc/d_MACHINE, pluto/cc/d_MACHINE, vista/cc/d_MACHINE, prequest/cc/d_MACHINE, and gstat/cc/d_MACHINE. Typing make in each of these directories will create the associated executables.

As you are compiling the C code, the 'f2c' library will also have to be created. Type make in the directories f2c/libf77 and f2c/libi77 just after you have created the library in src/cc.
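The incremental order described above can be written down as a short script. This is a sketch under stated assumptions: the function names are hypothetical, all directories are taken to be relative to the source directory, and the machine name passed in (shown as sgi in the example) is a stand-in for whatever replaces MACHINE on your system.

```sh
#!/bin/sh
# Hypothetical sketch of the incremental build order described above.
# $1 replaces MACHINE in the d_MACHINE directory names.
build_order() {
    m=$1
    # Library first, then the f2c libraries, then clib and each program.
    echo "src/cc/d_$m"
    echo "f2c/libf77"
    echo "f2c/libi77"
    echo "clib"
    for prog in quest pluto vista prequest gstat; do
        echo "$prog/cc/d_$m"
    done
}

# Run make in each directory under the source root $1 for machine $2,
# deleting object files after each successful build to save space.
# Directories that do not exist are skipped.
build_incrementally() {
    srcroot=$1
    machine=$2
    for dir in $(build_order "$machine"); do
        [ -d "$srcroot/$dir" ] || continue
        (cd "$srcroot/$dir" && make && rm -f ./*.o) || return 1
    done
    return 0
}

# Example:
#   build_incrementally CAMBRIDGE/source sgi
```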

5.3.1.2 /tmp filesystem full.

Many of the applications such as ar, cc etc. store temporary files in the /tmp directory. If this is located within a small filesystem then it may rapidly fill up. Again, check for a 'filesystem full' warning. For some utilities, notably for ar, you can redefine the temporary directory by setting an environment variable (such as TMPDIR) to point to a new (larger!) directory. Other programs will do the same thing with a '-tmp=newdir' or '-temp=newdir' command line directive. You will have to check the appropriate man pages.

5.3.1.3 No compiler....

For some networks, use of the compilers is regulated by a network license manager. Check that the network has remained up and that the C compiler is available. Note that the script config.sh requires the use of the C compiler.

5.3.1.4 make fails with error sh: ....... : not found.

If make fails with an error such as

sh: /usr/bin/ar: not found

*** Error code 1

then it probably means that you have not run config.sh. (config.sh creates the makefiles found in the source subdirectories from templates found in rc/makefils/ adding a "header" specific to the current target machine on which it is run.) The current makefiles are probably targeted for another machine-type from a previous run of config.sh.
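Before re-running config.sh it can be useful to confirm which tools are actually present on the current machine. The following sketch reports any that are missing; the function name missing_tools is hypothetical, and the tool list in the example is only a guess at what the makefiles need.

```sh
#!/bin/sh
# Hypothetical check: print each named tool that cannot be found.
# Names containing "/" are tested as paths; others are looked up in PATH.
missing_tools() {
    for tool in "$@"; do
        case "$tool" in
            */*) [ -x "$tool" ] || echo "$tool" ;;
            *)   command -v "$tool" >/dev/null 2>&1 || echo "$tool" ;;
        esac
    done
}

# Example: the path from the error message above plus some common build tools.
missing=$(missing_tools /usr/bin/ar make cc)
if [ -n "$missing" ]; then
    echo "not found:" $missing >&2
fi
```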

5.3.2 Compilation Problems

5.3.2.1 Header files missing

Some machines don't have any X11 header files (they may not have been included when the system was installed). Some copies are contained in the source/include directory. You will need to add -I../../include to the list of C compiler directives (CFLAGS) (best done by re-running config.sh). Please don't use these unless absolutely necessary.

5.3.3 Execution Problems

The problem described is an inability to read the database files.

5.3.3.1 Quest can't find/read the database.

If Quest can't find the database, go to Section 2.8 and find out how to edit the db.dbl file to make sure that the CSD System can locate the main database(s).


Appendices


6.1 Mounting the CD-ROM Locally


Most of the examples below assume that you will be mounting the CD-ROMs under the directory /cdrom. If this directory does not exist then first create it with the command:

mkdir /cdrom

If your machine does not appear below then consult your manual pages for the command mount. Type:

man mount

Remember to first insert the CD-ROM into the drive before typing any mount command.

6.1.1 Sun SunOS 4.1.x (Solaris 1.x)

mount -r -t hsfs /dev/sr0 /cdrom

6.1.2 Sun SunOS 5.x (Solaris 2.x)

If the volume manager daemon vold is running then the Software CD-ROM will be automatically mounted and CSDHOME will be /cdrom/csds_v516.

6.1.3 Silicon Graphics IRIX 4.x

mount -o ro -t iso9660 /dev/scsi/sc0d5l0 /cdrom

You can also start the daemon cdromd with:

cdromd -o ro -d /dev/scsi/sc0d5l0 /cdrom

and then just insert the CD-ROM into the drive. Use the eject command to unmount the CD-ROM.

6.1.4 Silicon Graphics IRIX 5.x/6.x

The Software CD-ROM will be automatically mounted on these systems.

6.1.5 IBM/RS6000 AIX 3.2

mount -r -v cdrfs /dev/cd0 /cdrom

6.1.6 DEC Alpha AXP OSF/1

mount -r -t cdfs -o noversion /dev/rz3c /cdrom

6.1.7 HP 700 HP-UX

mount -r -t cdfs /dev/dsk/c201d2s0 /cdrom
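The examples above can be collected into a single lookup. The sketch below only prints the command and never runs it. It assumes that "uname -s" reports SunOS, IRIX (or IRIX64), AIX, OSF1 and HP-UX on the respective systems; the function name mount_cmd is hypothetical, and the device names are the examples from this appendix, which will differ from machine to machine.

```sh
#!/bin/sh
# Hypothetical helper: print (not run) the mount command listed above for a
# given system name ($1) and release ($2), as reported by "uname -s" and
# "uname -r". Device names are the examples from this appendix.
mount_cmd() {
    case "$1:$2" in
        SunOS:4.*)  echo "mount -r -t hsfs /dev/sr0 /cdrom" ;;
        IRIX:4.*)   echo "mount -o ro -t iso9660 /dev/scsi/sc0d5l0 /cdrom" ;;
        AIX:*)      echo "mount -r -v cdrfs /dev/cd0 /cdrom" ;;
        OSF1:*)     echo "mount -r -t cdfs -o noversion /dev/rz3c /cdrom" ;;
        HP-UX:*)    echo "mount -r -t cdfs /dev/dsk/c201d2s0 /cdrom" ;;
        SunOS:5.*|IRIX*:[56].*)
                    echo "# normally mounted automatically; no command needed" ;;
        *)          echo "# no example for $1 $2; see: man mount"
                    return 1 ;;
    esac
}

# Show the suggestion for the machine this script is run on.
mount_cmd "$(uname -s)" "$(uname -r)" || true
```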

6.2 Remote mounting the CD-ROM


If you only have access to a CD-ROM drive through the network then first mount the CD-ROM locally on the remote machine. You can then use the mount command in the normal fashion to remote mount the directory under which the CD-ROM was mounted. For example, for a Sun running SunOS 4.1.x, type on the remote machine:

remote# mount -r -t hsfs /dev/sr0 /cdrom

Then on the local machine type:

local# mount remote:/cdrom /cdrom

If your network runs the network file system (NFS) you may choose to share the mounted CD-ROM on the remote machine. In this case adjust the dfstab or exports file on the remote machine appropriately.

6.3 The CSD System and X-windows


The Unix versions of quest, vista, prequest and pluto interface to an X-server for all their graphical input and output. This gives them special networking capabilities. For example you can run a database search on a remote machine while directing it via the display menus and mouse clicks on your local machine. Although both machines must be running an X-server (if you are using SUN's "Openwindows" or Silicon Graphics's 4DWM, for example, then you are already running an X-server), they do not have to be from the same vendor. For example, you can run quest on a SUN with the menus appearing on the screen of a Silicon Graphics workstation.

1. How to run quest, vista, prequest and pluto over the Network

Set permissions on your local machine by typing:

local% xhost +remote

Remote-login to the machine where the database and software are installed by using rlogin or telnet. e.g.

local% rlogin remote

Tell the remote machine where you want the graphical display sent by typing one of the two following lines:

remote% setenv DISPLAY local:0 (csh)

remote$ DISPLAY=local:0 ; export DISPLAY (sh, ksh)

quest may now be run on the remote machine. Type:

remote% quest -j mystructure

and in response to the quest prompt, type

term x

menu

You will get the quest display menu appearing on your local workstation screen. If you see the error:

Xlib: Connection to "0:0" refused by server

Xlib: Client is not authorized to connect to server.

then you may have changed your user ID (with su for example). Make sure that you are the user that originally logged in and started up the window manager.
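A quick check of DISPLAY on the remote machine catches the most common mistake before quest is started. This is a minimal sketch for sh or ksh; the function name display_ok is hypothetical, and it only tests the form of the value, not whether the X-server accepts the connection.

```sh
#!/bin/sh
# Hypothetical check to run on the remote machine before starting quest.
display_ok() {
    case "${DISPLAY:-}" in
        "")   echo "DISPLAY is not set" >&2; return 1 ;;
        *:*)  return 0 ;;                      # host:display, e.g. local:0
        *)    echo "DISPLAY=$DISPLAY has no :display part" >&2; return 1 ;;
    esac
}

# Example:
#   DISPLAY=local:0 ; export DISPLAY
#   display_ok && quest -j mystructure
```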

2. How to customise quest, vista, prequest and pluto

You have some control over the appearance that quest, vista, prequest and pluto present on the workstation screen. You can change the size of the display menu's window, its initial position, the type and colour of the cursor used within the window and the font seen on the menus and structure diagrams.

To do this type the following:

(csh)

% setenv XENVIRONMENT Xdefaults

(sh,ksh)

$ XENVIRONMENT=Xdefaults ; export XENVIRONMENT

Then create a file called Xdefaults with lines similar to the following:

quest*fontSchema: *-helvetica-bold-o-normal

quest*cursor-shape: XC_crosshair

quest*cursor-color: Green

quest*geometry: 600x400+100+200

pluto*fontMin: 2

pluto*fontMax: 12

An example Xdefaults file exists in the CSDHOME/csds directory. You can list the available fonts from your machine by typing:

% xlsfonts

Wildcards ("*") are permitted in the font preference. The cursor-shape integer selects a cursor from the cursor font (it must either be an even integer between 0 and 154 inclusive or a cursor font macro name such as XC_arrow). The cursor-color is a name from the standard Xlib colour library, a list of which you can find in /usr/openwin/lib/rgb.txt on a SUN and /usr/lib/X11/rgb.txt on Silicon Graphics and OSF. The geometry preference is given in pixels and has the following meaning for the initial window size

height x width + top_left_corner_x + top_left_corner_y

Remember: (0,0) is the top left corner of the screen and the preferred aspect ratio is height:width = 3130:4096. This is from the original Tektronix screen ratio.
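The arithmetic behind these preferences is simple enough to script. The sketch below computes a height that matches the preferred ratio for a chosen width, and checks a numeric cursor-shape against the rule given above. Both function names are hypothetical.

```sh
#!/bin/sh
# Hypothetical helpers for choosing values for the Xdefaults file.

# Print the window height matching the preferred 3130:4096 height:width
# ratio for a given width in pixels (integer arithmetic, rounded down).
height_for_width() {
    echo $(( $1 * 3130 / 4096 ))
}

# Succeed if $1 is an even integer between 0 and 154 inclusive, the numeric
# form of cursor-shape described above. Names such as XC_arrow are rejected
# here because they are not numbers.
valid_cursor_number() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty or not all digits
        0[0-9]*)     return 1 ;;   # leading zero
    esac
    [ "$1" -le 154 ] && [ $(( $1 % 2 )) -eq 0 ]
}

height_for_width 800    # prints 611
```

A width of 4096 gives a height of exactly 3130, the original Tektronix proportions.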

An alternative is to create a file .Xdefaults-hostname in your $HOME (or $LOGDIR) directory, where hostname is the name of the computer on which you are running quest, vista, prequest or pluto (i.e. the output of the hostname command).


[1] RasMol is distributed free with the CSD System with permission from its author Roger Sayle. A copy of the source code can be found in the directory source/external/rasmol.
[2] gzip is distributed free under the terms of the FSF license. A copy of the source code can be found in the directory source/external.