Back to Table of Contents
Another statistical analysis tool available in Vista 2.0 is Principal Component Analysis (PCA). The basic method for generating Principal Component (PC) scores in version 2.0 is little different from that employed in Vista version 1.0, with the major exception that at the end of the PCA the results are stored in the spreadsheet itself. Previously, the results were discarded at the end of the PCA.
To perform a PCA on a set of parameters, select the relevant columns of the Spreadsheet and then press Generate P.C. Scores. Four or more parameters are required for PCA, and for the results to be statistically meaningful the parameters chosen should normally be derived from similar geometrical properties. For instance, it is meaningful to perform a PCA on the six torsion angles of a cyclohexane ring fragment, but it is not statistically meaningful to attempt a PCA on three torsion angles and three distances from the same dataset.
Upon completion of the PCA the user is given the opportunity to review the results before the PC scores are written to the spreadsheet. A pop-up with a brief summary of the PC scores is presented. Clicking on any of the items in the scrolling list will display the eigenvalue and % variance. The blue -/+ keys will move up/down the list sequentially.
If the PCA results seem satisfactory, hit OK and the PC scores will be added to the spreadsheet. Selecting CANCEL will abandon the PCA without writing the PC scores to the spreadsheet.
Once the PC scores are in the spreadsheet they can be analysed in the same fashion as any other spreadsheet parameter. Plotting functions such as the Scattergram can be used as well and are usually invaluable in the interpretation of the PCA as a whole.
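The quantities reported in the PCA summary (PC scores, eigenvalues and % variance) can be illustrated with a minimal sketch. This is not Vista's own code: it assumes a PCA on the covariance matrix of mean-centred parameters, and the function name pca_scores and the random torsion-angle data are hypothetical.

```python
import numpy as np

def pca_scores(data):
    """Return (scores, eigenvalues, percent_variance) for a rows-by-parameters array.

    Assumption: PCA on the covariance matrix of mean-centred columns.
    """
    x = np.asarray(data, dtype=float)
    centred = x - x.mean(axis=0)                 # remove each parameter's mean
    cov = np.cov(centred, rowvar=False)          # parameter-by-parameter covariance
    eigenvalues, eigenvectors = np.linalg.eigh(cov)
    order = np.argsort(eigenvalues)[::-1]        # largest variance first
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]
    scores = centred @ eigenvectors              # one PC score column per component
    percent_variance = 100.0 * eigenvalues / eigenvalues.sum()
    return scores, eigenvalues, percent_variance

# Hypothetical data: 20 fragments, six torsion angles each.
rng = np.random.default_rng(0)
torsions = rng.normal(loc=55.0, scale=5.0, size=(20, 6))
scores, eigenvalues, percent_variance = pca_scores(torsions)
```

Each column of scores corresponds to one new spreadsheet parameter, and plotting the first two columns against each other is the analogue of the Scattergram mentioned above.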
The current maximum number of parameters permissible in Vista 2.0 is 50. If adding the PC scores would cause the number of parameters to exceed this limit, an error message alerts the user.
One of the major new features developed for Vista 2.0 is the ability to create new parameters by application of mathematical functions to the parameters defined in the original Quest3D search.
There are two modes of operation: Create and Transform.
Create will generate a new numerical parameter, to be appended to the end of the spreadsheet. No parameters need to be selected prior to pressing Create.
Transform applies the input equation to the currently selected parameter, which must be a numerical parameter. On completion, the original parameter is replaced by the transformed parameter.
NFRAG and REFCOD cannot be transformed, as certain Vista 2.0 operations require this information to remain constant.
The current maximum number of parameters permissible in Vista 2.0 is 50. The user can create any number of new parameters up to this limit.
The equation can be constructed by pressing buttons on the calculator, by typing it in at the keyboard, or by a combination of the two. The buttons essentially mirror those currently available in the Quest transform menu.
The scrolling list is active: clicking on an item in the list inserts that item's text into the calculator display. If a parameter is being transformed, the selected parameter is highlighted to remind the user. In addition, the parameter being created/transformed also appears in the title bar (TRN09 in the example shown above).
When new parameters are being created, they are named, by default, TRNn, where n is the parameter number. If a parameter of this name already exists then it will currently be overwritten by the new parameter.
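The difference between Create and Transform, together with the rules on protected parameters, the 50-parameter limit and default naming, can be summarised in a short sketch. It is only an illustration: the Spreadsheet class is hypothetical, the two-digit zero padding is inferred from the name TRN09, and counting every column (including NFRAG and REFCOD) towards the limit is an assumption.

```python
MAX_PARAMETERS = 50                 # limit stated in the text
PROTECTED = {"NFRAG", "REFCOD"}     # parameters that may not be transformed

class Spreadsheet:
    """Hypothetical stand-in for the Vista spreadsheet: name -> column of values."""

    def __init__(self, columns):
        self.columns = dict(columns)    # insertion order = column order

    def create(self, func):
        """Append a parameter computed row by row; func receives a dict for each row."""
        # Assumption: n in TRNn is the new column's position, zero-padded to 2 digits.
        name = f"TRN{len(self.columns) + 1:02d}"
        if name not in self.columns and len(self.columns) >= MAX_PARAMETERS:
            raise ValueError("parameter limit of 50 would be exceeded")
        names = list(self.columns)
        rows = zip(*self.columns.values())
        # An existing parameter with the same name is overwritten, as in the text.
        self.columns[name] = [func(dict(zip(names, row))) for row in rows]
        return name

    def transform(self, name, func):
        """Replace a numerical parameter by func applied to each of its values."""
        if name in PROTECTED:
            raise ValueError(f"{name} cannot be transformed")
        values = self.columns[name]
        if not all(isinstance(v, (int, float)) for v in values):
            raise TypeError(f"{name} is not a numerical parameter")
        self.columns[name] = [func(v) for v in values]

# Hypothetical two-entry dataset with placeholder reference codes.
sheet = Spreadsheet({"REFCOD": ["AAAAAA", "BBBBBB"], "NFRAG": [1, 1], "TOR1": [60.0, -55.0]})
new_name = sheet.create(lambda row: abs(row["TOR1"]))           # appended as TRN04
sheet.transform("TOR1", lambda v: v + 360.0 if v < 0 else v)    # replaced in place
```

Create leaves the existing columns untouched and adds one more, whereas Transform overwrites the selected column, which is why only the latter needs the check on protected names.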
The equation can be up to 256 characters long. The calculator display can typically show up to 60 characters. The display can however be scrolled left and right to view strings longer than 60 characters. Using the left and right cursor keys, the I-beam cursor can also be scrolled through the equation text. In addition, clicking on characters will also re-position the cursor. Any text entered (whether by keyboard or by button presses) will be inserted at the current cursor position.
One feature which is new to transform is the availability of memory buttons. Pressing M will store the current display contents in a memory. MR will place a copy of the contents of the memory into the display, at the current cursor position. This feature could be particularly useful if you wish to apply the same/similar transform to many parameters. For instance:-
could be stored in memory. Only the contents of the inner-most brackets then need to be entered if MR is used to repeat this function several times.
Pressing MC will clear the contents of the memory.
Selecting C will clear the equation and hitting AC will clear the display and erase the memory contents.
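The behaviour of the display, cursor and memory buttons described above can be modelled as a small state machine. This is a sketch under stated assumptions rather than Vista's implementation: the Calculator class and its method names are hypothetical, and the function names sqrt and abs in the usage lines are placeholders, not confirmed Vista functions.

```python
MAX_EQUATION_LENGTH = 256   # maximum equation length stated in the text

class Calculator:
    """Hypothetical model of the transform calculator's display and memory."""

    def __init__(self):
        self.display = ""
        self.cursor = 0
        self.memory = ""

    def insert(self, text):
        """Insert text at the cursor, as for key presses and button presses."""
        if len(self.display) + len(text) > MAX_EQUATION_LENGTH:
            raise ValueError("equation would exceed 256 characters")
        self.display = self.display[:self.cursor] + text + self.display[self.cursor:]
        self.cursor += len(text)

    def move(self, offset):
        """Move the cursor left (negative) or right (positive), within the text."""
        self.cursor = max(0, min(len(self.display), self.cursor + offset))

    def m(self):
        self.memory = self.display      # M: store the display contents

    def mr(self):
        self.insert(self.memory)        # MR: copy memory in at the cursor

    def mc(self):
        self.memory = ""                # MC: clear the memory

    def c(self):
        self.display = ""               # C: clear the equation only
        self.cursor = 0

    def ac(self):
        self.c()                        # AC: clear the display and the memory
        self.mc()

calc = Calculator()
calc.insert("sqrt(abs())")   # placeholder function names
calc.m()                     # keep the template in memory
calc.c()                     # start a fresh equation
calc.mr()                    # recall the template
calc.move(-2)                # step back inside the inner-most brackets
calc.insert("TOR1")
equation = calc.display
```

After these steps the display holds sqrt(abs(TOR1)); repeating C, MR and the cursor move lets the same template be reused for another parameter with only the inner argument typed again.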
As with other pop-up menus, pressing OK or hitting RETURN will apply the transform. Any errors will be reported and the user is given another chance to rectify their mistake. CANCEL will abort the operation.
As mentioned previously, suppressed entries are effectively removed from the dataset, so the transform is applied only to those entries that are currently active. This is useful if you wish to apply the transform to a particular group of entries. An example would be a set of angles ranging from 0 to 180 degrees: those entries less than or equal to 90 degrees could be suppressed and the transform "angle=180-angle" applied to the rest.
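The angle example can be written out as a minimal sketch. The function name transform_active and the four sample angles are hypothetical; the suppression rule and the transform are those given in the text.

```python
def transform_active(values, suppressed, func):
    """Apply func only to entries that are not suppressed; leave the rest unchanged."""
    return [v if is_suppressed else func(v)
            for v, is_suppressed in zip(values, suppressed)]

angles = [30.0, 90.0, 120.0, 175.0]               # hypothetical angles in degrees
suppressed = [a <= 90.0 for a in angles]          # suppress entries <= 90 degrees
folded = transform_active(angles, suppressed, lambda a: 180.0 - a)
# folded is [30.0, 90.0, 60.0, 5.0]: every angle now lies in the range 0-90
```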