Analysis of the Kluyveromyces lactis Heat Shock Factor:
A Representative of the Helix-Turn-Helix
Transcription Factor Family










Evolution



To determine how the sequence of the Heat Shock Factor (HSF) protein has evolved over time, I identified protein sequences that are closely homologous to HSF using two different search tools: one that aligns proteins based on local sequence similarities (BLAST) and one that aligns sequences based on global sequence similarities (FASTA). After determining which protein sequences are most closely related to HSF, I searched for more distantly related sequences using PSI-BLAST, an iterated search program that builds a profile from the regions of the query protein with high homology and looks for other proteins containing related domains.
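The difference between scoring a local and a global alignment can be illustrated with a small dynamic-programming sketch. This is not how BLAST or FASTA are implemented (both rely on fast heuristics); it only shows the two scoring schemes on a pair of made-up peptides, with an assumed scoring of +1 for a match and -1 for a mismatch or gap.

```python
def global_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch score: every residue of a and b must be aligned or gapped."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            curr.append(max(prev[j - 1] + sub, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def local_score(a, b, match=1, mismatch=-1, gap=-1):
    """Smith-Waterman score: best-scoring pair of substrings, flanks are ignored."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            cell = max(0, prev[j - 1] + sub, prev[j] + gap, curr[j - 1] + gap)
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


# Hypothetical peptides sharing the motif WKFVRQ but with unrelated flanks.
seq_a = "AAAAWKFVRQ"
seq_b = "WKFVRQCCCC"
print("local :", local_score(seq_a, seq_b))
print("global:", global_score(seq_a, seq_b))
```

With these toy inputs the shared six-residue motif gives a local score of 6, while the global score is -2 because the unrelated flanks must be paid for with gaps. A local search therefore rewards a single well-conserved domain even when the rest of the two proteins has diverged.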


BLAST results



BLAST aligns sequences based on local sequence similarities.
The top ten proteins that share homology with Kluyveromyces lactis HSF are shown in order of significance.

Accession                 Protein                        E value

gb|AAA34689.1|            Saccharomyces cerevisiae HSF   6e-67
sp|Q02953|HSF_SCHPO       Schizosaccharomyces pombe HSF  6e-37
gb|AAD51329.1|AF172640_1  Rattus norvegicus HSF2         6e-35
ref|NP_032323.1|          Mus musculus HSF2              1e-34
ref|NP_004497.1|          Homo sapiens HSF2              2e-34
sp|P38530|HSF2_CHICK      Gallus gallus HSF2             2e-33
sp|P38531|HSF3_CHICK      Gallus gallus HSF3             5e-33
ref|NP_001529.1|          Homo sapiens HSF4              2e-32
sp|P41154|HSF_XENLA       Xenopus laevis HSF1            3e-32
gb|AAF80399.1|AF160966_1  Mus musculus HSF4              6e-32


FASTA results


FASTA aligns sequences based on global sequence similarities.
The top ten proteins that share homology with Kluyveromyces lactis HSF are shown in order of significance.
Entry          Protein                        E value

HSF_YEAST      Saccharomyces cerevisiae HSF   1.8e-21
HSF_SCHPO      Schizosaccharomyces pombe HSF  1.8e-19
HSF1_CHICK     Gallus gallus HSF1             3.3e-19
HSF2_HUMAN     Homo sapiens HSF2              1.5e-17
HSF2_RAT       Rattus norvegicus HSF2         3.6e-17
HSF2_MOUSE     Mus musculus HSF2              4.5e-17
HSF1_HUMAN-01  Homo sapiens HSF1              4.9e-17
HSF_XENLA      Xenopus laevis HSF1            9e-17
HSF4_HUMAN     Homo sapiens HSF4              1.1e-16
HSF3_CHICK     Gallus gallus HSF3             2.3e-16
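The two top-ten lists can be compared directly. The sketch below tallies which hits are shared and how far each shared hit moves in rank between the two searches. The labels are shortened species names taken from the tables above; one assumption is made: the FASTA entry HSF1_HUMAN-01 is treated as human HSF1, as its identifier suggests.

```python
# Top-ten hits in rank order, copied from the BLAST and FASTA tables.
blast_hits = [
    "S. cerevisiae HSF", "S. pombe HSF", "R. norvegicus HSF2",
    "M. musculus HSF2", "H. sapiens HSF2", "G. gallus HSF2",
    "G. gallus HSF3", "H. sapiens HSF4", "X. laevis HSF1",
    "M. musculus HSF4",
]
fasta_hits = [
    "S. cerevisiae HSF", "S. pombe HSF", "G. gallus HSF1",
    "H. sapiens HSF2", "R. norvegicus HSF2", "M. musculus HSF2",
    "H. sapiens HSF1",  # assumption: entry HSF1_HUMAN-01 is human HSF1
    "X. laevis HSF1", "H. sapiens HSF4", "G. gallus HSF3",
]

# Hits found by both searches, and hits unique to each search.
shared = [hit for hit in blast_hits if hit in fasta_hits]
blast_only = [hit for hit in blast_hits if hit not in fasta_hits]
fasta_only = [hit for hit in fasta_hits if hit not in blast_hits]

# Positive shift: the hit ranks lower in FASTA than in BLAST.
rank_shift = {hit: fasta_hits.index(hit) - blast_hits.index(hit) for hit in shared}

print("shared hits:", len(shared))
print("BLAST only :", blast_only)
print("FASTA only :", fasta_only)
for hit in shared:
    print(f"{hit:20s} rank shift {rank_shift[hit]:+d}")
```

Under that assumption, eight of the ten hits appear in both lists, and the two yeast sequences hold the first two positions in each.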


PSI-BLAST results


PSI-BLAST is an iterated alignment program capable of uncovering distantly related homologs.

The top five proteins that share homology with Kluyveromyces lactis HSF but are not considered to be Heat Shock Factors are shown in order of significance.
Accession             Protein                                                    E value

pir||T04552           Pherophorin-like protein from Arabidopsis thaliana         0.005
sp|P03866|YP2A_STAAU  Hypothetical 26.9 kDa protein from Staphylococcus aureus   0.006
ref|NP_055490.1|      KIAA0445 gene product from Homo sapiens                    0.011
dbj|BAB06300.1|       Unknown conserved protein from Bacillus halodurans         0.011
sp|P32164|YIIU_ECOLI  Hypothetical 9.6 kDa protein from Escherichia coli         0.012
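The idea behind an iterated, profile-based search can be sketched as follows: the hits from one round are aligned, turned into a position-specific scoring matrix, and that matrix is used to score candidates in the next round. The code below is a deliberately simplified illustration, not PSI-BLAST's actual procedure (which also weights sequences and uses substitution-matrix-based pseudocounts). The aligned peptides, the uniform background frequency, and the pseudocount of 1 are all assumptions made for the example.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(aligned, pseudocount=1.0):
    """Return one {residue: log-odds score} dict per alignment column."""
    background = 1.0 / len(AMINO_ACIDS)  # assumed uniform background
    pssm = []
    for column in zip(*aligned):
        counts = Counter(column)
        total = len(column) + pseudocount * len(AMINO_ACIDS)
        pssm.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return pssm


def profile_score(pssm, seq):
    """Sum the column scores for an ungapped sequence of the same length."""
    return sum(column[aa] for column, aa in zip(pssm, seq))


# Hypothetical ungapped hits from a first search round.
first_round_hits = ["WKFVR", "WRFVR", "WKFIR", "WKFVK"]
pssm = build_pssm(first_round_hits)

for candidate in ["WRFIK", "GGGGG"]:
    print(candidate, round(profile_score(pssm, candidate), 2))
```

A candidate such as WRFIK never appears among the toy hits, yet it scores well because each of its residues was seen at the matching position in at least one hit. This is how a profile can pick up relatives that a single-sequence comparison would miss.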


Both the local and global search programs reported almost all of the same sequences, but in a different order. This shows that local and global similarities differ somewhat between homologous sequences, and that both types of search largely agree on which sequences are the closest relatives. The PSI-BLAST search revealed three bacterial proteins that may be ancestral to the heat shock factor in yeast, and showed that a portion of the pherophorin proteins in plants may be distantly related to HSF.

After determining which protein sequences are most closely related to Kluyveromyces lactis HSF, I aligned several of them so that residues conserved across all HSF homologs could be found. Below is a multiple sequence alignment of the two most conserved regions of HSF: the "winged" helix-turn-helix DNA binding domain and the trimerization domain. These two domains are highly conserved in all HSF homologs, suggesting that DNA binding and multimerization are two of the most important functions of HSF.


Sequence alignment of the DNA binding domains from several HSF homologs



Sequence alignment of the trimerization domains from several HSF homologs
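Once sequences are aligned, finding the conserved residues amounts to scanning the alignment column by column. A minimal sketch, using a made-up three-sequence alignment rather than the actual HSF domains, and assuming gaps are written as "-":

```python
def conserved_columns(aligned):
    """Return 0-based indices of columns where every sequence has the same residue."""
    positions = []
    for index, column in enumerate(zip(*aligned)):
        residues = set(column)
        # A column counts as conserved only if it holds one residue and no gap.
        if len(residues) == 1 and "-" not in residues:
            positions.append(index)
    return positions


def conservation_line(aligned):
    """Mark conserved columns with '*' and all others with a space."""
    conserved = set(conserved_columns(aligned))
    return "".join("*" if i in conserved else " " for i in range(len(aligned[0])))


# Hypothetical aligned fragments of equal length.
alignment = [
    "MKWFVRQ",
    "MRWFIRQ",
    "M-WFVRN",
]

for row in alignment:
    print(row)
print(conservation_line(alignment))
print(conserved_columns(alignment))
```

Here only the columns that are identical in every sequence are flagged. Alignment viewers usually also mark columns with chemically similar residues, which this sketch does not attempt.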


Based on the multiple sequence alignment of HSF homologs, I generated a rooted phylogenetic tree of the HSF sequences using the neighbor-joining method, a distance-based approach that builds the tree by repeatedly joining the pair of sequences that keeps the total branch length smallest. The image below shows that this method of tree building grouped specific homologs together. For instance, all three of the yeast homologs were placed on one branch, which is linked to a second branch containing all of the plant homologs. These groupings suggest that homologs in the same cluster are more closely related to one another than to any of the other homologs in the tree. The tree also shows the closest common ancestor of any two homologs and how many times a "novel" homolog has arisen. For instance, the linkages between the vertebrate HSF homologs suggest that homolog 4 was the first to split off from the last common ancestor, followed by homolog 2 and finally homolog 3. Divergence from the common ancestor may suggest a novel functional role for the new homologs.
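The joining procedure described above can be sketched in a few lines. The code below is a minimal neighbor-joining implementation run on a small made-up distance matrix for five hypothetical sequences (a to e); it is not the program or the data used to build the HSF tree. It returns an unrooted list of edges; placing a root, for example with an outgroup, is a separate step that is not shown.

```python
def neighbor_joining(names, distances):
    """Return a list of (node, neighbor, branch_length) edges.

    distances maps frozenset({x, y}) to the distance between x and y.
    Internal nodes are given hypothetical labels node0, node1, ...
    """
    nodes = list(names)
    d = dict(distances)
    edges = []
    next_id = 0

    def dist(x, y):
        return d[frozenset((x, y))]

    while len(nodes) > 2:
        n = len(nodes)
        # Sum of distances from each node to all others.
        r = {x: sum(dist(x, y) for y in nodes if y != x) for x in nodes}
        # Pick the pair with the smallest Q value.
        best = None
        for i in range(n):
            for j in range(i + 1, n):
                x, y = nodes[i], nodes[j]
                q = (n - 2) * dist(x, y) - r[x] - r[y]
                if best is None or q < best[0]:
                    best = (q, x, y)
        _, x, y = best
        new = f"node{next_id}"
        next_id += 1
        # Branch lengths from the joined pair to their new parent node.
        len_x = 0.5 * dist(x, y) + (r[x] - r[y]) / (2 * (n - 2))
        len_y = dist(x, y) - len_x
        edges.append((x, new, len_x))
        edges.append((y, new, len_y))
        # Distances from the new node to every remaining node.
        for z in nodes:
            if z not in (x, y):
                d[frozenset((new, z))] = 0.5 * (dist(x, z) + dist(y, z) - dist(x, y))
        nodes = [z for z in nodes if z not in (x, y)] + [new]

    x, y = nodes
    edges.append((x, y, dist(x, y)))
    return edges


# Hypothetical pairwise distances between five sequences.
names = ["a", "b", "c", "d", "e"]
rows = {
    "a": [0, 5, 9, 9, 8],
    "b": [5, 0, 10, 10, 9],
    "c": [9, 10, 0, 8, 7],
    "d": [9, 10, 8, 0, 3],
    "e": [8, 9, 7, 3, 0],
}
distances = {
    frozenset((x, names[j])): rows[x][j]
    for x in names for j in range(len(names)) if names[j] != x
}

tree_edges = neighbor_joining(names, distances)
for child, parent, length in tree_edges:
    print(f"{child} - {parent}: {length}")
```

In this toy example a and b are joined first and d and e end up as neighbors, mirroring how closely related homologs are placed on the same branch. In practice the distances would be derived from the multiple sequence alignment.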