
The Internet Journal of Genomics and Proteomics ISSN: 1540-2630


Evolutionary implication of protein secondary structure among Archaea and Bacteria


P. Chellapandi Department of Bioinformatics, School of Life Sciences, Bharathidasan University
C. Karthigeyen Department of Biotechnology, School of Life Sciences, Bharathidasan University
S. Sivaramakrishnan Department of Biotechnology, School of Life Sciences, Bharathidasan University

Citation:  P. Chellapandi, C. Karthigeyen & S. Sivaramakrishnan: Evolutionary implication of protein secondary structure among Archaea and Bacteria. The Internet Journal of Genomics and Proteomics. 2009 Volume 4 Number 2

Keywords:  Evolution, Secondary structure, superfamily, Archaea, SOPMA, Metalloproteins

Abstract

Molecular structures and sequences are generally more revealing of evolutionary relationships than classical phenotypes, particularly among microorganisms. Archaea are a unique group of organisms that have diverged widely in their metabolic pathways and possess distinctive metabolic genes, particularly those involved in methanogenesis, osmoregulation, sulfur toxicity, metal detoxification and stress response. The physicochemical characteristics and secondary structures of selected archaeal protein superfamilies were compared within Archaea and with Bacteria. Many of the proteins showed no close similarity to bacterial proteins, but a few showed an evolutionary relationship to bacterial proteins with similar biochemical functions. Proteins involved in methanogenesis were highly specific to methanogens within the archaeal domain, whereas proteins responsible for carbon assimilation appeared to share ancestry among prokaryotes, judging by similarities in their secondary structural elements. CO dehydrogenase, 4-vinyl reductase, allophanate hydrolase, quinol oxidase, dihydrolipoamide dehydrogenase and NADH oxidase were somewhat similar to their bacterial homologs, which indicates wide diversification and mobilization of these families between Archaea and Bacteria during evolution. Another noteworthy finding is that the topoisomerase family of Archaea has undergone stable divergence and slow evolution. This work may help in understanding the evolutionary mechanisms of some key metabolic proteins in Archaea at the secondary structure level.


Introduction

The extremophilic nature of many Archaea has stimulated intense efforts to understand their physiological adaptations for living in extreme environments and to probe the potential biotechnological applications of their stable cellular components. Specific archaeal metabolites have also been purified and characterized, and some of them have potential industrial uses (Alquéres et al., 2007). About 85% of extremophiles belong to the Archaea, and these organisms can still adapt to new extreme environmental conditions because of their distinctive conserved and hydrophobic amino acids and their structural modulation. Among species of Archaea, there are a variety of metabolic regimes that differ greatly from the better-known metabolic pathways of Bacteria and eukaryotes (Brown and Doolittle, 1997; Apic et al., 2008; Ebenhöh et al., 2005). Some of these beneficial features might have either diverged within the archaeal domain or been transferred to other kingdoms through horizontal gene transfer. A key to modeling and understanding the evolutionary process is the identification and characterization of the constraints that act on a protein as it diverges. The limitations of widely used models of sequence evolution often prevent more refined and informative questions from being addressed (Goldman et al., 1998).

Generally, protein sequences are more conserved than nucleotide sequences, and protein structures are in turn more conserved than protein sequences. The amino acid variation among closely related protein structures can reveal the presence of structural constraints or plasticity. Secondary structure in proteins consists of local inter-residue interactions mediated by hydrogen bonds. The most common secondary structures are α-helices and β-sheets. Other helices, such as the 3₁₀ helix and π-helix, are calculated to have energetically favorable hydrogen-bonding patterns but are rarely if ever observed in natural proteins except at the ends of α-helices, because of unfavorable backbone packing in the center of the helix. Other extended structures, such as the polyproline helix and α-sheet, are rare in native-state proteins but are often hypothesized to be important protein folding intermediates. Tight turns and loose, flexible loops link the more "regular" secondary structure elements. The random coil is not a true secondary structure but a class of conformations that indicates an absence of regular secondary structure. Amino acids vary in their ability to form the various secondary structure elements. Proline and glycine are sometimes known as "helix breakers" because they disrupt the regularity of a helical backbone conformation; however, both have unusual conformational abilities and are commonly found in turns. Amino acids that prefer to adopt helical conformations in proteins include methionine, alanine, leucine, glutamate and lysine; by contrast, the large aromatic residues (tryptophan, tyrosine and phenylalanine) and Cβ-branched amino acids (isoleucine, valine and threonine) prefer to adopt β-strand conformations. However, these preferences are not strong enough to produce a reliable method of predicting secondary structure from sequence alone. An amino acid change at the protein surface usually produces only local rearrangements and can be stabilized by the reorganization of the solvent molecules. Moreover, buried residues in a protein are less affected by the environment external to that protein, which might be very different among different evolutionary lineages (Pietro et al., 1998). Selective constraints may therefore act to preserve protein structure (Jeffrey et al., 1996).

The ordering of the amino acids by their most frequent substitutions agrees with the conservation of α-helix, β-sheet and β-turn formation tendencies during evolution. The same correspondence has been demonstrated for the conservation of physicochemical properties in amino acid substitutions (Angélica Soto et al., 1985). Secondary structural similarities and identities may also influence the conservation of amino acids in a protein. A major change in the α-helix, β-turn or random coil content of a protein may affect the gain and loss of some metabolic genes over the course of divergence (Chothia and Lesk, 1986; Chothia and Gerstein, 1997; Gerstein, 1997). Only a stretch of amino acids in a protein confers its secondary structure conformation on the folding process. To better understand the effects of amino acid substitution in the catalytic and regulatory regions of a protein, structural biologists study and attempt to predict secondary structure (Pietro et al., 1998). Conservation in these respects therefore helps to define the evolutionary significance of a domain (Apic et al., 2008; Ebenhöh et al., 2005). In this context, we describe how the predicted secondary structure of selected superproteins reflects evolutionary relationships between Archaea and Bacteria. In addition, this work aimed to study the physicochemical properties of these proteins and to compare them with those of homologous proteins retrieved from both Bacteria and Archaea.

Materials and Methods

The sequences of 25 archaeal superproteins were retrieved from the NCBI database and used as queries to search for similar sequences from both Archaea and Bacteria with the NCBI BLASTp algorithm (Altschul et al., 1999) using default parameters. Comprehensive information on the protein sequences used in this study is presented in Table 1. The physicochemical features of these proteins, including molecular weight, theoretical pI, estimated half-life, instability index, aliphatic index and grand average of hydropathicity (GRAVY), were computed with the ProtParam tool (Gasteiger et al., 2005). The SOPMA tool (Geourjon and Deléage, 1995) was used to predict the secondary structure of the proteins. Each protein sequence was submitted to the server and run with default parameters (window width 15, similarity threshold 1 and number of states 4) to predict the percentages of α-helix, extended strand, β-turn and random coil. The result was displayed in a graphical interface, and the positions of the secondary structural elements were marked on the submitted sequence wherever suitable matches were found.
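Two of the ProtParam descriptors named above follow simple published formulas, and a minimal Python sketch may make them concrete. GRAVY is the mean Kyte-Doolittle hydropathy value over all residues, and the aliphatic index is computed from the mole percentages of Ala, Val, Ile and Leu with the weights 2.9 and 3.9. The function names and the example peptide are illustrative assumptions; this is not the ProtParam implementation and the peptide is not a sequence from this study.

```python
# Kyte-Doolittle hydropathy scale, the standard scale behind GRAVY.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq: str) -> float:
    """Grand average of hydropathicity: mean hydropathy per residue."""
    seq = seq.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(seq: str) -> float:
    """Aliphatic index: X(Ala) + 2.9*X(Val) + 3.9*(X(Ile) + X(Leu)),
    where X is the mole percentage of each residue."""
    seq = seq.upper()
    n = len(seq)

    def mole_percent(aa: str) -> float:
        return 100.0 * seq.count(aa) / n

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


if __name__ == "__main__":
    peptide = "MKVLAAGIVGLE"  # hypothetical peptide, for illustration only
    print(f"GRAVY: {gravy(peptide):.3f}")
    print(f"Aliphatic index: {aliphatic_index(peptide):.1f}")
```

Higher values of either descriptor indicate a more hydrophobic protein, which is how the two indices are read in the Results.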

Results

The physicochemical characteristics of the query proteins are listed in Table 1. Ile-tRNA synthetase had the largest molecular weight, whereas coenzyme M and F420 were the smallest proteins selected in this study. The molecular masses of the other proteins ranged from 214 to 638 kDa. Quinol oxidase I, CoP methyltransferase, topoisomerase VI and sarcosine oxidase were alkaline proteins, with pI values above 8.45, and the remaining proteins were acidic (pI 4.39 to 6.22). Quinol oxidase I, 4-vinyl reductase, DB synthetase and sarcosine oxidase had high aliphatic index and GRAVY values, indicating greater hydrophobicity. The other proteins had moderate hydrophobic-to-hydrophilic ratios.

As shown in Table 2, the homologous NADH reductase of Methanosarcina acetivorans C2A had the largest molecular weight. One of the domains of this NADH reductase/oxidase family may be similar to our query protein. Allophanate hydrolase, sarcosine oxidase, tRNA PU synthetase and the quinol oxidases were alkaline proteins, while the other proteins, except G-6-P synthetase, were acidic.

The hydrophobicity and aliphatic index values of many proteins ranged between 70 and 114, with a maximum of 121 for DB synthetase of Methanococcus maripaludis S2. Thus, the physicochemical characteristics of many superproteins closely resembled those of the same protein family or domains in Archaea.


Table 1: Physicochemical properties of our selected archaeal superproteins (query)


Table 2: Physicochemical properties of the best archaeal homologs obtained for our selected archaeal proteins (query)

As reported in Table 3, we obtained many bacterial hits that were similar to the archaeal superproteins at the primary structure level. The homologous tRNA PU synthetase of Rubrobacter xylanophilus DSM 9941 had the largest molecular weight, and the remaining proteins ranged from 213 to 637 kDa. RUBISCO, CoP methyltransferase, Ile-tRNA synthetase, topoisomerase VIB and DB synthetase were alkaline proteins (pI 7.77-9.24), while the other proteins were acidic. The hydrophobicity and aliphatic index values of a few proteins were above 100, with a maximum of 114 for the NADH reductase of Solibacter usitatus Ellin 6076. Accordingly, the physicochemical features of the archaeal proteins were much more similar to those of proteins from the archaeal domain than to those of bacterial proteins.


Table 3: Physicochemical properties of the best bacterial homologs obtained for our selected archaeal proteins (query)


Figure 1: Predicted secondary structure of the superproteins used in this work (Y axis represents amino acid length). Panels: CO dehydrogenase; MH cyclohydrolase; Coenzyme F420; Coenzyme M; RUBISCO; G-6-P synthetase; Quinone reductase; NADH oxidase; Quinol oxidase I; CoP methyltransferase; 4-Vinyl reductase; tRNA PU synthase B; Ile-tRNA synthetase; Topoisomerase VI B; Ribonuclease H; DB synthetase; DHL dehydrogenase; HN synthase; BA dehydrogenase; Choline dehydrogenase; Allophanate hydrolase; Sarcosine oxidase α SB; CoHD reductase; Acetyl transferase.

The secondary structure of these proteins was predicted with the SOPMA server, and graphical representations of the structural information, including helix, sheet, coil and turn, are depicted in Figure 1.

After prediction, the secondary structural elements were compared with those of the homologous proteins from Archaea and Bacteria (Tables 4 and 5). Random coil occupied the largest portion of the secondary structure, followed by α-helix. Within the archaeal proteins, the variation was 5-10% in α-helix and 20-25% in random coil. In contrast, there was little variation in extended strands, but many proteins showed major differences in β-turns.
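The quantitative comparison described here can be sketched in a few lines of Python. The sketch assumes that each four-state prediction is available as a string of one-letter codes (h for α-helix, e for extended strand, t for β-turn, c for random coil). That encoding, the function names and the toy strings are assumptions for illustration and are not taken from the study's data.

```python
from collections import Counter

# Assumed one-letter codes for a four-state prediction.
STATES = {
    "h": "alpha_helix",
    "e": "extended_strand",
    "t": "beta_turn",
    "c": "random_coil",
}


def composition(states: str) -> dict:
    """Return the percentage of residues assigned to each state."""
    states = states.lower()
    counts = Counter(states)
    unknown = set(counts) - set(STATES)
    if unknown:
        raise ValueError(f"unexpected state codes: {sorted(unknown)}")
    n = len(states)
    return {name: 100.0 * counts[code] / n for code, name in STATES.items()}


def composition_difference(query: str, homolog: str) -> dict:
    """Absolute difference, in percentage points, for each state."""
    q = composition(query)
    h = composition(homolog)
    return {name: abs(q[name] - h[name]) for name in q}


if __name__ == "__main__":
    query = "hhhhhhcccceeeettcc"    # toy prediction for a query protein
    homolog = "hhhhcccccceeeetttc"  # toy prediction for a homolog
    for name, diff in composition_difference(query, homolog).items():
        print(f"{name}: {diff:.1f} percentage points")
```

A comparison of this kind is purely quantitative: two proteins can have the same percentages while their elements sit at different sequence positions.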


Table 4: Comparative analysis of protein secondary structures of our archaeal superproteins and of the best archaeal homologs

In contrast to the similarities found at the primary structure level, secondary structure appeared to be more conserved among the archaeal proteins in this study.

Similarly, the secondary structures of these proteins were compared with those of homologous bacterial proteins, as shown in Table 5. The tRNA PU synthetase, choline dehydrogenase and allophanate hydrolase of Archaea differed most from the bacterial proteins in β-turn content, but the other proteins were closely related to their bacterial homologs. Apart from the proteins involved in carbon assimilation, other proteins, such as those involved in sulfur metabolism, bacterial photosynthesis and halo-adaptation, quantitatively resembled bacterial proteins. The variation was 10-15% in α-helix and 20-25% in random coil when compared within archaeal proteins. In contrast, there was major variation in extended strands, but many proteins were highly similar in α-helix content.


Table 5: Comparative analysis of protein secondary structures of our archaeal superproteins and of the best bacterial homologs

Discussion

The archaeal proteins rich in acidic amino acids that are involved in methanogenesis, osmoregulation, photorespiration and urea metabolism showed closer similarity to archaeal than to bacterial proteins. Similarly, the NADH reductase of Sulfolobus acidocaldarius, which is rich in alkaline amino acids, was homologous to that of Sulfolobus solfataricus but not to bacterial proteins. The isoelectric points of CoP methyltransferase, topoisomerase VIB, allophanate hydrolase and Ile-tRNA synthetase from Archaea did not match one another and appeared to be independent. Slesarev et al. (2002) found that the high content of negatively charged amino acids in M. kandleri proteins reflects how this methanogenic archaeon adapts to high intracellular salinity. Thus, changes in amino acid residues help to determine the physiological functions of archaeal proteins.

The instability index of CO dehydrogenase, 4-vinyl reductase, CoHD reductase, allophanate hydrolase, NADH reductase, quinol oxidase, DHL dehydrogenase, ribonuclease H and topoisomerase VI was above 40. These proteins are therefore predicted to be structurally unstable, which may have provided scope for protein family evolution during subsequent divergence. The aliphatic index of CoHD reductase showed close proximity between Bacteria and Archaea. There are many ways to categorize amino acids by chemical properties (e.g., hydrophobicity, charge, relative size of side chain), and physicochemical distances between amino acid types have been suggested (Grantham 1974; Taylor and Jones 1993), but these categorizations or physicochemical distances may not directly reflect the differences among amino acid types that are acted upon by evolution (Jeffrey et al., 1996). An amino acid is very frequently replaced by a physicochemically similar one. In the Dayhoff model, replacement rates were derived from alignments of protein sequences that are at least 85% identical. The alignments are used to build phylogenetic trees, and the internal nodes of the tree give inferred ancestral sequences (Dayhoff et al., 1972). Thus, this work suggests that an amino acid change can alter protein function, which in turn reorders the structural constraints.

Secondary structure elements represent regularities and basic building blocks of the architecture of a protein; thus, secondary structure is much more directly related to tertiary structure than primary structure is (Pietro et al., 1998). It is also important to note that secondary structural elements are more conserved than the precise atomic structure (Mizuguchi and Go, 1995) and that protein architecture depends on constraints related to bringing key residues close together in space. Similarly, a computational study of the protein sequences and structures of the superfamily of archaeo-eukaryotic primases has been reported by Iyer et al. (2005). The secondary structure results for some superprotein families were therefore grouped to address questions about the distributions of α-helix, extended strand, β-turn and random coil among Archaea and Bacteria. Although the superproteins showed about 95% similarity in extended strands between Archaea and Bacteria, the α-helix, β-turn and random coil configurations were not identical across all groups, which suggests that conformational variables have to be adjusted to stabilize protein structure during evolution.

The α-helix of ammonia monooxygenase, the β-turn of coenzyme M, CoP methyltransferase and quinol oxidase I, and the random coil of DB synthetase and Ile-tRNA synthetase showed more similarity to Archaea, but the secondary structural conformation of CO dehydrogenase showed no such similarity to Archaea except in α-helix. Because ribonuclease H and topoisomerase VIB possessed unique conformations and structural orientations, they appear to have diverged independently. These proteins are structurally related to archaeal rather than bacterial proteins, as their secondary structures are conserved in a specific orientation (Galagan et al., 2002). Although protein secondary structures evolve far more slowly than protein sequences, they do evolve. The major changes are at the boundaries of α-helices and β-sheets. This means that the amino acids at the boundaries of α-helices and β-sheets may experience replacement rates that are quite different from those experienced by residues in the middle of a structural element (Pietro et al., 1998). It is interesting that the α-helix rate estimate is greater than that for loops, but the biological significance is unclear. It may be the case that helices do evolve at greater rates than loops (Jeffrey et al., 1996).

Excluding extended strands, many of the conformations of NADH reductase and Ile-tRNA synthetase varied in the same way as those of the bacterial proteins. G-6-P synthetase and AMO differed from their archaeal homologs in α-helix and extended strand content. Topoisomerase VI and allophanate hydrolase in β-turn and random coil, BA dehydrogenase and ribonuclease H in α-helix and random coil, and 4-vinyl reductase in extended strand and β-turn showed maximum similarity to Bacteria. There were no considerable conformational changes in the secondary structure of HN synthase (β), sarcosine oxidase (β), acetyl transferase (thiolase) (α), quinol oxidase (heme-Cu oxidase) (extended) and choline dehydrogenase (random). Thus, natural selection acts both on the secondary structure elements, because of architectural constraints, and on a few critical residues directly involved in catalysis. These key residues are probably also responsible for compensatory variations and longer-range correlations in amino acid sequences (Pietro et al., 1998).

Genomes differ in their frequencies of supersecondary structures, with yeast having relatively more consecutive strands, Haemophilus having more consecutive helices, and Methanococcus having more alternating helix-strand structures (Chothia and Gerstein, 1997; Gerstein, 1997; Russell et al., 1996). Like methyl CoM reductase in methanogens, topoisomerase VIB also has a unique secondary structure. Topoisomerase VI could interact in its own domain with protein partners that are domain specific, which would prevent it from functioning correctly in a foreign domain (Apic et al., 2001; Gadelle et al., 2003). These results support the existence of remarkable metabolic and protein diversity in Archaea, which does not closely correspond to bacterial metabolism. In addition, one expects analysis of structure to reveal more about distant evolutionary relationships than sequence comparison alone, since structure is usually more conserved than sequence. The distinction between buried and exposed residues is particularly fruitful for molecular evolutionary studies because it allows consideration of protein regions undergoing different selection pressures. One of the most important advances in the reconstruction of evolutionary trees has been the consideration of heterogeneity of evolutionary rates among sequence sites (Yang, 1996). Nick et al. (1998) reported that rate heterogeneity is strongly associated with structural environment. Exposed sites tend to experience about twice the rate of amino acid replacement experienced by buried sites. A higher rate of replacement for exposed sites is seen for each secondary structure type. The association between accessibility status and replacement rates is a noteworthy feature of protein evolution but has received scant attention in the field of molecular evolution (Yang, 1996; Nick et al., 1998).

Reconstruction of phylogenies from sequences with known structures involves less uncertainty and is therefore expected to be more accurate than reconstruction of phylogenies from sequences with unknown structures (Jeffrey et al., 1996). Although the secondary structural elements of these proteins quantitatively resembled those of their bacterial homologs, they did not correspond qualitatively at specific sequence positions. However, key residues at functional positions, together with structural constraints, still maintain the functional activity and structural conservation of these proteins among Archaea and Bacteria. Thus, the results obtained in this work point to a strong need for a better understanding of microbial diversity, particularly in the archaeal domain, and of its evolutionary relationships. The present work may offer evolutionary biologists a new view of the diversity of protein superfamilies among Archaea and Bacteria with respect to quantitative secondary structure conformations during archaeal evolution.

Acknowledgement

The corresponding author is grateful to the University Grants Commission, New Delhi, India for financial assistance (UGC Sanction No. 32-559/2006) to carry out the work.

References

Alquéres, S.M.C., Almeida, R.V., Clementino, M.M., Vieira, R.P., Almeida, W.I., et al. 2007. Exploring the biotechnological applications in the archaeal domain. Brazilian J Microbiol 38: 398-405.
Altschul, S.F., Gish, W., Miller, W., Myers, E.W., Lipman, D.J., 1999. Basic local alignment search tool. J Mol Biol 215 (3): 403-410. PMID: 2231712
Angélica Soto, M., Aquiles, S., José, C.T., 1985. Conservation of the secondary structure of protein during evolution and the role of the genetic code. Origins Life Evol. Biosph. 16 (2): 157-164. DOI: 10.1007/BF01809469
Brown, J.R. and Doolittle, W.F., 1997. Archaea and the Prokaryote-to-Eukaryote Transition. Microbiol. Mol. Biol. Rev. 61 (4): 456–502. PMID: 9409149
Chothia, C. and Gerstein, M., 1997. How far can sequences diverge? Nature 385: 579-581.
Chothia, C. and Lesk, A.M., 1986. The relation between the divergence of sequence and structure in proteins. EMBO J. 5: 823-826. PMID: 3709526
Dayhoff, M.O., Eck, R.V., Park, C.M. 1972. A model of evolutionary change in proteins. In Dayhoff, M.O. (ed.), Atlas of protein sequence and structure. National Biomedical Research Foundation, Washington, D.C. Vol. 5, pp. 89-99.
Ebenhöh, O., Handorf, T. and Heinrich, R., 2005. A cross species comparison of metabolic network functions. Genome Informatics 16: 203-213. PMID: 16362923
Gadelle, D., Filée, J., Buhler, C. and Forterre, P., 2003. Phylogenomics of type II DNA topoisomerases. BioEssays 25: 232–242. PMID: 12596227
Galagan, J.E., Nusbaum, C., Roy, A., Endrizzi, M.G., Macdonald, P., FitzHugh, W., Calvo, S., Engels, R., Smirnov, S., Atnoor, D., et al. 2002. The genome of M. acetivorans reveals extensive metabolic and physiological diversity. Genome Res. 12 (4): 532–542. PMID: 11932238
Gasteiger, E., Hoogland, C., Gattiker, A., Duvaud, S., Wilkins, M.R., Appel, R.D., Bairoch, A. 2005. Protein Identification and Analysis Tools on the ExPASy Server. In John M. Walker (ed.), The Proteomics Protocols Handbook, Humana Press. pp. 571-607.
Geourjon, C. and Deléage, G., 1995. SOPMA: significant improvements in protein secondary structure prediction by consensus prediction from multiple alignments. Bioinformatics 11 (6): 681-684.
Gerstein, M., 1997. A structural census of genomes: comparing bacterial, eukaryotic, and archaeal genomes in terms of protein structure. J Mol Biol 274 (4): 562-576. PMID: 9417935
Gordana, A., Julian, G., Sarah, A.T. 2001. Domain Combinations in Archaeal, Eubacterial and Eukaryotic Proteomes. J. Mol Biol 310: 311-325. PMID: 11428892
Iyer, L.M., Koonin, E.V., Leipe, D.D. and Aravind, L., 2005. Origin and evolution of the archaeo-eukaryotic primase superfamily and related palm-domain proteins: structural insights and new members. Nucl. Acids Res. 33 (12): 3875–3896. PMID: 16027112
Mizuguchi, K., Go, N. 1995. Comparison of spatial arrangements of the secondary structure elements in proteins. Protein Eng 8: 353-362.
Nick, G., Jeffrey, L.T., David, T.J. 1998. Assessing the Impact of Secondary Structure and Solvent Accessibility on Protein Evolution. Genetics 149: 445–458.
Pietro, L., Nick, G., Jeffrey, L.T., David, T.J. 1998. PASSML: combining evolutionary inference and protein secondary structure prediction. Bioinformatics 14 (8): 726-733.
Russell F. Doolittle, Da-Fei Feng, Simon, T., Glen, C., Elizabeth, L. 1996. Determining divergence times of the major kingdoms of living organisms with a protein clock. Science 271 (5248): 470 – 477.
Slesarev, A.I., Mezhevaya, K.V., Makarova, K.S., Polushin, N.N., Shcherbinina, O.V., Shakhova, V.V., Belova, G.I., Aravind, L., Natale, D.A., Rogozin, I.B., et al. 2002. The complete genome of hyperthermophile Methanopyrus kandleri AV19 and monophyly of archaeal methanogens. Proc Natl Acad Sci 99: 4644–4649. PMID: 11930014
Taylor, W. R., Jones, D.T. 1993. Deriving an amino acid distance matrix. J Theor Bio. 164: 65-83.
Yang, Z., 1996. Among-site rate variation and its impact on phylogenetic analysis. Trends Ecol. Evol. 11: 367–372.
