Secondary literature sources for PriCT_1
The following references were automatically generated.
- Brown SD, Babbitt PC
- New insights about enzyme evolution from large scale studies of sequence and structure relationships.
- J Biol Chem. 2014; 289: 30221-8
Understanding how enzymes have evolved offers clues about their structure-function relationships and mechanisms. Here, we describe evolution of functionally diverse enzyme superfamilies, each representing a large set of sequences that evolved from a common ancestor and that retain conserved features of their structures and active sites. Using several examples, we describe the different structural strategies nature has used to evolve new reaction and substrate specificities in each unique superfamily. The results provide insight about enzyme evolution that is not easily obtained from studies of one or only a few enzymes.
- Yutin N, Faure G, Koonin EV, Mushegian AR
- Chordopoxvirus protein F12 implicated in enveloped virion morphogenesis is an inactivated DNA polymerase.
- Biol Direct. 2014; 9: 22-22
Through the course of their evolution, viruses with large genomes have acquired numerous host genes, most of which perform functions in virus reproduction in a manner that is related to their original activities in the cells, but some are exapted for new roles. Here we report the unexpected finding that protein F12, which is conserved among the chordopoxviruses and is implicated in the morphogenesis of enveloped intracellular virions, is a derived DNA polymerase, possibly of bacteriophage origin, in which the polymerase domain and probably the exonuclease domain have been inactivated. Thus, F12 appears to present a rare example of a drastic, exaptive functional change in virus evolution. REVIEWERS: This article was reviewed by Frank Eisenhaber and Juergen Brosius.
- Berkner S, Hinojosa MP, Prangishvili D, Lipps G
- Identification of the minimal replicon and the origin of replication of the crenarchaeal plasmid pRN1.
- Microbiologyopen. 2014; 3: 688-701
We have determined the minimal replicon of the crenarchaeal plasmid pRN1. It consists of 3097 base pairs amounting to 58% of the genome of pRN1. The minimal replicon comprises replication operon orf56/orf904 coding for a transcriptional repressor and the replication protein of pRN1. An upstream region of 64 bp that contains the promoter of the replication operon is essential as well as 166 bp of sequence downstream of the orf904 gene. This region contains a putative transcriptional terminator and a 100 nucleotides long stem-loop structure. Only the latter structure was shown to be required for replication. In addition replication was sustained when the stem-loop was displaced to another part of the pRN1 sequence. By mutational analysis we also find that the integrity of the stem-loop structure is required to maintain the replication of pRN1-derived constructs. As similar stem-loop structures are also present in other members of the pRN family, we suggest that this conserved structural element could be the origin of replication for the pRN plasmids. Further bioinformatic analysis revealed that the domain structure of the replication protein and the presence of a similar stem-loop structure as the putative replication origin are also found in several bacteriophages.
- Kuron A et al.
- Evaluation of DNA primase DnaG as a potential target for antibiotics.
- Antimicrob Agents Chemother. 2014; 58: 1699-706
Mycobacteria contain genes for several DNA-dependent RNA primases, including dnaG, which encodes an essential replication enzyme that has been proposed as a target for antituberculosis compounds. An in silico analysis revealed that mycobacteria also possess archaeo-eukaryotic superfamily primases (AEPs) of unknown function. Using a homologous recombination system, we obtained direct evidence that wild-type dnaG cannot be deleted from the chromosome of Mycobacterium smegmatis without disrupting viability, even in backgrounds in which mycobacterial AEPs are overexpressed. In contrast, single-deletion AEP mutants or mutants defective for all four identified M. smegmatis AEP genes did not exhibit growth defects under standard laboratory conditions. Deletion of native dnaG in M. smegmatis was tolerated only after the integration of an extra intact copy of the M. smegmatis or Mycobacterium tuberculosis dnaG gene, under the control of chemically inducible promoters, into the attB site of the chromosome. M. tuberculosis and M. smegmatis DnaG proteins were overproduced and purified, and their primase activities were confirmed using radioactive RNA synthesis assays. The enzymes appeared to be sensitive to known inhibitors (suramin and doxorubicin) of DnaG. Notably, M. smegmatis bacilli appeared to be sensitive to doxorubicin and resistant to suramin. The growth and survival of conditional mutant mycobacterial strains in which DnaG was significantly depleted were only slightly affected under standard laboratory conditions. Thus, although DnaG is essential for mycobacterial viability, only low levels of protein are required for growth. This suggests that very efficient inhibition of enzyme activity would be required for mycobacterial DnaG to be useful as an antibiotic target.
- Makarova KS, Wolf YI, Forterre P, Prangishvili D, Krupovic M, Koonin EV
- Dark matter in archaeal genomes: a rich source of novel mobile elements, defense systems and secretory complexes.
- Extremophiles. 2014; 18: 877-93
Microbial genomes encompass a sizable fraction of poorly characterized, narrowly spread fast-evolving genes. Using sensitive methods for sequence comparison and protein structure prediction, we performed a detailed comparative analysis of clusters of such genes, which we denote "dark matter islands", in archaeal genomes. The dark matter islands comprise up to 20% of archaeal genomes and show remarkable heterogeneity and diversity. Nevertheless, three classes of entities are common in these genomic loci: (a) integrated viral genomes and other mobile elements; (b) defense systems, and (c) secretory and other membrane-associated systems. The dark matter islands in the genome of thermophiles and mesophiles show similar general trends of gene content, but thermophiles are substantially enriched in predicted membrane proteins whereas mesophiles have a greater proportion of recognizable mobile elements. Based on this analysis, we predict the existence of several novel groups of viruses and mobile elements, previously unnoticed variants of CRISPR-Cas immune systems, and new secretory systems that might be involved in stress response, intermicrobial conflicts and biogenesis of novel, uncharacterized membrane structures.
- Zarate-Perez F et al.
- Oligomeric properties of adeno-associated virus Rep68 reflect its multifunctionality.
- J Virol. 2013; 87: 1232-41
The adeno-associated virus (AAV) encodes four regulatory proteins called Rep. The large AAV Rep proteins Rep68 and Rep78 are essential factors required in almost every step of the viral life cycle. Structurally, they share two domains: a modified version of the AAA(+) domain that characterizes the SF3 family of helicases and an N-terminal domain that binds DNA specifically. The combination of these two domains imparts extraordinary multifunctionality to work as initiators of DNA replication and regulators of transcription, in addition to their essential role during site-specific integration. Although most members of the SF3 family form hexameric rings in vitro, the oligomeric nature of Rep68 is unclear due to its propensity to aggregate in solution. We report here a comprehensive study to determine the oligomeric character of Rep68 using a combination of methods that includes sedimentation velocity ultracentrifugation, electron microscopy, and hydrodynamic modeling. We have determined that residue Cys151 induces Rep68 to aggregate in vitro. We show that Rep68 displays a concentration-dependent dynamic oligomeric behavior characterized by the presence of two populations: one with monomers and dimers in slow equilibrium and a second one consisting of a mixture of multiple-ring structures of seven and eight members. The presence of either ATP or ADP induces formation of larger complexes formed by the stacking of multiple rings. Taken together, our results support the idea of a Rep68 molecule that exhibits the flexible oligomeric behavior needed to perform the wide range of functions occurring during the AAV life cycle.
- Rymer RU et al.
- Binding mechanism of metal·NTP substrates and stringent-response alarmones to bacterial DnaG-type primases.
- Structure. 2012; 20: 1478-89
Primases are DNA-dependent RNA polymerases found in all cellular organisms. In bacteria, primer synthesis is carried out by DnaG, an essential enzyme that serves as a key component of DNA replication initiation, progression, and restart. How DnaG associates with nucleotide substrates and how certain naturally prevalent nucleotide analogs impair DnaG function are unknown. We have examined one of the earliest stages in primer synthesis and its control by solving crystal structures of the S. aureus DnaG catalytic core bound to metal ion cofactors and either individual nucleoside triphosphates or the nucleotidyl alarmones, pppGpp and ppGpp. These structures, together with both biochemical analyses and comparative studies of enzymes that use the same catalytic fold as DnaG, pinpoint the predominant nucleotide-binding site of DnaG and explain how the induction of the stringent response in bacteria interferes with primer synthesis.
- Cohen-Gihon I, Fong JH, Sharan R, Nussinov R, Przytycka TM, Panchenko AR
- Evolution of domain promiscuity in eukaryotic genomes--a perspective from the inferred ancestral domain architectures.
- Mol Biosyst. 2011; 7: 784-92
Most eukaryotic proteins are composed of two or more domains. These assemble in a modular manner to create new proteins usually by the acquisition of one or more domains to an existing protein. Promiscuous domains which are found embedded in a variety of proteins and co-exist with many other domains are of particular interest and were shown to have roles in signaling pathways and mediating network communication. The evolution of domain promiscuity is still an open problem, mostly due to the lack of sequenced ancestral genomes. Here we use inferred domain architectures of ancestral genomes to trace the evolution of domain promiscuity in eukaryotic genomes. We find an increase in average promiscuity along many branches of the eukaryotic tree. Moreover, domain promiscuity can proceed at almost a steady rate over long evolutionary time or exhibit lineage-specific acceleration. We also observe that many signaling and regulatory domains gained domain promiscuity around the Bilateria divergence. In addition we show that those domains that played a role in the creation of two body axes and existed before the divergence of the bilaterians from fungi/metazoan achieve a boost in their promiscuities during the bilaterian evolution.
- Bilewitch JP, Degnan SM
- A unique horizontal gene transfer event has provided the octocoral mitochondrial genome with an active mismatch repair gene that has potential for an unusual self-contained function.
- BMC Evol Biol. 2011; 11: 228-228
BACKGROUND: The mitochondrial genome of the Octocorallia has several characteristics atypical for metazoans, including a novel gene suggested to function in DNA repair. This mtMutS gene is favored for octocoral molecular systematics, due to its high information content. Several hypotheses concerning the origins of mtMutS have been proposed, and remain equivocal, although current weight of support is for a horizontal gene transfer from either an epsilonproteobacterium or a large DNA virus. Here we present new and compelling evidence on the evolutionary origin of mtMutS, and provide the very first data on its activity, functional capacity and stability within the octocoral mitochondrial genome. RESULTS: The mtMutS gene has the expected conserved amino acids, protein domains and predicted tertiary protein structure. Phylogenetic analysis indicates that mtMutS is not a member of the MSH family and therefore not of eukaryotic origin. MtMutS clusters closely with representatives of the MutS7 lineage; further support for this relationship derives from the sharing of a C-terminal endonuclease domain that confers a self-contained mismatch repair function. Gene expression analyses confirm that mtMutS is actively transcribed in octocorals. Rates of mitochondrial gene evolution in mtMutS-containing octocorals are lower than in their hexacoral sister-group, which lacks the gene, although paradoxically the mtMutS gene itself has higher rates of mutation than other octocoral mitochondrial genes. CONCLUSIONS: The octocoral mtMutS gene is active and codes for a protein with all the necessary components for DNA mismatch repair. A lower rate of mitochondrial evolution, and the presence of a nicking endonuclease domain, both indirectly support a theory of self-sufficient DNA mismatch repair within the octocoral mitochondrion. The ancestral affinity of mtMutS to non-eukaryotic MutS7 provides compelling support for an origin by horizontal gene transfer. 
The immediate vector of transmission into octocorals can be attributed to either an epsilonproteobacterium in an endosymbiotic association or to a viral infection, although DNA viruses are not currently known to infect both bacteria and eukaryotes, nor mitochondria in particular. In consolidating the first known case of HGT into an animal mitochondrial genome, these findings suggest the need for reconsideration of the means by which metazoan mitochondrial genomes evolve.
- Zhao DL et al.
- Characterization of a cryptic plasmid pSM429 and its application for heterologous expression in psychrophilic Pseudoalteromonas.
- Microb Cell Fact. 2011; 10: 30-30
BACKGROUND: Pseudoalteromonas is an important genus widespread in marine environment, and a lot of psychrophilic Pseudoalteromonas strains thrive in deep sea and polar sea. By now, there are only a few genetic systems for Pseudoalteromonas reported and no commercial Pseudoalteromonas genetic system is available, which impedes the study of Pseudoalteromonas, especially for psychrophilic strains. The aim of this study is to develop a heterologous expression system for psychrophilic Pseudoalteromonas. RESULTS: A cryptic plasmid pSM429 isolated from psychrophilic Pseudoalteromonas sp. BSi20429 from the Arctic sea ice, was sequenced and characterized. The plasmid pSM429 is 3874 bp in length, with a G+C content of 28%. Four putative open reading frames (ORFs) were identified on pSM429. Based on homology, the ORF4 was predicted to encode a replication initiation (Rep) protein. A shuttle vector (Escherichia coli, Pseudoalteromonas), pWD, was constructed by ligating pSM429 and pUC19 and inserting a chloramphenicol acetyl transferase (CAT) cassette conferring chloramphenicol resistance. To determine the minimal replicon of pSM429 and to check the functionality of identified ORFs, various pWD derivatives were constructed. All derivatives except the two smallest ones were shown to allow replication in Pseudoalteromonas sp. SM20429, a plasmid-cured strain of Pseudoalteromonas sp. BSi20429, suggesting that the orf4 and its flanking intergenic regions are essential for plasmid replication. Although not essential, the sequence including some repeats between orf1 and orf2 plays important roles in segregational stability of the plasmid. With the aid of pWD-derived plasmid pWD2, the erythromycin resistance gene and the cd gene encoding the catalytic domain of a cold-adapted cellulase were successfully expressed in Pseudoalteromonas sp. SM20429. 
CONCLUSIONS: Plasmid pSM429 was isolated and characterized, and the regions essential for plasmid replication and stability were determined, helping the development of pSM429-based shuttle vectors. The shuttle vector pWD and its derivatives could be used as cloning vectors for Pseudoalteromonas, offering new perspectives in the genetic manipulation of Pseudoalteromonas strains. With the aid of pWD-derived vector and its host, the erythromycin resistance gene and the cd gene of a cold-adapted protein were successfully expressed, indicating the potential use of this system for recombinant protein production, especially for cold-adapted proteins.
- Hines JC, Ray DS
- A second mitochondrial DNA primase is essential for cell growth and kinetoplast minicircle DNA replication in Trypanosoma brucei.
- Eukaryot Cell. 2011; 10: 445-54
The mitochondrial DNA of trypanosomes contains two types of circular DNAs, minicircles and maxicircles. Both minicircles and maxicircles replicate from specific replication origins by unidirectional theta-type intermediates. Initiation of the minicircle leading strand and also that of at least the first Okazaki fragment involve RNA priming. The Trypanosoma brucei genome encodes two mitochondrial DNA primases, PRI1 and PRI2, related to the primases of eukaryotic nucleocytoplasmic large DNA viruses. These primases are members of the archeoeukaryotic primase superfamily, and each of them contains an RNA recognition motif and a PriCT-2 motif. In Leishmania species, PRI2 proteins are approximately 61 to 66 kDa in size, whereas in Trypanosoma species, PRI2 proteins have additional long amino-terminal extensions. RNA interference (RNAi) of T. brucei PRI2 resulted in the loss of kinetoplast DNA and accumulation of covalently closed free minicircles. Recombinant PRI2 lacking this extension (PRI2ΔNT) primes poly(dA) synthesis on a poly(dT) template in an ATP-dependent manner. Mutation of two conserved aspartate residues (PRI2ΔNTCS) resulted in loss of enzymatic activity but not loss of DNA binding. We propose that PRI2 is directly involved in initiating kinetoplast minicircle replication.
- Nasir A, Naeem A, Khan MJ, Nicora HD, Caetano-Anolles G
- Annotation of Protein Domains Reveals Remarkable Conservation in the Functional Make up of Proteomes Across Superkingdoms.
- Genes (Basel). 2011; 2: 869-911
The functional repertoire of a cell is largely embodied in its proteome, the collection of proteins encoded in the genome of an organism. The molecular functions of proteins are the direct consequence of their structure and structure can be inferred from sequence using hidden Markov models of structural recognition. Here we analyze the functional annotation of protein domain structures in almost a thousand sequenced genomes, exploring the functional and structural diversity of proteomes. We find there is a remarkable conservation in the distribution of domains with respect to the molecular functions they perform in the three superkingdoms of life. In general, most of the protein repertoire is spent in functions related to metabolic processes but there are significant differences in the usage of domains for regulatory and extra-cellular processes both within and between superkingdoms. Our results support the hypotheses that the proteomes of superkingdom Eukarya evolved via genome expansion mechanisms that were directed towards innovating new domain architectures for regulatory and extra/intracellular process functions needed for example to maintain the integrity of multicellular structure or to interact with environmental biotic and abiotic factors (e.g., cell signaling and adhesion, immune responses, and toxin production). Proteomes of microbial superkingdoms Archaea and Bacteria retained fewer numbers of domains and maintained simple and smaller protein repertoires. Viruses appear to play an important role in the evolution of superkingdoms. We finally identify few genomic outliers that deviate significantly from the conserved functional design. These include Nanoarchaeum equitans, proteobacterial symbionts of insects with extremely reduced genomes, Tenericutes and Guillardia theta. 
These organisms spend most of their domains on information functions, including translation and transcription, rather than on metabolism and harbor a domain repertoire characteristic of parasitic organisms. In contrast, the functional repertoire of the proteomes of the Planctomycetes-Verrucomicrobia-Chlamydiae superphylum was no different than the rest of bacteria, failing to support claims of them representing a separate superkingdom. In turn, Protista and Bacteria shared similar functional distribution patterns suggesting an ancestral evolutionary link between these groups.
- Williams RS, Kunkel TA
- FEN nucleases: bind, bend, fray, cut.
- Cell. 2011; 145: 171-2
In this issue, Orans et al. (2011) and Tsutakawa et al. (2011) report exciting insights into the molecular principles governing diverse endo- and exonucleolytic cleavage specificities of members of the RAD2/FEN superfamily of nucleases, which have critical roles in DNA replication and maintenance.
- Sanchez-Pulido L, Ponting CP
- Cdc45: the missing RecJ ortholog in eukaryotes?
- Bioinformatics. 2011; 27: 1885-8
DNA replication is one of the most ancient of cellular processes and functional similarities among its molecular machinery are apparent across all cellular life. Cdc45 is one of the essential components of the eukaryotic replication fork and is required for the initiation and elongation of DNA replication, but its molecular function is currently unknown. In order to trace its evolutionary history and to identify functional domains, we embarked on a computational sequence analysis of the Cdc45 protein family. Our findings reveal eukaryotic Cdc45 and prokaryotic RecJ to possess a common ancestry and Cdc45 to contain a catalytic site within a predicted exonuclease domain. The likely orthology between Cdc45 and RecJ reveals new lines of enquiry into DNA replication mechanisms in eukaryotes.
- Van Etten JL, Lane LC, Dunigan DD
- DNA viruses: the really big ones (giruses).
- Annu Rev Microbiol. 2010; 64: 83-99
Viruses with genomes greater than 300 kb and up to 1200 kb are being discovered with increasing frequency. These large viruses (often called giruses) can encode up to 900 proteins and also many tRNAs. Consequently, these viruses have more protein-encoding genes than many bacteria, and the concept of small particle/small genome that once defined viruses is no longer valid. Giruses infect bacteria and animals although most of the recently discovered ones infect protists. Thus, genome gigantism is not restricted to a specific host or phylogenetic clade. To date, most of the giruses are associated with aqueous environments. Many of these large viruses (phycodnaviruses and Mimiviruses) probably have a common evolutionary ancestor with the poxviruses, iridoviruses, asfarviruses, ascoviruses, and a recently discovered Marseillevirus. One issue that is perhaps not appreciated by the microbiology community is that large viruses, even ones classified in the same family, can differ significantly in morphology, lifestyle, and genome structure. This review focuses on some of these differences rather than on extensive details about individual viruses.
- Beck K, Vannini A, Cramer P, Lipps G
- The archaeo-eukaryotic primase of plasmid pRN1 requires a helix bundle domain for faithful primer synthesis.
- Nucleic Acids Res. 2010; 38: 6707-18
The plasmid pRN1 encodes for a multifunctional replication protein with primase, DNA polymerase and helicase activity. The minimal region required for primase activity encompasses amino-acid residues 40-370. While the N-terminal part of that minimal region (residues 47-247) folds into the prim/pol domain and bears the active site, the structure and function of the C-terminal part (residues 248-370) is unknown. Here we show that the C-terminal part of the minimal region folds into a compact domain with six helices and is stabilized by a disulfide bond. Three helices superimpose well with the C-terminal domain of the primase of the bacterial broad host range plasmid RSF1010. Structure-based site-directed mutagenesis shows that the C-terminal helix of the helix bundle domain is required for primase activity although it is distant to the active site in the crystallized conformation. Furthermore, we identified mutants of the C-terminal domain, which are defective in template binding, dinucleotide formation and conformation change prior to DNA extension.
- Vaithiyalingam S, Warren EM, Eichman BF, Chazin WJ
- Insights into eukaryotic DNA priming from the structure and functional interactions of the 4Fe-4S cluster domain of human DNA primase.
- Proc Natl Acad Sci U S A. 2010; 107: 13684-9
DNA replication requires priming of DNA templates by enzymes known as primases. Although DNA primase structures are available from archaea and bacteria, the mechanism of DNA priming in higher eukaryotes remains poorly understood in large part due to the absence of the structure of the unique, highly conserved C-terminal regulatory domain of the large subunit (p58C). Here, we present the structure of this domain determined to 1.7-Å resolution by X-ray crystallography. The p58C structure reveals a novel arrangement of an evolutionarily conserved 4Fe-4S cluster buried deeply within the protein core and is not similar to any known protein structure. Analysis of the binding of DNA to p58C by fluorescence anisotropy measurements revealed a strong preference for ss/dsDNA junction substrates. This approach was combined with site-directed mutagenesis to confirm that the binding of DNA occurs to a distinctively basic surface on p58C. A specific interaction of p58C with the C-terminal domain of the intermediate subunit of replication protein A (RPA32C) was identified and characterized by isothermal titration calorimetry and NMR. Restraints from NMR experiments were used to drive computational docking of the two domains and generate a model of the p58C-RPA32C complex. Together, our results explain functional defects in human DNA primase mutants and provide insights into primosome loading on RPA-coated ssDNA and regulation of primase activity.
- Yutin N, Wolf YI, Raoult D, Koonin EV
- Eukaryotic large nucleo-cytoplasmic DNA viruses: clusters of orthologous genes and reconstruction of viral genome evolution.
- Virol J. 2009; 6: 223-223
BACKGROUND: The Nucleo-Cytoplasmic Large DNA Viruses (NCLDV) comprise an apparently monophyletic class of viruses that infect a broad variety of eukaryotic hosts. Recent progress in isolation of new viruses and genome sequencing resulted in a substantial expansion of the NCLDV diversity, resulting in additional opportunities for comparative genomic analysis, and a demand for a comprehensive classification of viral genes. RESULTS: A comprehensive comparison of the protein sequences encoded in the genomes of 45 NCLDV belonging to 6 families was performed in order to delineate cluster of orthologous viral genes. Using previously developed computational methods for orthology identification, 1445 Nucleo-Cytoplasmic Virus Orthologous Groups (NCVOGs) were identified of which 177 are represented in more than one NCLDV family. The NCVOGs were manually curated and annotated and can be used as a computational platform for functional annotation and evolutionary analysis of new NCLDV genomes. A maximum-likelihood reconstruction of the NCLDV evolution yielded a set of 47 conserved genes that were probably present in the genome of the common ancestor of this class of eukaryotic viruses. This reconstructed ancestral gene set is robust to the parameters of the reconstruction procedure and so is likely to accurately reflect the gene core of the ancestral NCLDV, indicating that this virus encoded a complex machinery of replication, expression and morphogenesis that made it relatively independent from host cell functions. CONCLUSIONS: The NCVOGs are a flexible and expandable platform for genome analysis and functional annotation of newly characterized NCLDV. Evolutionary reconstructions employing NCVOGs point to complex ancestral viruses.
- Makarova KS, Wolf YI, van der Oost J, Koonin EV
- Prokaryotic homologs of Argonaute proteins are predicted to function as key components of a novel system of defense against mobile genetic elements.
- Biol Direct. 2009; 4: 29-29
BACKGROUND: In eukaryotes, RNA interference (RNAi) is a major mechanism of defense against viruses and transposable elements as well as of regulating translation of endogenous mRNAs. The RNAi systems recognize the target RNA molecules via small guide RNAs that are completely or partially complementary to a region of the target. Key components of the RNAi systems are proteins of the Argonaute-PIWI family some of which function as slicers, the nucleases that cleave the target RNA that is base-paired to a guide RNA. Numerous prokaryotes possess the CRISPR-associated system (CASS) of defense against phages and plasmids that is, in part, mechanistically analogous but not homologous to eukaryotic RNAi systems. Many prokaryotes also encode homologs of Argonaute-PIWI proteins but their functions remain unknown. RESULTS: We present a detailed analysis of Argonaute-PIWI protein sequences and the genomic neighborhoods of the respective genes in prokaryotes. Whereas eukaryotic Ago/PIWI proteins always contain PAZ (oligonucleotide binding) and PIWI (active or inactivated nuclease) domains, the prokaryotic Argonaute homologs (pAgos) fall into two major groups in which the PAZ domain is either present or absent. The monophyly of each group is supported by a phylogenetic analysis of the conserved PIWI-domains. Almost all pAgos that lack a PAZ domain appear to be inactivated, and the respective genes are associated with a variety of predicted nucleases in putative operons. An additional, uncharacterized domain that is fused to various nucleases appears to be a unique signature of operons encoding the short (lacking PAZ) pAgo form. By contrast, almost all PAZ-domain containing pAgos are predicted to be active nucleases. Some proteins of this group (e.g., that from Aquifex aeolicus) have been experimentally shown to possess nuclease activity, and are not typically associated with genes for other (putative) nucleases.
Given these observations, the apparent extensive horizontal transfer of pAgo genes, and their common, statistically significant over-representation in genomic neighborhoods enriched in genes encoding proteins involved in the defense against phages and/or plasmids, we hypothesize that pAgos are key components of a novel class of defense systems. The PAZ-domain containing pAgos are predicted to directly destroy virus or plasmid nucleic acids via their nuclease activity, whereas the apparently inactivated, PAZ-lacking pAgos could be structural subunits of protein complexes that contain, as active moieties, the putative nucleases that we predict to be co-expressed with these pAgos. All these nucleases are predicted to be DNA endonucleases, so it seems most probable that the putative novel phage/plasmid-defense system targets phage DNA rather than mRNAs. Given that in eukaryotic RNAi systems, the PAZ domain binds a guide RNA and positions it on the complementary region of the target, we further speculate that pAgos function on a similar principle (the guide being either DNA or RNA), and that the uncharacterized domain found in putative operons with the short forms of pAgos is a functional substitute for the PAZ domain. CONCLUSION: The hypothesis that pAgos are key components of a novel prokaryotic immune system that employs guide RNA or DNA molecules to degrade nucleic acids of invading mobile elements implies a functional analogy with the prokaryotic CASS and a direct evolutionary connection with eukaryotic RNAi. The predictions of the hypothesis including both the activities of pAgos and those of the associated endonucleases are readily amenable to experimental tests.
- Yutin N, Koonin EV
- Evolution of DNA ligases of nucleo-cytoplasmic large DNA viruses of eukaryotes: a case of hidden complexity.
- Biol Direct. 2009; 4: 51-51
BACKGROUND: Eukaryotic Nucleo-Cytoplasmic Large DNA Viruses (NCLDV) encode most if not all of the enzymes involved in their DNA replication. It has been inferred that genes for these enzymes were already present in the last common ancestor of the NCLDV. However, the details of the evolution of these genes that bear on the complexity of the putative ancestral NCLDV and on the evolutionary relationships between viruses and their hosts are not well understood. RESULTS: Phylogenetic analysis of the ATP-dependent and NAD-dependent DNA ligases encoded by the NCLDV reveals an unexpectedly complex evolutionary history. The NAD-dependent ligases are encoded only by a minority of NCLDV (including mimiviruses, some iridoviruses and entomopoxviruses) but phylogenetic analysis clearly indicated that all viral NAD-dependent ligases are monophyletic. Combined with the topology of the NCLDV tree derived by consensus of trees for universally conserved genes, this suggests that this enzyme was represented in the ancestral NCLDV. Phylogenetic analysis of ATP-dependent ligases that are encoded by chordopoxviruses, most of the phycodnaviruses and Marseillevirus failed to demonstrate monophyly and instead revealed an unexpectedly complex evolutionary trajectory. The ligases of the majority of phycodnaviruses and Marseillevirus seem to have evolved from bacteriophage or bacterial homologs; the ligase of one phycodnavirus, Emiliania huxleyi virus, belongs to the eukaryotic DNA ligase I branch; and ligases of chordopoxviruses unequivocally cluster with eukaryotic DNA ligase III. CONCLUSIONS: Examination of phyletic patterns and phylogenetic analysis of DNA ligases of the NCLDV suggest that the common ancestor of the extant NCLDV encoded an NAD-dependent ligase that most likely was acquired from a bacteriophage at the early stages of evolution of eukaryotes.
By contrast, ATP-dependent ligases from different prokaryotic and eukaryotic sources displaced the ancestral NAD-dependent ligase at different stages of subsequent evolution. These findings emphasize complex routes of viral evolution that become apparent through detailed phylogenomic analysis but not necessarily in reconstructions based on phyletic patterns of genes. REVIEWERS: This article was reviewed by: Patrick Forterre, George V. Shpakovski, and Igor B. Zhulin.
- Iyer LM, Abhiman S, Maxwell Burroughs A, Aravind L
- Amidoligases with ATP-grasp, glutamine synthetase-like and acetyltransferase-like domains: synthesis of novel metabolites and peptide modifications of proteins.
- Mol Biosyst. 2009; 5: 1636-60
Recent studies have shown that the ubiquitin system had its origins in ancient cofactor/amino acid biosynthesis pathways. Preliminary studies also indicated that conjugation systems for other peptide tags on proteins, such as pupylation, have evolutionary links to cofactor/amino acid biosynthesis pathways. Following up on these observations, we systematically investigated the non-ribosomal amidoligases of the ATP-grasp, glutamine synthetase-like and acetyltransferase folds by classifying the known members and identifying novel versions. We then established their contextual connections using information from domain architectures and conserved gene neighborhoods. This showed remarkable, previously uncharacterized functional links between diverse peptide ligases, several peptidases of unrelated folds and enzymes involved in synthesis of modified amino acids. Using the network of contextual connections we were able to predict numerous novel pathways for peptide synthesis and modification, amine-utilization, secondary metabolite synthesis and potential peptide-tagging systems. One potential peptide-tagging system, which is widely distributed in bacteria, involves an ATP-grasp domain and a glutamine synthetase-like ligase, both of which are circularly permuted, an NTN-hydrolase fold peptidase and a novel alpha helical domain. Our analysis also elucidates key steps in the biosynthesis of antibiotics such as friulimicin, butirosin and bacilysin and cell surface structures such as capsular polymers and teichuronopeptides. We also report the discovery of several novel ribosomally synthesized bacterial peptide metabolites that are cyclized via amide and lactone linkages formed by ATP-grasp enzymes. We present an evolutionary scenario for the multiple convergent origins of peptide ligases in various folds and clarify the bacterial origin of eukaryotic peptide-tagging enzymes of the TTL family.
- De Silva FS, Paran N, Moss B
- Products and substrate/template usage of vaccinia virus DNA primase.
- Virology. 2009; 383: 136-41
Vaccinia virus encodes a 90-kDa protein conserved in all poxviruses, with DNA primase and nucleoside triphosphatase activities. DNA primase products, synthesized with a single-stranded φX174 DNA template, were resolved as dinucleotides and long RNAs on denaturing polyacrylamide and agarose gels. Following phosphatase treatment, the dinucleotides GpC and ApC in a 4:1 ratio were identified by nearest neighbor analysis in which ³²P was transferred from [α-³²P]CTP to initiating purine nucleotides. Differences in the nucleotide binding sites for initiation and elongation were suggested by the absence of CpC and UpC dinucleotides as well as the inability of deoxynucleotides to mediate primer synthesis despite their incorporation into mixed RNA/DNA primers. Strong primase activity was detected with an oligo(dC) template. However, there was only weak activity with an oligo(dT) template and none with oligo(dA) or oligo(dG). The absence of stringent template specificity is consistent with a role for the enzyme in priming DNA synthesis at the replication fork.
- Thai V et al.
- Structural, biochemical, and in vivo characterization of the first virally encoded cyclophilin from the Mimivirus.
- J Mol Biol. 2008; 378: 71-86
Although multiple viruses utilize host cell cyclophilins, including severe acute respiratory syndrome (SARS) and human immunodeficiency virus type-1 (HIV-1), their role in infection is poorly understood. To help elucidate these roles, we have characterized the first virally encoded cyclophilin (mimicyp) derived from the largest virus discovered to date (the Mimivirus) that is also a causative agent of pneumonia in humans. Mimicyp adopts a typical cyclophilin-fold, yet it also forms trimers unlike any previously characterized homologue. Strikingly, immunofluorescence assays reveal that mimicyp localizes to the surface of the mature virion, as recently proposed for several viruses that recruit host cell cyclophilins such as SARS and HIV-1. Additionally, mimicyp lacks peptidyl-prolyl isomerase activity, in contrast to human cyclophilins. Thus, this study suggests that cyclophilins, whether recruited from host cells (i.e. HIV-1 and SARS) or virally encoded (i.e. Mimivirus), are localized on viral surfaces for at least a subset of viruses.
- Berger JM
- SnapShot: nucleic acid helicases and translocases.
- Cell. 2008; 134: 888-888
- Gu J, Lieber MR
- Mechanistic flexibility as a conserved theme across 3 billion years of nonhomologous DNA end-joining.
- Genes Dev. 2008; 22: 411-5
- Berthon J, Cortez D, Forterre P
- Genomic context analysis in Archaea suggests previously unrecognized links between DNA replication and translation.
- Genome Biol. 2008; 9: 71-71
BACKGROUND: Comparative analysis of genomes is valuable to explore evolution of genomes, deduce gene functions, or predict functional linking between proteins. Here, we have systematically analyzed the genomic environment of all known DNA replication genes in 27 archaeal genomes to infer new connections for DNA replication proteins from conserved genomic associations. RESULTS: Two distinct sets of DNA replication genes frequently co-localize in archaeal genomes: the first includes the genes for PCNA, the small subunit of the DNA primase (PriS), and Gins15; the second comprises the genes for MCM and Gins23. Other genomic associations of genes encoding proteins involved in informational processes that may be functionally relevant at the cellular level have also been noted; in particular, the association between the genes for PCNA, transcription factor S, and NudF. Surprisingly, a conserved cluster of genes coding for proteins involved in translation or ribosome biogenesis (S27E, L44E, aIF-2 alpha, Nop10) is almost systematically contiguous to the group of genes coding for PCNA, PriS, and Gins15. The functional relevance of this cluster encoding proteins conserved in Archaea and Eukarya is strongly supported by statistical analysis. Interestingly, the gene encoding the S27E protein, also known as metallopanstimulin 1 (MPS-1) in human, is overexpressed in multiple cancer cell lines. CONCLUSION: Our genome context analysis suggests specific functional interactions for proteins involved in DNA replication between each other or with proteins involved in DNA repair or transcription. Furthermore, it suggests a previously unrecognized regulatory network coupling DNA replication and translation in Archaea that may also exist in Eukarya.
- Berquist BR, DasSarma P, DasSarma S
- Essential and non-essential DNA replication genes in the model halophilic Archaeon, Halobacterium sp. NRC-1.
- BMC Genet. 2007; 8: 31-31
BACKGROUND: Information transfer systems in Archaea, including many components of the DNA replication machinery, are similar to those found in eukaryotes. Functional assignments of archaeal DNA replication genes have been primarily based upon sequence homology and biochemical studies of replisome components, but few genetic studies have been conducted thus far. We have developed a tractable genetic system for knockout analysis of genes in the model halophilic archaeon, Halobacterium sp. NRC-1, and used it to determine which DNA replication genes are essential. RESULTS: Using a directed in-frame gene knockout method in Halobacterium sp. NRC-1, we examined nineteen genes predicted to be involved in DNA replication. Preliminary bioinformatic analysis of the large haloarchaeal Orc/Cdc6 family, related to eukaryotic Orc1 and Cdc6, showed five distinct clades of Orc/Cdc6 proteins conserved in all sequenced haloarchaea. Of ten orc/cdc6 genes in Halobacterium sp. NRC-1, only two were found to be essential, orc10, on the large chromosome, and orc2, on the minichromosome, pNRC200. Of the three replicative-type DNA polymerase genes, two were essential: the chromosomally encoded B family, polB1, and the chromosomally encoded euryarchaeal-specific D family, polD1/D2 (formerly called polA1/polA2 in the Halobacterium sp. NRC-1 genome sequence). The pNRC200-encoded B family polymerase, polB2, was non-essential. Accessory genes for DNA replication initiation and elongation factors, including the putative replicative helicase, mcm, the eukaryotic-type DNA primase, pri1/pri2, the DNA polymerase sliding clamp, pcn, and the flap endonuclease, rad2, were all essential. Targeted genes were classified as non-essential if knockouts were obtained and essential based on statistical analysis and/or by demonstrating the inability to isolate chromosomal knockouts except in the presence of a complementing plasmid copy of the gene. 
CONCLUSION: The results showed that ten out of nineteen eukaryotic-type DNA replication genes are essential for Halobacterium sp. NRC-1, consistent with their requirement for DNA replication. The essential genes code for two of ten Orc/Cdc6 proteins, two out of three DNA polymerases, the MCM helicase, two DNA primase subunits, the DNA polymerase sliding clamp, and the flap endonuclease.
- Yamada T, Onimatsu H, Van Etten JL
- Chlorella viruses.
- Adv Virus Res. 2006; 66: 293-336
Chlorella viruses or chloroviruses are large, icosahedral, plaque-forming, double-stranded-DNA-containing viruses that replicate in certain strains of the unicellular green alga Chlorella. DNA sequence analysis of the 330-kbp genome of Paramecium bursaria chlorella virus 1 (PBCV-1), the prototype of this virus family (Phycodnaviridae), predicts approximately 366 protein-encoding genes and 11 tRNA genes. The predicted gene products of approximately 50% of these genes resemble proteins of known function, including many that are completely unexpected for a virus. In addition, the chlorella viruses have several features and encode many gene products that distinguish them from most viruses. These include: (1) multiple DNA methyltransferases and DNA site-specific endonucleases, (2) the enzymes required to glycosylate their proteins and synthesize polysaccharides such as hyaluronan and chitin, (3) a virus-encoded K+ channel (called Kcv) located in the internal membrane of the virions, (4) a SET domain containing protein (referred to as vSET) that dimethylates Lys27 in histone 3, and (5) three types of introns in PBCV-1: a self-splicing intron, a spliceosomal processed intron, and a small tRNA intron. Accumulating evidence indicates that the chlorella viruses have a very long evolutionary history. This review mainly deals with research on the virion structure, genome rearrangements, gene expression, cell wall degradation, polysaccharide synthesis, and evolution of PBCV-1 as well as other related viruses.
- Iyer LM, Balaji S, Koonin EV, Aravind L
- Evolutionary genomics of nucleo-cytoplasmic large DNA viruses.
- Virus Res. 2006; 117: 156-84
A previous comparative-genomic study of large nuclear and cytoplasmic DNA viruses (NCLDVs) of eukaryotes revealed the monophyletic origin of four viral families: poxviruses, asfarviruses, iridoviruses, and phycodnaviruses [Iyer, L.M., Aravind, L., Koonin, E.V., 2001. Common origin of four diverse families of large eukaryotic DNA viruses. J. Virol. 75 (23), 11720-11734]. Here we update this analysis by including the recently sequenced giant genome of the mimiviruses and several additional genomes of iridoviruses, phycodnaviruses, and poxviruses. The parsimonious reconstruction of the gene complement of the ancestral NCLDV shows that it was a complex virus with at least 41 genes that encoded the replication machinery, up to four RNA polymerase subunits, at least three transcription factors, capping and polyadenylation enzymes, the DNA packaging apparatus, and structural components of an icosahedral capsid and the viral membrane. The phylogeny of the NCLDVs is reconstructed by cladistic analysis of the viral gene complements, and it is shown that the two principal lineages of NCLDVs are comprised of poxviruses grouped with asfarviruses and iridoviruses grouped with phycodnaviruses-mimiviruses. The phycodna-mimivirus grouping was strongly supported by several derived shared characters, which seemed to rule out the previously suggested basal position of the mimivirus [Raoult, D., Audic, S., Robert, C., Abergel, C., Renesto, P., Ogata, H., La Scola, B., Suzan, M., Claverie, J.M. 2004. The 1.2-megabase genome sequence of Mimivirus. Science 306 (5700), 1344-1350]. These results indicate that the divergence of the major NCLDV families occurred at an early stage of evolution, prior to the divergence of the major eukaryotic lineages. 
It is shown that subsequent evolution of the NCLDV genomes involved lineage-specific expansion of paralogous gene families and acquisition of numerous genes via horizontal gene transfer from the eukaryotic hosts, other viruses, and bacteria (primarily, endosymbionts and parasites). Amongst the expansions, there are multiple families of predicted virus-specific signaling and regulatory domains. Most NCLDVs have also acquired large arrays of genes related to ubiquitin signaling, and the animal viruses in particular have independently evolved several defenses against apoptosis and immune response, including growth factors and potential inhibitors of cytokine signaling. The mimivirus displays an enormous array of genes of bacterial provenance, including a representative of a new class of predicted papain-like peptidases. It is further demonstrated that a significant number of genes found in NCLDVs also have homologs in bacteriophages, although a vertical relationship between the NCLDVs and a particular bacteriophage group could not be established. On the basis of these observations, two alternative scenarios for the origin of the NCLDVs and other groups of large DNA viruses of eukaryotes are considered. One of these scenarios posits an early assembly of an already large DNA virus precursor from which various large DNA viruses diverged through an ongoing process of displacement of the original genes by xenologous or non-orthologous genes from various sources. The second scenario posits convergent emergence, on multiple occasions, of large DNA viruses from small plasmid-like precursors through independent accretion of similar sets of genes due to strong selective pressures imposed by their life cycles and hosts.
- McGeoch AT, Bell SD
- Eukaryotic/archaeal primase and MCM proteins encoded in a bacteriophage genome.
- Cell. 2005; 120: 167-8
- Kato M, Ito T, Wagner G, Ellenberger T
- A molecular handoff between bacteriophage T7 DNA primase and T7 DNA polymerase initiates DNA synthesis.
- J Biol Chem. 2004; 279: 30554-62
The T7 DNA primase synthesizes tetraribonucleotides that prime DNA synthesis by T7 DNA polymerase but only on the condition that the primase stabilizes the primed DNA template in the polymerase active site. We used NMR experiments and alanine scanning mutagenesis to identify residues in the zinc binding domain of T7 primase that engage the primed DNA template to initiate DNA synthesis by T7 DNA polymerase. These residues cover one face of the zinc binding domain and include a number of aromatic amino acids that are conserved in bacteriophage primases. The phage T7 single-stranded DNA-binding protein gp2.5 specifically interfered with the utilization of tetraribonucleotide primers by interacting with T7 DNA polymerase and preventing a productive interaction with the primed template. We propose that the opposing effects of gp2.5 and T7 primase on the initiation of DNA synthesis reflect a sequence of mutually exclusive interactions that occur during the recycling of the polymerase on the lagging strand of the replication fork.
- Anantharaman V, Aravind L
- Novel conserved domains in proteins with predicted roles in eukaryotic cell-cycle regulation, decapping and RNA stability.
- BMC Genomics. 2004; 5: 45-45
BACKGROUND: The emergence of eukaryotes was characterized by the expansion and diversification of several ancient RNA-binding domains and the apparent de novo innovation of new RNA-binding domains. The identification of these RNA-binding domains may throw light on the emergence of eukaryote-specific systems of RNA metabolism. RESULTS: Using sensitive sequence profile searches, homology-based fold recognition and sequence-structure superpositions, we identified novel, divergent versions of the Sm domain in the Scd6p family of proteins. This family of Sm-related domains shares certain features of conventional Sm domains, which are required for binding RNA, in addition to possessing some unique conserved features. We also show that these proteins contain a second previously uncharacterized C-terminal domain, termed the FDF domain (after a conserved sequence motif in this domain). The FDF domain is also found in the fungal Dcp3p-like and the animal FLJ22128-like proteins, where it is fused to a C-terminal domain of the YjeF-N domain family. In addition to the FDF domains, the FLJ22128-like proteins contain yet another divergent version of the Sm domain at their extreme N-terminus. We show that the YjeF-N domains represent a novel version of the Rossmann fold that has acquired a set of catalytic residues and structural features that distinguish them from the conventional dehydrogenases. CONCLUSIONS: Several lines of contextual information suggest that the Scd6p family and the Dcp3p-like proteins are conserved components of the eukaryotic RNA metabolism system. We propose that the novel domains reported here, namely the divergent versions of the Sm domain and the FDF domain, may mediate specific RNA-protein and protein-protein interactions in cytoplasmic ribonucleoprotein complexes.
More specifically, the protein complexes containing Sm-like domains of the Scd6p family are predicted to regulate the stability of mRNA encoding proteins involved in cell cycle progression and vesicular assembly. The Dcp3p and FLJ22128 proteins may localize to the cytoplasmic processing bodies and possibly catalyze a specific processing step in the decapping pathway. The explosive diversification of Sm domains appears to have played a role in the emergence of several uniquely eukaryotic ribonucleoprotein complexes, including those involved in decapping and mRNA stability.
- Pei J, Sadreyev R, Grishin NV
- PCMA: fast and accurate multiple sequence alignment based on profile consistency.
- Bioinformatics. 2003; 19: 427-8
PCMA (profile consistency multiple sequence alignment) is a progressive multiple sequence alignment program that combines two different alignment strategies. Highly similar sequences are aligned in a fast way as in ClustalW, forming pre-aligned groups. The T-Coffee strategy is applied to align the relatively divergent groups based on profile-profile comparison and consistency. The scoring function for local alignments of pre-aligned groups is based on a novel profile-profile comparison method that is a generalization of the PSI-BLAST approach to profile-sequence comparison. PCMA balances speed and accuracy in a flexible way and is suitable for aligning large numbers of sequences. AVAILABILITY: PCMA is freely available for non-commercial use. Pre-compiled versions for several platforms can be downloaded from ftp://iole.swmed.edu/pub/PCMA/.
- Evguenieva-Hackenberg E, Walter P, Hochleitner E, Lottspeich F, Klug G
- An exosome-like complex in Sulfolobus solfataricus.
- EMBO Rep. 2003; 4: 889-93
We present the first experimental evidence for the existence of an exosome-like protein complex in Archaea. In Eukarya, the exosome is essential for many pathways of RNA processing and degradation. Co-immunoprecipitation with antibodies directed against the previously predicted Sulfolobus solfataricus orthologue of the exosome subunit ribosomal-RNA-processing protein 41 (Rrp41) led to the purification of a 250-kDa protein complex from S. solfataricus. Approximately half of the complex cosediments with ribosomal subunits. It comprises four previously predicted orthologues of the core exosome subunits from yeast (Rrp41, Rrp42, Rrp4 and Csl4 (cep1 synthetic lethality 4; an RNA-binding protein and exosome subunit)), whereas other predicted subunits were not found. Surprisingly, the archaeal homologue of the bacterial DNA primase DnaG was tightly associated with the complex. This suggests an RNA-related function for the archaeal DnaG-like proteins. Comparison of experimental data from different organisms shows that the minimal core of the exosome consists of at least one phosphate-dependent ribonuclease PH homologue, and of Rrp4 and Csl4. Such a protein complex was probably present in the last common ancestor of Archaea and Eukarya.
- Makarova KS, Aravind L, Grishin NV, Rogozin IB, Koonin EV
- A DNA repair system specific for thermophilic Archaea and bacteria predicted by genomic context analysis.
- Nucleic Acids Res. 2002; 30: 482-96
During a systematic analysis of conserved gene context in prokaryotic genomes, a previously undetected, complex, partially conserved neighborhood consisting of more than 20 genes was discovered in most Archaea (with the exception of Thermoplasma acidophilum and Halobacterium NRC-1) and some bacteria, including the hyperthermophiles Thermotoga maritima and Aquifex aeolicus. The gene composition and gene order in this neighborhood vary greatly between species, but all versions have a stable, conserved core that consists of five genes. One of the core genes encodes a predicted DNA helicase, often fused to a predicted HD-superfamily hydrolase, and another encodes a RecB family exonuclease; three core genes remain uncharacterized, but one of these might encode a nuclease of a new family. Two more genes that belong to this neighborhood and are present in most of the genomes in which the neighborhood was detected encode, respectively, a predicted HD-superfamily hydrolase (possibly a nuclease) of a distinct family and a predicted, novel DNA polymerase. Another characteristic feature of this neighborhood is the expansion of a superfamily of paralogous, uncharacterized proteins, which are encoded by at least 20-30% of the genes in the neighborhood. The functional features of the proteins encoded in this neighborhood suggest that they comprise a previously undetected DNA repair system, which, to our knowledge, is the first repair system largely specific for thermophiles to be identified. This hypothetical repair system might be functionally analogous to the bacterial-eukaryotic system of translesion, mutagenic repair whose central components are DNA polymerases of the UmuC-DinB-Rad30-Rev1 superfamily, which typically are missing in thermophiles.
- Campos-Olivas R, Louis JM, Clerot D, Gronenborn B, Gronenborn AM
- The structure of a replication initiator unites diverse aspects of nucleic acid metabolism.
- Proc Natl Acad Sci U S A. 2002; 99: 10310-5
Rolling circle replication is a mechanism for copying single-stranded genomes by means of double-stranded intermediates. A multifunctional replication initiator protein (Rep) is indispensable for the precise initiation and termination of this process. Despite the ubiquitous presence and fundamental importance of rolling circle replication elements, structural information on their respective replication initiators is still missing. Here we present the solution NMR structure of the catalytic domain of Rep, the initiator protein of tomato yellow leaf curl virus. It is composed of a central five-stranded anti-parallel beta-sheet, flanked by a small two-stranded beta-sheet, a beta-hairpin and two alpha-helices. Surprisingly, the structure reveals that the catalytic Rep domain is related to a large group of proteins that bind RNA or DNA. Identification of Rep as resembling the family of ribonucleoprotein/RNA-recognition motif fold proteins establishes a structure-based evolutionary link between RNA binding proteins, splicing factors, and replication initiators of prokaryotic and eukaryotic single-stranded DNA elements and mammalian DNA tumor viruses.
- Frick DN, Richardson CC
- DNA primases.
- Annu Rev Biochem. 2001; 70: 39-80
DNA primases are enzymes whose continual activity is required at the DNA replication fork. They catalyze the synthesis of short RNA molecules used as primers for DNA polymerases. Primers are synthesized from ribonucleoside triphosphates and are four to fifteen nucleotides long. Most DNA primases can be divided into two classes. The first class contains bacterial and bacteriophage enzymes found associated with replicative DNA helicases. These prokaryotic primases contain three distinct domains: an amino terminal domain with a zinc ribbon motif involved in binding template DNA, a middle RNA polymerase domain, and a carboxyl-terminal region that either is itself a DNA helicase or interacts with a DNA helicase. The second major primase class comprises heterodimeric eukaryotic primases that form a complex with DNA polymerase alpha and its accessory B subunit. The small eukaryotic primase subunit contains the active site for RNA synthesis, and its activity correlates with DNA replication during the cell cycle.
- Anantharaman V, Koonin EV, Aravind L
- Regulatory potential, phyletic distribution and evolution of ancient, intracellular small-molecule-binding domains.
- J Mol Biol. 2001; 307: 1271-92
Central cellular functions such as metabolism, solute transport and signal transduction are regulated, in part, via binding of small molecules by specialized domains. Using sensitive methods for sequence profile analysis and protein structure comparison, we exhaustively surveyed the protein sets from completely sequenced genomes for all occurrences of 21 intracellular small-molecule-binding domains (SMBDs) that are represented in at least two of the three major divisions of life (bacteria, archaea and eukaryotes). These included previously characterized domains such as PAS, GAF, ACT and ferredoxins, as well as three newly predicted SMBDs, namely the 4-vinyl reductase (4VR) domain, the NIFX domain and the 3-histidines (3H) domain. Although there are only a limited number of different superfamilies of these ancient SMBDs, they are present in numerous distinct proteins combined with various enzymatic, transport and signal-transducing domains. Most of the SMBDs show considerable evolutionary mobility and are involved in the generation of many lineage-specific domain architectures. Frequent re-invention of analogous architectures involving functionally related, but not homologous, domains was detected, such as fusion of different SMBDs to several types of DNA-binding domains to form diverse transcription regulators in prokaryotes and eukaryotes. This is suggestive of similar selective forces affecting the diverse SMBDs and resulting in the formation of multidomain proteins that fit a limited number of functional stereotypes. Using the "guilt by association" approach, the identification of SMBDs allowed prediction of functions and mode of regulation for a variety of previously uncharacterized proteins.
- Aravind L, Koonin EV
- Prokaryotic homologs of the eukaryotic DNA-end-binding protein Ku, novel domains in the Ku protein and prediction of a prokaryotic double-strand break repair system.
- Genome Res. 2001; 11: 1365-74
Homologs of the eukaryotic DNA-end-binding protein Ku were identified in several bacterial and one archaeal genome using iterative database searches with sequence profiles. Identification of prokaryotic Ku homologs allowed the dissection of the Ku protein sequences into three distinct domains: the Ku core that is conserved in eukaryotes and prokaryotes, a derived von Willebrand A domain that is fused to the amino terminus of the core in eukaryotic Ku proteins, and the newly recognized helix-extension-helix (HEH) domain that is fused to the carboxyl terminus of the core in eukaryotes and in one of the Ku homologs from the actinomycete Streptomyces coelicolor. The version of the HEH domain present in eukaryotic Ku proteins represents the previously described DNA-binding domain called SAP. The Ku homolog from S. coelicolor contains a distinct version of the HEH domain that belongs to a previously unnoticed family of nucleic-acid-binding domains, which also includes HEH domains from the bacterial transcription termination factor Rho, bacterial and eukaryotic lysyl-tRNA synthetases, bacteriophage T4 endonuclease VII, and several uncharacterized proteins. The distribution of the Ku homologs in bacteria coincides with that of the archaeal-eukaryotic-type DNA primase, and genes for prokaryotic Ku homologs form predicted operons with genes coding for an ATP-dependent DNA ligase and/or archaeal-eukaryotic-type DNA primase. Some of these operons additionally encode an uncharacterized protein that may function as nuclease or an Slx1p-like predicted nuclease containing a URI domain. A hypothesis is proposed that the Ku homolog, together with the associated gene products, comprises a previously unrecognized prokaryotic system for repair of double-strand breaks in DNA.
- Arguello-Astorga GR, Ruiz-Medrano R
- An iteron-related domain is associated to Motif 1 in the replication proteins of geminiviruses: identification of potential interacting amino acid-base pairs by a comparative approach.
- Arch Virol. 2001; 146: 1465-85
Geminiviruses encode a replication initiator protein, Rep, which binds in a sequence-specific fashion to iterated DNA motifs (iterons) functioning as essential elements for virus-specific replication. By using the iterons of more than one hundred geminiviruses as heuristic devices, we have identified a Rep subdomain 8 to 10 residues in length, whose primary structure varies among viruses harboring different iterons, but which is similar among viruses with identical iterons, regardless of their differences in host range, insect vector, geographical origin or genome structure. Close analysis of this iteron-related domain (IRD) revealed consistent correlations between specific Rep residues and defined nucleotides of its cognate iteron, thus providing important insights about the molecular code which dictates the Rep preference for specific DNA sequences. A model of potential Rep-iteron contacts is proposed. The identified IRD is adjacent to a conserved motif characteristic of a superfamily of rolling-circle (RC) replication proteins, and secondary structure predictions suggest that those Rep subdomains form together the core of a novel DNA-binding domain possessing a beta-sheet as recognition subdomain, which is apparently conserved in the replication proteins of nanoviruses, circoviruses, microviruses, and a variety of ssDNA plasmids of eubacteria, archaebacteria and red algae. The evolutionary implications of these findings are discussed.
- Wolf YI, Rogozin IB, Grishin NV, Tatusov RL, Koonin EV
- Genome trees constructed using five different approaches suggest new major bacterial clades.
- BMC Evol Biol. 2001; 1: 8-8
BACKGROUND: The availability of multiple complete genome sequences from diverse taxa prompts the development of new phylogenetic approaches, which attempt to incorporate information derived from comparative analysis of complete gene sets or large subsets thereof. Such attempts are particularly relevant because of the major role of horizontal gene transfer and lineage-specific gene loss, at least in the evolution of prokaryotes. RESULTS: Five largely independent approaches were employed to construct trees for completely sequenced bacterial and archaeal genomes: i) presence-absence of genomes in clusters of orthologous genes; ii) conservation of local gene order (gene pairs) among prokaryotic genomes; iii) parameters of identity distribution for probable orthologs; iv) analysis of concatenated alignments of ribosomal proteins; v) comparison of trees constructed for multiple protein families. All constructed trees support the separation of the two primary prokaryotic domains, bacteria and archaea, as well as some terminal bifurcations within the bacterial and archaeal domains. Beyond these obvious groupings, the trees made with different methods appeared to differ substantially in terms of the relative contributions of phylogenetic relationships and similarities in gene repertoires caused by similar life styles and horizontal gene transfer to the tree topology. The trees based on presence-absence of genomes in orthologous clusters and the trees based on conserved gene pairs appear to be strongly affected by gene loss and horizontal gene transfer. The trees based on identity distributions for orthologs and particularly the tree made of concatenated ribosomal protein sequences seemed to carry a stronger phylogenetic signal. The latter tree supported three potential high-level bacterial clades: i) Chlamydia-Spirochetes, ii) Thermotogales-Aquificales (bacterial hyperthermophiles), and iii) Actinomycetes-Deinococcales-Cyanobacteria. The latter group also appeared to join the low-GC Gram-positive bacteria at a deeper tree node. These new groupings of bacteria were supported by the analysis of alternative topologies in the concatenated ribosomal protein tree using the Kishino-Hasegawa test and by a census of the topologies of 132 individual groups of orthologous proteins. Additionally, the results of this analysis put into question the sister-group relationship between the two major archaeal groups, Euryarchaeota and Crenarchaeota, and suggest instead that Euryarchaeota might be a paraphyletic group with respect to Crenarchaeota. CONCLUSIONS: We conclude that, the extensive horizontal gene flow and lineage-specific gene loss notwithstanding, extension of phylogenetic analysis to the genome scale has the potential of uncovering deep evolutionary relationships between prokaryotic lineages.
- Burgers PM et al.
- Eukaryotic DNA polymerases: proposal for a revised nomenclature.
- J Biol Chem. 2001; 276: 43487-90
- Koonin EV, Wolf YI, Kondrashov AS, Aravind L
- Bacterial homologs of the small subunit of eukaryotic DNA primase.
- J Mol Microbiol Biotechnol. 2000; 2: 509-12
- Villarreal LP, DeFilippis VR
- A hypothesis for DNA viruses as the origin of eukaryotic replication proteins.
- J Virol. 2000; 74: 7079-84
The eukaryotic replicative DNA polymerases are similar to those of large DNA viruses of eukaryotic and bacterial T4 phages but not to those of eubacteria. We develop and examine the hypothesis that DNA virus replication proteins gave rise to those of eukaryotes during evolution. We chose the DNA polymerase from phycodnavirus (which infects microalgae) as the basis of this analysis, as it represents a virus of a primitive eukaryote. We show that it has significant similarity with replicative DNA polymerases of eukaryotes and certain of their large DNA viruses. Sequence alignment confirms this similarity and establishes the presence of highly conserved domains in the polymerase amino terminus. Subsequent reconstruction of a phylogenetic tree indicates that these algal viral DNA polymerases are near the root of the clade containing all eukaryotic DNA polymerase delta members but that this clade does not contain the polymerases of other DNA viruses. We consider arguments for the polarity of this relationship and present the hypothesis that the replication genes of DNA viruses gave rise to those of eukaryotes and not the reverse direction.
- Cuff JA, Clamp ME, Siddiqui AS, Finlay M, Barton GJ
- JPred: a consensus secondary structure prediction server.
- Bioinformatics. 1998; 14: 892-3
An interactive protein secondary structure prediction Internet server is presented. The server allows a single sequence or multiple alignment to be submitted, and returns predictions from six secondary structure prediction algorithms that exploit evolutionary information from multiple sequences. A consensus prediction is also returned which improves the average Q3 accuracy of prediction by 1% to 72.9%. The server simplifies the use of current prediction algorithms and allows conservation patterns important to structure and function to be identified. AVAILABILITY: http://barton.ebi.ac.uk/servers/jpred.html CONTACT: geoff@ebi.ac.uk
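The consensus step and the Q3 measure mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not say how the consensus is derived, so the per-residue majority vote, the tie-breaking rule, and the function names `consensus` and `q3` are assumptions made for illustration, not the server's actual method. Q3 is taken here in its usual sense: the percentage of residues assigned to the correct one of three states (H helix, E strand, C coil).

```python
from collections import Counter


def consensus(predictions):
    """Per-residue majority vote over equal-length strings of H/E/C states."""
    if len({len(p) for p in predictions}) != 1:
        raise ValueError("predictions must be non-empty and of equal length")
    result = []
    for column in zip(*predictions):
        counts = Counter(column)
        top = max(counts.values())
        winners = [state for state, n in counts.items() if n == top]
        # Assumption: a tie between states falls back to coil ('C').
        result.append(winners[0] if len(winners) == 1 else "C")
    return "".join(result)


def q3(predicted, observed):
    """Percentage of residues whose three-state assignment is correct."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


# Hypothetical outputs of three prediction methods for a 5-residue peptide.
predictions = ["HHHCC", "HHECC", "HEECC"]
combined = consensus(predictions)
# Hypothetical observed structure used as the reference.
accuracy = q3(combined, "HHHCC")
```

With these toy inputs the vote disagrees with the reference at one of five positions. A real server would weight methods and handle alignments, which this sketch does not attempt.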
- Dandekar T, Snel B, Huynen M, Bork P
- Conservation of gene order: a fingerprint of proteins that physically interact.
- Trends Biochem Sci. 1998; 23: 324-8
A systematic comparison of nine bacterial and archaeal genomes reveals a low level of gene-order (and operon architecture) conservation. Nevertheless, a number of gene pairs are conserved. The proteins encoded by conserved gene pairs appear to interact physically. This observation can therefore be used to predict functions of, and interactions between, prokaryotic gene products.
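The idea of detecting conserved gene pairs can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: each genome is represented as an ordered list of orthologous-group labels, strand and chromosome circularity are ignored, and the function name `conserved_gene_pairs`, the threshold parameter `min_genomes`, and the toy genomes are all hypothetical.

```python
from collections import Counter


def conserved_gene_pairs(genomes, min_genomes):
    """Return adjacent gene pairs found in at least `min_genomes` genomes.

    `genomes` maps a genome name to its gene order, given as a list of
    orthologous-group labels. Pairs are unordered, so A-B equals B-A.
    """
    counts = Counter()
    for order in genomes.values():
        # Use a set so each pair is counted at most once per genome.
        pairs = {frozenset(p) for p in zip(order, order[1:]) if p[0] != p[1]}
        counts.update(pairs)
    return {tuple(sorted(p)) for p, n in counts.items() if n >= min_genomes}


# Hypothetical gene orders for three genomes.
genomes = {
    "genome1": ["A", "B", "C", "D"],
    "genome2": ["B", "A", "X", "C"],
    "genome3": ["D", "C", "A", "B"],
}
in_all = conserved_gene_pairs(genomes, min_genomes=3)
in_two = conserved_gene_pairs(genomes, min_genomes=2)
```

Here the pair A-B is adjacent in all three toy genomes and C-D in two of them. Pairs that pass such a filter would be the candidates for predicted physical interaction.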
- Altschul SF et al.
- Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.
- Nucleic Acids Res. 1997; 25: 3389-402
The BLAST programs are widely used tools for searching protein and DNA databases for sequence similarities. For protein comparisons, a variety of definitional, algorithmic and statistical refinements described here permits the execution time of the BLAST programs to be decreased substantially while enhancing their sensitivity to weak similarities. A new criterion for triggering the extension of word hits, combined with a new heuristic for generating gapped alignments, yields a gapped BLAST program that runs at approximately three times the speed of the original. In addition, a method is introduced for automatically combining statistically significant alignments produced by BLAST into a position-specific score matrix, and searching the database using this matrix. The resulting Position-Specific Iterated BLAST (PSI-BLAST) program runs at approximately the same speed per iteration as gapped BLAST, but in many cases is much more sensitive to weak but biologically relevant sequence similarities. PSI-BLAST is used to uncover several new and interesting members of the BRCT superfamily.
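The position-specific score matrix at the heart of the iterated search can be illustrated with a deliberately simplified sketch. It assumes an ungapped alignment, a uniform background frequency, and a fixed pseudocount; the published program uses sequence weighting and more elaborate pseudocounts, none of which are modelled here. The names `build_pssm` and `score` and the toy alignment are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(alignment, pseudocount=1.0):
    """Build log-odds scores per column from equal-length aligned sequences."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("aligned sequences must have equal length")
    background = 1.0 / len(AMINO_ACIDS)  # assumption: uniform background
    total = len(alignment) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for column in zip(*alignment):
        counts = Counter(column)
        pssm.append({
            aa: math.log2((counts[aa] + pseudocount) / total / background)
            for aa in AMINO_ACIDS
        })
    return pssm


def score(pssm, segment):
    """Sum the column scores for a segment as long as the matrix."""
    if len(segment) != len(pssm):
        raise ValueError("segment length must match the matrix length")
    return sum(column[aa] for column, aa in zip(pssm, segment))


# Hypothetical three-column alignment of significant hits.
pssm = build_pssm(["ACD", "ACE", "ACD"])
best = score(pssm, "ACD")
close = score(pssm, "ACE")
unrelated = score(pssm, "WWW")
```

Residues seen often in a column score above zero and unseen residues score below zero, so a segment resembling the alignment outranks an unrelated one. In an iterated search, the matrix would be rebuilt from the hits of each round.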
- Guex N, Peitsch MC
- SWISS-MODEL and the Swiss-PdbViewer: an environment for comparative protein modeling.
- Electrophoresis. 1997; 18: 2714-23
Comparative protein modeling is increasingly gaining interest since it is of great assistance during the rational design of mutagenesis experiments. The availability of this method, and the resulting models, has however been restricted by the availability of expensive computer hardware and software. To overcome these limitations, we have developed an environment for comparative protein modeling that consists of SWISS-MODEL, a server for automated comparative protein modeling and of the SWISS-PdbViewer, a sequence to structure workbench. The Swiss-PdbViewer not only acts as a client for SWISS-MODEL, but also provides a large selection of structure analysis and display tools. In addition, we provide the SWISS-MODEL Repository, a database containing more than 3500 automatically generated protein models. By making such tools freely available to the scientific community, we hope to increase the use of protein structures and models in the process of experiment design.
- Evans JT, Leisy DJ, Rohrmann GF
- Characterization of the interaction between the baculovirus replication factors LEF-1 and LEF-2.
- J Virol. 1997; 71: 3114-9
The Autographa californica multinucleocapsid nuclear polyhedrosis virus has six genes required and three genes stimulatory for transient DNA replication. We demonstrate that the products of two of these genes, LEF-1 and LEF-2, interact in both two-hybrid assays using Saccharomyces cerevisiae and glutathione S-transferase fusion affinity assays. Using yeast-two-hybrid assays, we mapped the interaction domain of LEF-2 to amino acids between positions 20 and 60. Extensive deletion analyses of LEF-1 failed to reveal a delimited interaction domain, suggesting that there may be essential secondary structural elements that are inactivated by these deletions. All clones expressing LEF-1 and LEF-2 that were unable to interact also failed to support significant levels of transient DNA replication, suggesting that this interaction is required for DNA replication. Sequence analysis of LEF-1 revealed a primase-like motif, WVVDAD. When this motif was mutated to WVVQAD, LEF-1 no longer supported transient DNA replication.
- Shinohara M, Itoh T
- Specificity determinants in interaction of the initiator (Rep) proteins with the origins in the plasmids ColE2-P9 and ColE3-CA38 identified by chimera analysis.
- J Mol Biol. 1996; 257: 290-300
The ColE2-P9 Rep protein specifically binds to the origin and initiates DNA synthesis. Interaction of the Rep protein with the origins of plasmids ColE2-P9 and ColE3-CA38 (one of the close relatives of ColE2-P9) is plasmid-specific. By using chimeric rep genes and chimeric origins we showed that the two regions, A and B, in the C-terminal regions of the Rep proteins and the two sites, alpha and beta, in the origins are important for the determination of specificity. When each of the A/alpha and B/beta pairs is from the same plasmids, the plasmid replication is efficient. On the other hand, if only the A/alpha pair is from the same plasmids, the plasmid replication is inefficient. For the region A, the plasmid-specificity is mainly determined by the presence or absence of a nine-amino acid sequence. For the region B, the specificity is probably determined by several amino acids. The region B contains a segment of amino acid sequence which shows significant homology with the DNA recognition helices of various DNA binding proteins. At the site alpha, the single additional base-pair in the ColE3-CA38 origin can be either A/T or T/A. At the site beta, however, the single additional base-pair in the ColE2-P9 origin must be G/C. Among other possibilities we propose that the region A is a linker connecting the two domains in the Rep protein involved in DNA-binding and that the region B is a part of the sequence-specific DNA-binding domain.
- Holm L, Sander C
- The FSSP database: fold classification based on structure-structure alignment of proteins.
- Nucleic Acids Res. 1996; 24: 206-9
The FSSP database presents a continuously updated classification of 3-D protein folds based on an all-against-all comparison of structures currently in the Protein Data Bank (PDB) [Bernstein et al. (1977) J. Mol. Biol., 112, 535-542]. The database currently contains an extended structural family for each of 600 representative protein chains which have <25% mutual sequence identity. The results of the exhaustive pairwise structure comparisons are reported in the form of a fold tree generated by hierarchical clustering and as a series of structurally representative sets of folds at varying levels of uniqueness. For each query structure from the representative set, there is a database entry containing structure-structure alignments with its structural neighbours in the representative set and its sequence homologs in the PDB. All alignments are based purely on the 3-D co-ordinates of the proteins and are derived by an automatic structure comparison program (Dali). The FSSP database is accessible electronically on the World Wide Web and by anonymous ftp.
- Dracheva S, Koonin EV, Crute JJ
- Identification of the primase active site of the herpes simplex virus type 1 helicase-primase.
- J Biol Chem. 1995; 270: 14148-53
Herpes simplex virus type 1 (HSV-1) encodes a heterotrimeric helicase-primase composed of the products of the three DNA replication-specific genes UL5, UL8, and UL52 (Crute, J. J., and Lehman, I. R. (1991) J. Biol. Chem. 266, 4484-4488). The UL5 and UL52 products constitute a heterodimeric subassembly of the holoenzyme that contains both helicase and primase activities (Calder, J. M., and Stow, N. D. (1990) Nucleic Acids Res. 18, 3573-3578; Dodson, M. S., and Lehman, I. R. (1991) Proc. Natl. Acad. Sci. U. S. A. 88, 1105-1109). The role of the UL52 product in the active HSV-1 helicase-primase was examined. A sequence located between residues 610 and 636 on the UL52 protein was found to be conserved among the UL52 homologues of eight herpesviruses. The carboxyl-terminal portion of this conserved sequence consisted of two Asp residues separated by a variable hydrophobic amino acid residue and is analogous to the divalent metal-binding site of DNA polymerases and several DNA primases. This motif has been designated the herpesvirus primase DXD motif. To study the role of the HSV-1 primase DXD motif in primase action, three site-directed changes were introduced into the UL52 gene. The helicase activity of the recombinant holoenzymes was unaffected by any of the introduced changes. Changing either of the two Asp residues that constitute the divalent metal-binding site (Asp628 or Asp630) to Ala dramatically reduced the primase activity of the HSV-1 helicase-primase holoenzyme in vitro, whereas alteration of the nearby conserved residue Asn624 to Gly had minimal effect. Therefore, in the three-subunit HSV-1 helicase-primase, the UL52 product provides at least a part of the primase catalytic site.
- Klinedinst DK, Challberg MD
- Helicase-primase complex of herpes simplex virus type 1: a mutation in the UL52 subunit abolishes primase activity.
- J Virol. 1994; 68: 3693-701
The UL52 gene product of herpes simplex virus type 1 (HSV-1) comprises one subunit of a 3-protein helicase-primase complex that is essential for replication of viral DNA. The functions of the individual subunits of the complex are not known with certainty, although it is clear that the UL8 subunit is not required for either helicase or primase activity. Examination of the predicted amino acid sequence of the UL5 gene reveals the existence of conserved helicase motifs; it seems likely, therefore, that UL5 is responsible for the helicase activity of the complex. We have undertaken mutational analysis of UL52 in an attempt to understand the functional contribution of this protein to the helicase-primase complex. Amino acid substitution mutations were introduced into five regions of the UL52 gene that are highly conserved among HSV-1 and the related herpesviruses equine herpesvirus 1, human cytomegalovirus, Epstein-Barr virus, and varicella-zoster virus. Of seven mutants analyzed by an in vivo replication assay, three mutants, in three different conserved regions of the protein, failed to support DNA replication. Within one of the conserved regions is a 6-amino-acid motif (IL)(VIM)(LF)DhD (where h is a hydrophobic residue), which is also conserved in mouse, yeast, and T7 primases. Mutagenesis of the first aspartate residue of the motif, located at position 628 of the UL52 protein, abolished the ability of the complex to support replication of an origin-containing plasmid in vivo and to synthesize oligoribonucleotide primers in vitro. The ATPase and helicase activities were unaffected, as was the ability of the mutant enzyme to support displacement synthesis on a preformed fork substrate. These results provide experimental support for the idea that UL52 is responsible for the primase activity of the HSV helicase-primase complex.
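The 6-amino-acid motif (IL)(VIM)(LF)DhD quoted in the abstract above translates directly into a regular expression. The sketch below is illustrative only: the abstract does not list which residues count as hydrophobic, so the set in `HYDROPHOBIC` is an assumption, and the function name `find_motif` and the test sequence are hypothetical.

```python
import re

# Assumption: residues treated as hydrophobic for the 'h' position.
HYDROPHOBIC = "AVILMFWC"

# A lookahead is used so that overlapping occurrences are all reported.
PRIMASE_MOTIF = re.compile(rf"(?=([IL][VIM][LF]D[{HYDROPHOBIC}]D))")


def find_motif(sequence):
    """Return (0-based start, matched residues) for each motif occurrence."""
    return [
        (match.start(), match.group(1))
        for match in PRIMASE_MOTIF.finditer(sequence.upper())
    ]


# Hypothetical protein fragment containing one occurrence of the motif.
hits = find_motif("MKTILVFDLDGS")
```

A scan like this only flags candidate sites. Whether a match is functional has to be established experimentally, as the mutagenesis described in the abstract does for the first aspartate.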
- Sonntag KC, Schnitzler P, Koonin EV, Darai G
- Chilo iridescent virus encodes a putative helicase belonging to a distinct family within the "DEAD/H" superfamily: implications for the evolution of large DNA viruses.
- Virus Genes. 1994; 8: 151-8
The complete nucleotide sequence of the EcoRI DNA fragment M (7099 bp; 0.310-0.345 map units) of the genome of insect iridescent virus type 6--Chilo iridescent virus (CIV)--was determined. A 606 codon open reading frame located in this region encoded a protein (p69) related to a distinct family of putative DNA and/or RNA helicases belonging to the "DEAD/H" superfamily. Unique sequence signatures were derived that allowed selective retrieval of the putative helicases of the new family from amino acid sequence databases. The family includes yeast, Drosophila, mammalian, and bacterial proteins involved in transcription regulation and in repair of damaged DNA. It is hypothesized that p69 of CIV may be a DNA or RNA helicase possibly involved in viral transcription. A distant relationship was observed to exist between this family of helicases and another group of proteins that consists of putative helicases of poxviruses, African swine fever virus, and yeast mitochondrial plasmids. It is shown that p69 of CIV is much more closely related to cellular helicases than any of the other known viral helicases. Phylogenetic analysis suggested an independent origin for the p69 gene and the genes encoding other viral helicases.
- Koonin EV
- A common set of conserved motifs in a vast variety of putative nucleic acid-dependent ATPases including MCM proteins involved in the initiation of eukaryotic DNA replication.
- Nucleic Acids Res. 1993; 21: 2541-7
A new superfamily of (putative) DNA-dependent ATPases is described that includes the ATPase domains of prokaryotic NtrC-related transcription regulators, MCM proteins involved in the initiation of eukaryotic DNA replication, and a group of uncharacterized bacterial and chloroplast proteins. MCM proteins are shown to contain a modified form of the ATP-binding motif and are predicted to mediate ATP-dependent opening of double-stranded DNA in the replication origins. In a second line of investigation, it is demonstrated that the products of unidentified open reading frames from Marchantia mitochondria and from yeast, and a domain of a baculovirus protein involved in viral DNA replication are related to the superfamily III of DNA and RNA helicases that previously has been known to include only proteins of small viruses. Comparison of the multiple alignments showed that the proteins of the NtrC superfamily and the helicases of superfamily III share three related sequence motifs tightly packed in the ATPase domain that consists of 100-150 amino acid residues. A similar array of conserved motifs is found in the family of DnaA-related ATPases. It is hypothesized that the three large groups of nucleic acid-dependent ATPases have similar structure of the core ATPase domain and have evolved from a common ancestor.
- Ilyina TV, Gorbalenya AE, Koonin EV
- Organization and evolution of bacterial and bacteriophage primase-helicase systems.
- J Mol Evol. 1992; 34: 351-7
Amino acid sequences of primases and associated helicases involved in the DNA replication of eubacteria and bacteriophages T7, T3, T4, P4, and P22 were compared by computer-assisted methods. There are two types of such systems, the first one represented by distinct helicase and primase proteins (e.g., DnaB and DnaG proteins of Escherichia coli), and the second one by single polypeptides comprising both activities (gp4 of bacteriophages T7 and T3, and alpha protein of bacteriophage P4). Pronounced sequence similarity was revealed between approximately 250 amino acid residue N-terminal domains of stand-alone primases and the primase-helicase proteins of T7(T3) and P4. All these domains contain, close to their N-termini, a conserved Zn-finger pattern that may be implicated in template DNA recognition by the primases. In addition, they encompass five other conserved motifs some of which may be involved in substrate (NTP) binding. Significant similarity was also observed between the primase-associated helicases (DnaB, gp12 of P22 and gp41 of T4) and the C-terminal domain of T7(T3) gp4. On the other hand the C-terminal domain of P-alpha of P4 is related to another group of DNA and RNA helicases. Tentative phylogenetic trees generated for the primases and the associated helicases showed no grouping of the phage proteins, with the exception of the primase domains of bacteriophages T4 and P4. This may indicate a common origin for one-component primase-helicase systems. Two scenarios for the evolution of primase-helicase systems are discussed. (ABSTRACT TRUNCATED AT 250 WORDS)
- Salas M
- Protein-priming of DNA replication.
- Annu Rev Biochem. 1991; 60: 39-71
- Delarue M, Poch O, Tordo N, Moras D, Argos P
- An attempt to unify the structure of polymerases.
- Protein Eng. 1990; 3: 461-7
With the great availability of sequences from RNA- and DNA-dependent RNA and DNA polymerases, it has become possible to delineate a few highly conserved regions for various polymerase types. In this work a DNA polymerase sequence from bacteriophage SPO2 was found to be homologous to the polymerase domain of the Klenow fragment of polymerase I from Escherichia coli, which is known to be closely related to those from Staphylococcus pneumoniae, Thermus aquaticus and bacteriophages T7 and T5. The alignment of the SPO2 polymerase with the other five sequences considerably narrowed the conserved motifs in these proteins. Three of the motifs matched reasonably all the conserved motifs of another DNA polymerase type, characterized by human polymerase alpha. It is also possible to find these three motifs in monomeric DNA-dependent RNA polymerases and two of them in DNA polymerase beta and DNA terminal transferases. These latter two motifs also matched two of the four motifs recently identified in 84 RNA-dependent polymerases. From the known tertiary architecture of the Klenow fragment of E. coli pol I, a spatial arrangement can be implied for these motifs. In addition, numerous biochemical experiments suggesting a role for the motifs in a common function (dNTP binding) also support these inferences. This speculative hypothesis, attempting to unify polymerase structure at least locally, if not globally, under the pol I fold, should provide a useful model to direct mutagenesis experiments to probe template and substrate specificity in polymerases.
- Foiani M, Santocanale C, Plevani P, Lucchini G
- A single essential gene, PRI2, encodes the large subunit of DNA primase in Saccharomyces cerevisiae.
- Mol Cell Biol. 1989; 9: 3081-7
DNA primase activity of the yeast DNA polymerase-primase complex is related to two polypeptides, p58 and p48. The reciprocal role of these protein species has not yet been clarified, although both participate in formation of the active center of the enzyme. The gene encoding the p58 subunit has been cloned by screening of a lambda gt11 yeast genomic DNA library, using specific anti-p58 antiserum. Antibodies that inhibited DNA primase activity could be purified by lysates of Escherichia coli cells infected with a recombinant bacteriophage containing the entire gene, which we designate PRI2. The gene was found to be transcribed in a 1.7-kilobase mRNA whose level appeared to fluctuate during the mitotic cell cycle. Nucleotide sequence determination indicated that PRI2 encodes a 528-amino-acid polypeptide with a calculated molecular weight of 62,262. The gene is unique in the haploid yeast genome, and its product is essential for cell viability, as has been shown for other components of the yeast DNA polymerase-primase complex.