WorldWideScience

Sample records for eukaryotic structural genomics

  1. Comparative genomics and evolution of eukaryotic phospholipid biosynthesis

    Energy Technology Data Exchange (ETDEWEB)

    Lykidis, Athanasios

    2006-12-01

    Phospholipid biosynthetic enzymes produce diverse molecular structures and are often present in multiple forms encoded by different genes. This work utilizes comparative genomics and phylogenetics for exploring the distribution, structure and evolution of phospholipid biosynthetic genes and pathways in 26 eukaryotic genomes. Although the basic structure of the pathways was formed early in eukaryotic evolution, the emerging picture indicates that individual enzyme families followed unique evolutionary courses. For example, choline and ethanolamine kinases and cytidylyltransferases emerged in ancestral eukaryotes, whereas, multiple forms of the corresponding phosphatidyltransferases evolved mainly in a lineage specific manner. Furthermore, several unicellular eukaryotes maintain bacterial-type enzymes and reactions for the synthesis of phosphatidylglycerol and cardiolipin. Also, base-exchange phosphatidylserine synthases are widespread and ancestral enzymes. The multiplicity of phospholipid biosynthetic enzymes has been largely generated by gene expansion in a lineage specific manner. Thus, these observations suggest that phospholipid biosynthesis has been an actively evolving system. Finally, comparative genomic analysis indicates the existence of novel phosphatidyltransferases and provides a candidate for the uncharacterized eukaryotic phosphatidylglycerol phosphate phosphatase.

  2. Compositional patterns in the genomes of unicellular eukaryotes.

    Science.gov (United States)

    Costantini, Maria; Alvarez-Valin, Fernando; Costantini, Susan; Cammarano, Rosalia; Bernardi, Giorgio

    2013-11-05

    The genomes of multicellular eukaryotes are compartmentalized in mosaics of isochores, large and fairly homogeneous stretches of DNA that belong to a small number of families characterized by different average GC levels, by different gene concentration (that increase with GC), different chromatin structures, different replication timing in the cell cycle, and other different properties. A question raised by these basic results concerns how far back in evolution the compartmentalized organization of the eukaryotic genomes arose. In the present work we approached this problem by studying the compositional organization of the genomes from the unicellular eukaryotes for which full sequences are available, the sample used being representative. The average GC levels of the genomes from unicellular eukaryotes cover an extremely wide range (19%-60% GC) and the compositional patterns of individual genomes are extremely different but all genomes tested show a compositional compartmentalization. The average GC range of the genomes of unicellular eukaryotes is very broad (as broad as that of prokaryotes) and individual compositional patterns cover a very broad range from very narrow to very complex. Both features are not surprising for organisms that are very far from each other both in terms of phylogenetic distances and of environmental life conditions. Most importantly, all genomes tested, a representative sample of all supergroups of unicellular eukaryotes, are compositionally compartmentalized, a major difference with prokaryotes.
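    The compositional analysis described in this abstract rests on measuring GC levels along a sequence. As a minimal sketch only (the function names, the fixed non-overlapping windows and the toy sequence are assumptions for illustration, not the authors' method), windowed GC levels could be computed like this:

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C among unambiguous A/C/G/T bases (0.0 if none)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0


def windowed_gc(seq: str, window: int) -> list[float]:
    """GC fraction of consecutive, non-overlapping windows.

    A trailing partial window is dropped so every value covers the same length.
    """
    return [gc_fraction(seq[i:i + window])
            for i in range(0, len(seq) - window + 1, window)]


if __name__ == "__main__":
    # Hypothetical toy sequence: an AT-only, a GC-only and a mixed stretch.
    toy = "ATATATATAT" + "GCGCGCGCGC" + "ATGCATGCAT"
    print(windowed_gc(toy, 10))
```

    A genome whose windows cluster around several distinct GC levels would be called compositionally compartmentalized in the sense used above; real analyses use far larger windows than this toy example.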

  3. GenColors-based comparative genome databases for small eukaryotic genomes.

    Science.gov (United States)

    Felder, Marius; Romualdi, Alessandro; Petzold, Andreas; Platzer, Matthias; Sühnel, Jürgen; Glöckner, Gernot

    2013-01-01

    Many sequence data repositories can give a quick and easily accessible overview on genomes and their annotations. Less widespread is the possibility to compare related genomes with each other in a common database environment. We have previously described the GenColors database system (http://gencolors.fli-leibniz.de) and its applications to a number of bacterial genomes such as Borrelia, Legionella, Leptospira and Treponema. This system has an emphasis on genome comparison. It combines data from related genomes and provides the user with an extensive set of visualization and analysis tools. Eukaryote genomes are normally larger than prokaryote genomes and thus pose additional challenges for such a system. We have, therefore, adapted GenColors to also handle larger datasets of small eukaryotic genomes and to display eukaryotic gene structures. Further recent developments include whole genome views, genome list options and, for bacterial genome browsers, the display of horizontal gene transfer predictions. Two new GenColors-based databases for two fungal species (http://fgb.fli-leibniz.de) and for four social amoebas (http://sacgb.fli-leibniz.de) were set up. Both new resources open up a single entry point for related genomes for the amoebozoa and fungal research communities and other interested users. Comparative genomics approaches are greatly facilitated by these resources.

  4. Distinct gene number-genome size relationships for eukaryotes and non-eukaryotes: gene content estimation for dinoflagellate genomes.

    Directory of Open Access Journals (Sweden)

    Yubo Hou

    The ability to predict gene content is highly desirable for characterization of not-yet sequenced genomes like those of dinoflagellates. Using data from completely sequenced and annotated genomes from phylogenetically diverse lineages, we investigated the relationship between gene content and genome size using regression analyses. Distinct relationships between log10-transformed protein-coding gene number (Y') versus log10-transformed genome size (X', genome size in kbp) were found for eukaryotes and non-eukaryotes. Eukaryotes best fit a logarithmic model, Y' = ln(-46.200 + 22.678X'), whereas non-eukaryotes a linear model, Y' = 0.045 + 0.977X', both with high significance (p < 0.001, R² > 0.91). Total gene number shows similar trends in both groups to their respective protein coding regressions. The distinct correlations reflect lower and decreasing gene-coding percentages as genome size increases in eukaryotes (82%-1%) compared to higher and relatively stable percentages in prokaryotes and viruses (97%-47%). The eukaryotic regression models project that the smallest dinoflagellate genome (3x10^6 kbp) contains 38,188 protein-coding (40,086 total) genes and the largest (245x10^6 kbp) 87,688 protein-coding (92,013 total) genes, corresponding to 1.8% and 0.05% gene-coding percentages. These estimates do not likely represent extraordinarily high functional diversity of the encoded proteome but rather highly redundant genomes as evidenced by high gene copy numbers documented for various dinoflagellate species.
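    The two regression models quoted in this abstract can be evaluated directly. The sketch below is an illustration only: the function names are hypothetical, and because the coefficients are rounded as printed here, the outputs are approximate and need not reproduce the gene numbers projected in the abstract.

```python
import math


def eukaryote_log_gene_number(x_log_kbp: float) -> float:
    """Logarithmic model: Y' = ln(-46.200 + 22.678 X'), X' = log10(genome size in kbp)."""
    arg = -46.200 + 22.678 * x_log_kbp
    if arg <= 0:
        # The logarithm is undefined for very small genomes under this model.
        raise ValueError("genome size is outside the domain of the logarithmic model")
    return math.log(arg)


def non_eukaryote_log_gene_number(x_log_kbp: float) -> float:
    """Linear model: Y' = 0.045 + 0.977 X'."""
    return 0.045 + 0.977 * x_log_kbp


def predicted_gene_number(genome_kbp: float, eukaryote: bool = True) -> float:
    """Back-transform Y' (log10 of gene number) into a gene count."""
    x = math.log10(genome_kbp)
    y = eukaryote_log_gene_number(x) if eukaryote else non_eukaryote_log_gene_number(x)
    return 10 ** y


if __name__ == "__main__":
    for size_kbp in (3e6, 245e6):
        print(f"{size_kbp:.3g} kbp -> about {predicted_gene_number(size_kbp):,.0f} protein-coding genes")
```

    The guard on the logarithm's argument makes explicit that the eukaryotic model only applies above a minimum genome size (roughly X' > 2.04 with these coefficients).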

  5. The eukaryotic genome is structurally and functionally more like a social insect colony than a book.

    Science.gov (United States)

    Qiu, Guo-Hua; Yang, Xiaoyan; Zheng, Xintian; Huang, Cuiqin

    2017-11-01

    Traditionally, the genome has been described as the 'book of life'. However, the metaphor of a book may not reflect the dynamic nature of the structure and function of the genome. In the eukaryotic genome, the number of centrally located protein-coding sequences is relatively constant across species, but the amount of noncoding DNA increases considerably with the increase of organismal evolutional complexity. Therefore, it has been hypothesized that the abundant peripheral noncoding DNA protects the genome and the central protein-coding sequences in the eukaryotic genome. Upon comparison with the habitation, sociality and defense mechanisms of a social insect colony, it is found that the genome is similar to a social insect colony in various aspects. A social insect colony may thus be a better metaphor than a book to describe the spatial organization and physical functions of the genome. The potential implications of the metaphor are also discussed.

  6. Single Cell Genomics and Transcriptomics for Unicellular Eukaryotes

    Energy Technology Data Exchange (ETDEWEB)

    Ciobanu, Doina; Clum, Alicia; Singh, Vasanth; Salamov, Asaf; Han, James; Copeland, Alex; Grigoriev, Igor; James, Timothy; Singer, Steven; Woyke, Tanja; Malmstrom, Rex; Cheng, Jan-Fang

    2014-03-14

    Despite their small size, unicellular eukaryotes have complex genomes with a high degree of plasticity that allows them to adapt quickly to environmental changes. Unicellular eukaryotes live with prokaryotes and higher eukaryotes, frequently in symbiotic or parasitic niches. To this day their contribution to the dynamics of the environmental communities remains to be understood. Unfortunately, the vast majority of eukaryotic microorganisms are either uncultured or unculturable, making genome sequencing impossible using traditional approaches. We have developed an approach to isolate unicellular eukaryotes of interest from environmental samples, and to sequence and analyze their genomes and transcriptomes. We have tested our methods with six species: an uncharacterized protist from cellulose-enriched compost identified as Platyophrya, a close relative of P. vorax; the fungus Metschnikowia bicuspidata, a parasite of the water flea Daphnia; the mycoparasitic fungus Piptocephalis cylindrospora, a parasite of Cokeromyces and Mucor; Caulochytrium protosteloides, a parasite of Sordaria; Rozella allomycis, a parasite of the water mold Allomyces; and the microalga Chlamydomonas reinhardtii. Here, we present the four components of our approach: pre-sequencing methods, sequence analysis for single cell genome assembly, sequence analysis of single cell transcriptomes, and genome annotation. This technology has the potential to uncover the complexity of single cell eukaryotes and their role in the environmental samples.

  7. The Genome of Naegleria gruberi Illuminates Early Eukaryotic Versatility

    Energy Technology Data Exchange (ETDEWEB)

    Fritz-Laylin, Lillian K.; Prochnik, Simon E.; Ginger, Michael L.; Dacks, Joel; Carpenter, Meredith L.; Field, Mark C.; Kuo, Alan; Paredez, Alex; Chapman, Jarrod; Pham, Jonathan; Shu, Shengqiang; Neupane, Rochak; Cipriano, Michael; Mancuso, Joel; Tu, Hank; Salamov, Asaf; Lindquist, Erika; Shapiro, Harris; Lucas, Susan; Grigoriev, Igor V.; Cande, W. Zacheus; Fulton, Chandler; Rokhsar, Daniel S.; Dawson, Scott C.

    2010-03-01

    Genome sequences of diverse free-living protists are essential for understanding eukaryotic evolution and molecular and cell biology. The free-living amoeboflagellate Naegleria gruberi belongs to a varied and ubiquitous protist clade (Heterolobosea) that diverged from other eukaryotic lineages over a billion years ago. Analysis of the 15,727 protein-coding genes encoded by Naegleria's 41 Mb nuclear genome indicates a capacity for both aerobic respiration and anaerobic metabolism with concomitant hydrogen production, with fundamental implications for the evolution of organelle metabolism. The Naegleria genome facilitates substantially broader phylogenomic comparisons of free-living eukaryotes than previously possible, allowing us to identify thousands of genes likely present in the pan-eukaryotic ancestor, with 40% likely eukaryotic inventions. Moreover, we construct a comprehensive catalog of amoeboid-motility genes. The Naegleria genome, analyzed in the context of other protists, reveals a remarkably complex ancestral eukaryote with a rich repertoire of cytoskeletal, sexual, signaling, and metabolic modules.

  8. Genome-reconstruction for eukaryotes from complex natural microbial communities.

    Science.gov (United States)

    West, Patrick T; Probst, Alexander J; Grigoriev, Igor V; Thomas, Brian C; Banfield, Jillian F

    2018-04-01

    Microbial eukaryotes are integral components of natural microbial communities, and their inclusion is critical for many ecosystem studies, yet the majority of published metagenome analyses ignore eukaryotes. In order to include eukaryotes in environmental studies, we propose a method to recover eukaryotic genomes from complex metagenomic samples. A key step for genome recovery is separation of eukaryotic and prokaryotic fragments. We developed a k-mer-based strategy, EukRep, for eukaryotic sequence identification and applied it to environmental samples to show that it enables genome recovery, genome completeness evaluation, and prediction of metabolic potential. We used this approach to test the effect of addition of organic carbon on a geyser-associated microbial community and detected a substantial change of the community metabolism, with selection against almost all candidate phyla bacteria and archaea and for eukaryotes. Near complete genomes were reconstructed for three fungi placed within the Eurotiomycetes and an arthropod. While carbon fixation and sulfur oxidation were important functions in the geyser community prior to carbon addition, the organic carbon-impacted community showed enrichment for secreted proteases, secreted lipases, cellulose targeting CAZymes, and methanol oxidation. We demonstrate the broader utility of EukRep by reconstructing and evaluating relatively high-quality fungal, protist, and rotifer genomes from complex environmental samples. This approach opens the way for cultivation-independent analyses of whole microbial communities. © 2018 West et al.; Published by Cold Spring Harbor Laboratory Press.
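    A k-mer-based strategy generally starts by turning each sequence fragment into a vector of k-mer frequencies that a classifier can then separate. The sketch below shows only that feature-extraction step; it is not EukRep's implementation, and the function name and the default k are assumptions chosen for illustration.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq: str, k: int = 3) -> dict[str, float]:
    """Relative frequency of every possible A/C/G/T k-mer in a sequence.

    K-mers containing ambiguous bases (e.g. N) are skipped. All 4**k k-mers
    appear as keys so that vectors from different fragments are comparable.
    """
    seq = seq.upper()
    alphabet = set("ACGT")
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= alphabet:
            counts[kmer] += 1
    total = sum(counts.values())
    return {
        "".join(p): (counts["".join(p)] / total if total else 0.0)
        for p in product("ACGT", repeat=k)
    }


if __name__ == "__main__":
    freqs = kmer_frequencies("ACGTACGT", k=2)
    # Show only the dinucleotides that actually occur in the toy fragment.
    print({kmer: round(f, 3) for kmer, f in freqs.items() if f > 0})
```

    In a complete workflow these vectors would be passed to a trained classifier that labels each fragment as eukaryotic or prokaryotic; that step is omitted here because the abstract does not describe it.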

  9. Challenges in Whole-Genome Annotation of Pyrosequenced Eukaryotic Genomes

    Energy Technology Data Exchange (ETDEWEB)

    Kuo, Alan; Grigoriev, Igor

    2009-04-17

    Pyrosequencing technologies such as 454/Roche and Solexa/Illumina vastly lower the cost of nucleotide sequencing compared to the traditional Sanger method, and thus promise to greatly expand the number of sequenced eukaryotic genomes. However, the new technologies also bring new challenges such as shorter reads and new kinds and higher rates of sequencing errors, which complicate genome assembly and gene prediction. At JGI we are deploying 454 technology for the sequencing and assembly of ever-larger eukaryotic genomes. Here we describe our first whole-genome annotation of a purely 454-sequenced fungal genome that is larger than a yeast (>30 Mbp). The pezizomycotine (filamentous ascomycote) Aspergillus carbonarius belongs to the Aspergillus section Nigri species complex, members of which are significant as platforms for bioenergy and bioindustrial technology, as members of soil microbial communities and players in the global carbon cycle, and as agricultural toxigens. Application of a modified version of the standard JGI Annotation Pipeline has so far predicted ~10k genes. ~12% of these preliminary annotations suffer a potential frameshift error, which is somewhat higher than the ~9% rate in the Sanger-sequenced and conventionally assembled and annotated genome of fellow Aspergillus section Nigri member A. niger. Also, >90% of A. niger genes have potential homologs in the A. carbonarius preliminary annotation. We conclude, and with further annotation and comparative analysis expect to confirm, that 454 sequencing strategies provide a promising substrate for annotation of modestly sized eukaryotic genomes. We will also present results of annotation of a number of other pyrosequenced fungal genomes of bioenergy interest.

  10. Comparative Genomics of Eukaryotes.

    NARCIS (Netherlands)

    Noort, V. van

    2007-01-01

    This thesis focuses on developing comparative genomics methods in eukaryotes, with an emphasis on applications for gene function prediction and regulatory element detection. In the past, methods have been developed to predict functional associations between gene pairs in prokaryotes. The challenge

  11. EUPAN enables pan-genome studies of a large number of eukaryotic genomes.

    Science.gov (United States)

    Hu, Zhiqiang; Sun, Chen; Lu, Kuang-Chen; Chu, Xixia; Zhao, Yue; Lu, Jinyuan; Shi, Jianxin; Wei, Chaochun

    2017-08-01

    Pan-genome analyses are routinely carried out for bacteria to interpret the within-species gene presence/absence variations (PAVs). However, pan-genome analyses are rare for eukaryotes due to the large sizes and higher complexities of their genomes. Here we propose EUPAN, a eukaryotic pan-genome analysis toolkit, enabling automatic large-scale eukaryotic pan-genome analyses and detection of gene PAVs at a relatively low sequencing depth. In previous studies, we demonstrated the effectiveness and high accuracy of EUPAN in the pan-genome analysis of 453 rice genomes, in which we also revealed widespread gene PAVs among individual rice genomes. Moreover, EUPAN can be directly applied to the current re-sequencing projects primarily focusing on single nucleotide polymorphisms. EUPAN is implemented in Perl, R and C++. It is supported under Linux and preferred for a computer cluster with LSF and SLURM job scheduling systems. EUPAN together with its standard operating procedure (SOP) is freely available for non-commercial use (CC BY-NC 4.0) at http://cgm.sjtu.edu.cn/eupan/index.html. Contact: ccwei@sjtu.edu.cn or jianxin.shi@sjtu.edu.cn. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press.

  12. Origin and evolution of SINEs in eukaryotic genomes.

    Science.gov (United States)

    Kramerov, D A; Vassetzky, N S

    2011-12-01

    Short interspersed elements (SINEs) are one of the two most prolific mobile genomic elements in most of the higher eukaryotes. Although their biology is still not thoroughly understood, the unusual life cycle of these simple elements, amplified as genomic parasites, makes their evolution unique in many ways. In contrast to most genetic elements including other transposons, SINEs emerged de novo many times in evolution from available molecules (for example, tRNA). The involvement of reverse transcription in their amplification cycle, the huge number of genomic copies and the modular structure allow variation mechanisms in SINEs that are uncommon or rare in other genetic elements (module exchange between SINE families, dimerization, and so on). Overall, SINE evolution includes their emergence, progressive optimization and counteraction to the cell's defense against mobile genetic elements.

  13. GFFview: A Web Server for Parsing and Visualizing Annotation Information of Eukaryotic Genome.

    Science.gov (United States)

    Deng, Feilong; Chen, Shi-Yi; Wu, Zhou-Lin; Hu, Yongsong; Jia, Xianbo; Lai, Song-Jia

    2017-10-01

    Owing to the wide application of RNA sequencing (RNA-seq) technology, more and more eukaryotic genomes have been extensively annotated for features such as gene structure, alternative splicing, and noncoding loci. Genome annotation information is commonly stored as plain text in General Feature Format (GFF), and such files can be hundreds or thousands of Mb in size. Manipulating GFF files is therefore a challenge for biologists without bioinformatics skills. In this study, we provide a web server (GFFview) for parsing the annotation information of a eukaryotic genome and then generating a statistical description of six indices for visualization. GFFview is very useful for investigating the quality and differences of de novo assembled transcriptomes in RNA-seq studies.
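    To illustrate what parsing GFF involves, here is a minimal sketch that summarizes a GFF3-style file by feature type. It assumes the standard nine tab-separated columns; the function name and the two statistics are illustrative choices and are not the six indices reported by GFFview, which the abstract does not list.

```python
from collections import Counter


def summarize_gff(lines):
    """Count features and sum their lengths per feature type.

    Expects GFF3-style records: seqid, source, type, start, end, score,
    strand, phase, attributes (tab-separated, 1-based inclusive coordinates).
    Comment lines, blank lines and malformed records are skipped.
    """
    counts = Counter()
    lengths = Counter()
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9:
            continue
        feature_type = cols[2]
        start, end = int(cols[3]), int(cols[4])
        counts[feature_type] += 1
        lengths[feature_type] += end - start + 1
    return counts, lengths


if __name__ == "__main__":
    # Hypothetical records for a single two-exon gene.
    example = [
        "##gff-version 3",
        "chr1\tdemo\tgene\t1\t1000\t.\t+\t.\tID=gene1",
        "chr1\tdemo\texon\t1\t200\t.\t+\t.\tParent=gene1",
        "chr1\tdemo\texon\t801\t1000\t.\t+\t.\tParent=gene1",
    ]
    counts, lengths = summarize_gff(example)
    for feature_type in counts:
        print(feature_type, counts[feature_type], lengths[feature_type])
```

    Because the function accepts any iterable of lines, an open file handle can be passed directly, so even very large annotation files are read one line at a time.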

  14. Long-Range Order and Fractality in the Structure and Organization of Eukaryotic Genomes

    Science.gov (United States)

    Polychronopoulos, Dimitris; Tsiagkas, Giannis; Athanasopoulou, Labrini; Sellis, Diamantis; Almirantis, Yannis

    2014-12-01

    The late Professor J.S. Nicolis always emphasized, both in his writings and in presentations and discussions with students and friends, the relevance of a dynamical systems approach to biology. In particular, viewing the genome as a "biological text" captures the dynamical character of both the evolution and function of the organisms in the form of correlations indicating the presence of a long-range order. This genomic structure can be expressed in forms reminiscent of natural languages and several temporal and spatial traces left by the functioning of dynamical systems: Zipf laws, self-similarity and fractality. Here we review several works of our group and recent unpublished results, focusing on the chromosomal distribution of biologically active genomic components: genes and protein-coding segments, CpG islands, transposable elements belonging to all major classes and several types of conserved non-coding genomic elements. We report the systematic appearance of power-laws in the size distribution of the distances between elements belonging to each of these types of functional genomic elements. Moreover, fractality is also found in several cases, using box-counting and entropic scaling. We present here, for the first time in a unified way, an aggregative model of the genomic dynamics which can explain the observed patterns on the grounds of known phenomena accompanying genome evolution. Our results comply with recent findings about a "fractal globule" geometry of chromatin in the eukaryotic nucleus.

  15. Arabidopsis transcription factors: genome-wide comparative analysis among eukaryotes.

    Science.gov (United States)

    Riechmann, J L; Heard, J; Martin, G; Reuber, L; Jiang, C; Keddie, J; Adam, L; Pineda, O; Ratcliffe, O J; Samaha, R R; Creelman, R; Pilgrim, M; Broun, P; Zhang, J Z; Ghandehari, D; Sherman, B K; Yu, G

    2000-12-15

    The completion of the Arabidopsis thaliana genome sequence allows a comparative analysis of transcriptional regulators across the three eukaryotic kingdoms. Arabidopsis dedicates over 5% of its genome to code for more than 1500 transcription factors, about 45% of which are from families specific to plants. Arabidopsis transcription factors that belong to families common to all eukaryotes do not share significant similarity with those of the other kingdoms beyond the conserved DNA binding domains, many of which have been arranged in combinations specific to each lineage. The genome-wide comparison reveals the evolutionary generation of diversity in the regulation of transcription.

  16. Evidence of pervasive biologically functional secondary structures within the genomes of eukaryotic single-stranded DNA viruses.

    Science.gov (United States)

    Muhire, Brejnev Muhizi; Golden, Michael; Murrell, Ben; Lefeuvre, Pierre; Lett, Jean-Michel; Gray, Alistair; Poon, Art Y F; Ngandu, Nobubelo Kwanele; Semegni, Yves; Tanov, Emil Pavlov; Monjane, Adérito Luis; Harkins, Gordon William; Varsani, Arvind; Shepherd, Dionne Natalie; Martin, Darren Patrick

    2014-02-01

    Single-stranded DNA (ssDNA) viruses have genomes that are potentially capable of forming complex secondary structures through Watson-Crick base pairing between their constituent nucleotides. A few of the structural elements formed by such base pairings are, in fact, known to have important functions during the replication of many ssDNA viruses. Unknown, however, are (i) whether numerous additional ssDNA virus genomic structural elements predicted to exist by computational DNA folding methods actually exist and (ii) whether those structures that do exist have any biological relevance. We therefore computationally inferred lists of the most evolutionarily conserved structures within a diverse selection of animal- and plant-infecting ssDNA viruses drawn from the families Circoviridae, Anelloviridae, Parvoviridae, Nanoviridae, and Geminiviridae and analyzed these for evidence of natural selection favoring the maintenance of these structures. While we find evidence that is consistent with purifying selection being stronger at nucleotide sites that are predicted to be base paired than at sites predicted to be unpaired, we also find strong associations between sites that are predicted to pair with one another and site pairs that are apparently coevolving in a complementary fashion. Collectively, these results indicate that natural selection actively preserves much of the pervasive secondary structure that is evident within eukaryote-infecting ssDNA virus genomes and, therefore, that much of this structure is biologically functional. Lastly, we provide examples of various highly conserved but completely uncharacterized structural elements that likely have important functions within some of the ssDNA virus genomes analyzed here.

  17. diArk – a resource for eukaryotic genome research

    Directory of Open Access Journals (Sweden)

    Kollmar Martin

    2007-04-01

    Background: The number of completed eukaryotic genome sequences and cDNA projects has increased exponentially in the past few years although most of them have not been published yet. In addition, many microarray analyses yielded thousands of sequenced EST and cDNA clones. For the researcher interested in single gene analyses (from a phylogenetic, a structural biology or other perspective) it is therefore important to have up-to-date knowledge about the various resources providing primary data. Description: The database is built around 3 central tables: species, sequencing projects and publications. The species table contains commonly and alternatively used scientific names, common names and the complete taxonomic information. For projects the sequence type and links to species project web-sites and species homepages are stored. All publications are linked to projects. The web-interface provides comprehensive search modules with detailed options and three different views of the selected data. We have especially focused on developing an elaborate taxonomic tree search tool that allows the user to instantaneously identify e.g. the closest relative to the organism of interest. Conclusion: We have developed a database, called diArk, to store, organize, and present the most relevant information about completed genome projects and EST/cDNA data from eukaryotes. Currently, diArk provides information about 415 eukaryotes, 823 sequencing projects, and 248 publications.

  18. Automated Eukaryotic Gene Structure Annotation Using EVidenceModeler and the Program to Assemble Spliced Alignments

    Energy Technology Data Exchange (ETDEWEB)

    Haas, B J; Salzberg, S L; Zhu, W; Pertea, M; Allen, J E; Orvis, J; White, O; Buell, C R; Wortman, J R

    2007-12-10

    EVidenceModeler (EVM) is presented as an automated eukaryotic gene structure annotation tool that reports eukaryotic gene structures as a weighted consensus of all available evidence. EVM, when combined with the Program to Assemble Spliced Alignments (PASA), yields a comprehensive, configurable annotation system that predicts protein-coding genes and alternatively spliced isoforms. Our experiments on both rice and human genome sequences demonstrate that EVM produces automated gene structure annotation approaching the quality of manual curation.

  19. Introns Protect Eukaryotic Genomes from Transcription-Associated Genetic Instability.

    Science.gov (United States)

    Bonnet, Amandine; Grosso, Ana R; Elkaoutari, Abdessamad; Coleno, Emeline; Presle, Adrien; Sridhara, Sreerama C; Janbon, Guilhem; Géli, Vincent; de Almeida, Sérgio F; Palancade, Benoit

    2017-08-17

    Transcription is a source of genetic instability that can notably result from the formation of genotoxic DNA:RNA hybrids, or R-loops, between the nascent mRNA and its template. Here we report an unexpected function for introns in counteracting R-loop accumulation in eukaryotic genomes. Deletion of endogenous introns increases R-loop formation, while insertion of an intron into an intronless gene suppresses R-loop accumulation and its deleterious impact on transcription and recombination in yeast. Recruitment of the spliceosome onto the mRNA, but not splicing per se, is shown to be critical to attenuate R-loop formation and transcription-associated genetic instability. Genome-wide analyses in a number of distant species differing in their intron content, including human, further revealed that intron-containing genes and the intron-richest genomes are best protected against R-loop accumulation and subsequent genetic instability. Our results thereby provide a possible rationale for the conservation of introns throughout the eukaryotic lineage. Copyright © 2017 Elsevier Inc. All rights reserved.

  20. EuMicroSatdb: A database for microsatellites in the sequenced genomes of eukaryotes

    Directory of Open Access Journals (Sweden)

    Grover Atul

    2007-07-01

    Background: Microsatellites have immense utility as molecular markers in different fields like genome characterization and mapping, phylogeny and evolutionary biology. Existing microsatellite databases are of limited utility for experimental and computational biologists with regard to their content and information output. EuMicroSatdb (Eukaryotic MicroSatellite database, http://ipu.ac.in/usbt/EuMicroSatdb.htm) is a web based relational database for easy and efficient positional mining of microsatellites from sequenced eukaryotic genomes. Description: A user friendly web interface has been developed for microsatellite data retrieval using Active Server Pages (ASP). The backend database codes for data extraction and assembly have been written using Perl based scripts and C++. Precise need based microsatellites data retrieval is possible using different input parameters like microsatellite type (simple perfect or compound perfect), repeat unit length (mono- to hexa-nucleotide), repeat number, microsatellite length and chromosomal location in the genome. Furthermore, information about clustering of different microsatellites in the genome can also be retrieved. Finally, to facilitate primer designing for PCR amplification of any desired microsatellite locus, 200 bp upstream and downstream sequences are provided. Conclusion: The database allows easy systematic retrieval of comprehensive information about simple and compound microsatellites, microsatellite clusters and their locus coordinates in 31 sequenced eukaryotic genomes. The information content of the database is useful in different areas of research like gene tagging, genome mapping, population genetics, germplasm characterization and in understanding microsatellite dynamics in eukaryotic genomes.
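    The kind of positional mining described in this abstract can be sketched with a regular expression that finds simple perfect microsatellites (repeat units of one to six nucleotides) and returns their flanking sequence. This is an illustration under stated assumptions, not the database's Perl/C++ backend: the function name, the minimum repeat number and the output fields are hypothetical, and compound microsatellites are not handled.

```python
import re


def find_microsatellites(seq: str, min_repeats: int = 4, max_unit: int = 6, flank: int = 200):
    """Locate simple perfect microsatellites and their flanking sequence.

    The lazy quantifier makes the regex try the shortest repeat unit first,
    so a run of AT repeats is reported with unit AT rather than ATAT.
    Coordinates are 1-based and inclusive.
    """
    seq = seq.upper()
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq):
        unit = match.group(1)
        start, end = match.start(), match.end()
        hits.append({
            "unit": unit,
            "repeats": (end - start) // len(unit),
            "start": start + 1,
            "end": end,
            "upstream": seq[max(0, start - flank):start],
            "downstream": seq[end:end + flank],
        })
    return hits


if __name__ == "__main__":
    # Hypothetical toy sequence with one (AT)5 repeat.
    for hit in find_microsatellites("GGCC" + "AT" * 5 + "CGTA"):
        print(hit)
```

    The flanking slices are simply truncated at the sequence ends, so loci near a contig boundary return fewer than 200 bp on that side.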

  1. Mapping and characterizing N6-methyladenine in eukaryotic genomes using single molecule real-time sequencing.

    Science.gov (United States)

    Zhu, Shijia; Beaulaurier, John; Deikus, Gintaras; Wu, Tao; Strahl, Maya; Hao, Ziyang; Luo, Guanzheng; Gregory, James A; Chess, Andrew; He, Chuan; Xiao, Andrew; Sebra, Robert; Schadt, Eric E; Fang, Gang

    2018-05-15

    N6-methyladenine (m6dA) has been discovered as a novel form of DNA methylation prevalent in eukaryotes; however, methods for high resolution mapping of m6dA events are still lacking. Single-molecule real-time (SMRT) sequencing has enabled the detection of m6dA events at single-nucleotide resolution in prokaryotic genomes, but its application to detecting m6dA in eukaryotic genomes has not been rigorously examined. Herein, we identified unique characteristics of eukaryotic m6dA methylomes that fundamentally differ from those of prokaryotes. Based on these differences, we describe the first approach for mapping m6dA events using SMRT sequencing specifically designed for the study of eukaryotic genomes, and provide appropriate strategies for designing experiments and carrying out sequencing in future studies. We apply the novel approach to study two eukaryotic genomes. For green algae, we construct the first complete genome-wide map of m6dA at single nucleotide and single molecule resolution. For human lymphoblastoid cells (hLCLs), joint analyses of SMRT sequencing and independent sequencing data suggest that putative m6dA events are enriched in the promoters of young, full length LINE-1 elements (L1s). These analyses demonstrate a general method for rigorous mapping and characterization of m6dA events in eukaryotic genomes. Published by Cold Spring Harbor Laboratory Press.

  2. Genomic impact of eukaryotic transposable elements.

    Science.gov (United States)

    Arkhipova, Irina R; Batzer, Mark A; Brosius, Juergen; Feschotte, Cédric; Moran, John V; Schmitz, Jürgen; Jurka, Jerzy

    2012-11-21

    The third international conference on the genomic impact of eukaryotic transposable elements (TEs) was held 24 to 28 February 2012 at the Asilomar Conference Center, Pacific Grove, CA, USA. Sponsored in part by the National Institutes of Health grant 5 P41 LM006252, the goal of the conference was to bring together researchers from around the world who study the impact and mechanisms of TEs using multiple computational and experimental approaches. The meeting drew close to 170 attendees and included invited floor presentations on the biology of TEs and their genomic impact, as well as numerous talks contributed by young scientists. The workshop talks were devoted to computational analysis of TEs with additional time for discussion of unresolved issues. Also, there was ample opportunity for poster presentations and informal evening discussions. The success of the meeting reflects the important role of Repbase in comparative genomic studies, and emphasizes the need for close interactions between experimental and computational biologists in the years to come.

  3. Genomic impact of eukaryotic transposable elements

    Directory of Open Access Journals (Sweden)

    Arkhipova Irina R

    2012-11-01

    The third international conference on the genomic impact of eukaryotic transposable elements (TEs) was held 24 to 28 February 2012 at the Asilomar Conference Center, Pacific Grove, CA, USA. Sponsored in part by the National Institutes of Health grant 5 P41 LM006252, the goal of the conference was to bring together researchers from around the world who study the impact and mechanisms of TEs using multiple computational and experimental approaches. The meeting drew close to 170 attendees and included invited floor presentations on the biology of TEs and their genomic impact, as well as numerous talks contributed by young scientists. The workshop talks were devoted to computational analysis of TEs with additional time for discussion of unresolved issues. Also, there was ample opportunity for poster presentations and informal evening discussions. The success of the meeting reflects the important role of Repbase in comparative genomic studies, and emphasizes the need for close interactions between experimental and computational biologists in the years to come.

  4. Genome-wide mapping reveals single-origin chromosome replication in Leishmania, a eukaryotic microbe.

    Science.gov (United States)

    Marques, Catarina A; Dickens, Nicholas J; Paape, Daniel; Campbell, Samantha J; McCulloch, Richard

    2015-10-19

    DNA replication initiates on defined genome sites, termed origins. Origin usage appears to follow common rules in the eukaryotic organisms examined to date: all chromosomes are replicated from multiple origins, which display variations in firing efficiency and are selected from a larger pool of potential origins. To ask if these features of DNA replication are true of all eukaryotes, we describe genome-wide origin mapping in the parasite Leishmania. Origin mapping in Leishmania suggests a striking divergence in origin usage relative to characterized eukaryotes, since each chromosome appears to be replicated from a single origin. By comparing two species of Leishmania, we find evidence that such origin singularity is maintained in the face of chromosome fusion or fission events during evolution. Mapping Leishmania origins suggests that all origins fire with equal efficiency, and that the genomic sites occupied by origins differ from related non-origins sites. Finally, we provide evidence that origin location in Leishmania displays striking conservation with Trypanosoma brucei, despite the latter parasite replicating its chromosomes from multiple, variable strength origins. The demonstration of chromosome replication for a single origin in Leishmania, a microbial eukaryote, has implications for the evolution of origin multiplicity and associated controls, and may explain the pervasive aneuploidy that characterizes Leishmania chromosome architecture.

  5. Susceptibilities to DNA Structural Transitions within Eukaryotic Genomes

    Science.gov (United States)

    Zhabinskaya, Dina; Benham, Craig; Madden, Sally

    2012-02-01

    We analyze the competitive transitions to alternate secondary DNA structures in a negatively supercoiled DNA molecule of kilobase length and specified base sequence. We use statistical mechanics to calculate the competition among all regions within the sequence that are susceptible to transitions to alternate structures. We use an approximate numerical method since the calculation of an exact partition function is numerically cumbersome for DNA molecules longer than hundreds of base pairs. This method yields accurate results in reasonable computational times. We implement algorithms that calculate the competition between transitions to denatured states and to Z-form DNA. We analyze these transitions near the transcription start sites (TSS) of a set of eukaryotic genes. We find an enhancement of Z-forming regions upstream of the TSS and a depletion of denatured regions around the start sites. We confirm that these findings are statistically significant by comparing our results to a set of randomized genes with preserved base composition at each position relative to the gene start sites. When we study the correlation of these transitions in orthologous mouse and human genes, we find a clear evolutionary conservation of both types of transitions around the TSS.

  6. Genome-wide analysis of eukaryote thaumatin-like proteins (TLPs) with an emphasis on poplar

    Directory of Open Access Journals (Sweden)

    Duplessis Sébastien

    2011-02-01

    Background: Plant inducible immunity includes the accumulation of a set of defense proteins during infection called pathogenesis-related (PR) proteins, which are grouped into families termed PR-1 to PR-17. The PR-5 family is composed of thaumatin-like proteins (TLPs), which are responsive to biotic and abiotic stress and are widely studied in plants. TLPs were also recently discovered in fungi and animals. In the poplar genome, TLPs are over-represented compared with annual species and their transcripts strongly accumulate during stress conditions. Results: Our analysis of the poplar TLP family suggests that the expansion of this gene family was followed by diversification, as differences in expression patterns and predicted properties correlate with phylogeny. In particular, we identified a clade of poplar TLPs that cluster to a single 350 kb locus of chromosome I and that are up-regulated by poplar leaf rust infection. A wider phylogenetic analysis of eukaryote TLPs, including plant, animal and fungal sequences, shows that TLP gene content and diversity increased markedly during land plant evolution. Mapping the reported functions of characterized TLPs to the eukaryote phylogenetic tree showed that antifungal or glycan-lytic properties are widespread across eukaryote phylogeny, suggesting that these properties are shared by most TLPs and are likely associated with the presence of a conserved acidic cleft in their 3D structure. Also, we established an exhaustive catalog of TLPs with atypical architectures such as small-TLPs, TLP-kinases and small-TLP-kinases, which have potentially developed alternative functions (such as putative receptor kinases for pathogen sensing and signaling). Conclusion: Our study, based on the most recent plant genome sequences, provides evidence for TLP gene family diversification during land plant evolution. We have shown that the diverse functions described for TLPs are not restricted to specific clades but seem

  7. The COG database: an updated version includes eukaryotes

    Directory of Open Access Journals (Sweden)

    Sverdlov Alexander V

    2003-09-01

    Background: The availability of multiple, essentially complete genome sequences of prokaryotes and eukaryotes spurred both the demand and the opportunity for the construction of an evolutionary classification of genes from these genomes. Such a classification system based on orthologous relationships between genes appears to be a natural framework for comparative genomics and should facilitate both functional annotation of genomes and large-scale evolutionary studies. Results: We describe here a major update of the previously developed system for delineation of Clusters of Orthologous Groups of proteins (COGs) from the sequenced genomes of prokaryotes and unicellular eukaryotes and the construction of clusters of predicted orthologs for 7 eukaryotic genomes, which we named KOGs after eukaryotic orthologous groups. The COG collection currently consists of 138,458 proteins, which form 4873 COGs and comprise 75% of the 185,505 (predicted) proteins encoded in 66 genomes of unicellular organisms. The eukaryotic orthologous groups (KOGs) include proteins from 7 eukaryotic genomes: three animals (the nematode Caenorhabditis elegans, the fruit fly Drosophila melanogaster and Homo sapiens), one plant, Arabidopsis thaliana, two fungi (Saccharomyces cerevisiae and Schizosaccharomyces pombe), and the intracellular microsporidian parasite Encephalitozoon cuniculi. The current KOG set consists of 4852 clusters of orthologs, which include 59,838 proteins, or ~54% of the 110,655 analyzed eukaryotic gene products. Compared to the coverage of the prokaryotic genomes with COGs, a considerably smaller fraction of eukaryotic genes could be included in the KOGs; addition of new eukaryotic genomes is expected to result in a substantial increase in the coverage of eukaryotic genomes with KOGs. Examination of the phyletic patterns of KOGs reveals a conserved core represented in all analyzed species and consisting of ~20% of the KOG set. This conserved portion of the

  8. An SVD-based comparison of nine whole eukaryotic genomes supports a coelomate rather than ecdysozoan lineage

    Directory of Open Access Journals (Sweden)

    Stuart Gary W

    2004-12-01

    Background: Eukaryotic whole genome sequences are accumulating at an impressive rate. Effective methods for comparing multiple whole eukaryotic genomes on a large scale are needed. Most attempted solutions involve the production of large-scale alignments, and many of these require a high-stringency pre-screen for putative orthologs in order to reduce the effective size of the dataset and provide a reasonably high but unknown fraction of correctly aligned homologous sites for comparison. As an alternative, highly efficient methods that do not require the pre-alignment of operationally defined orthologs are also being explored. Results: A non-alignment method based on the Singular Value Decomposition (SVD) was used to compare the predicted protein complement of nine whole eukaryotic genomes ranging from yeast to man. This analysis resulted in the simultaneous identification and definition of a large number of well conserved motifs and gene families, and produced a species tree supporting one of two conflicting hypotheses of metazoan relationships. Conclusions: Our SVD-based analysis of the entire protein complement of nine whole eukaryotic genomes suggests that highly conserved motifs and gene families can be identified and effectively compared in a single coherent definition space for the easy extraction of gene and species trees. While this occurs without the explicit definition of orthologs or homologous sites, the analysis can provide a basis for these definitions.

  9. Meeting Report: Minutes from EMBO: Ten Years of Comparative Genomics of Eukaryotic Microorganisms

    Czech Academy of Sciences Publication Activity Database

    Lukeš, Julius; López-García, P.; Louis, E.; Boekhout, T.

    2016-01-01

    Roč. 167, č. 3 (2016), s. 217-221 ISSN 1434-4610 Institutional support: RVO:60077344 Keywords : protist * eukaryotic microorganisms * genomics Subject RIV: EB - Genetics ; Molecular Biology Impact factor: 2.794, year: 2016

  10. Implications of structural genomics target selection strategies: Pfam5000, whole genome, and random approaches

    Energy Technology Data Exchange (ETDEWEB)

    Chandonia, John-Marc; Brenner, Steven E.

    2004-07-14

    The structural genomics project is an international effort to determine the three-dimensional shapes of all important biological macromolecules, with a primary focus on proteins. Target proteins should be selected according to a strategy which is medically and biologically relevant, of good value, and tractable. As an option to consider, we present the Pfam5000 strategy, which involves selecting the 5000 most important families from the Pfam database as sources for targets. We compare the Pfam5000 strategy to several other proposed strategies that would require similar numbers of targets. These include complete solution of several small to moderately sized bacterial proteomes, partial coverage of the human proteome, and random selection of approximately 5000 targets from sequenced genomes. We measure the impact that successful implementation of these strategies would have upon structural interpretation of the proteins in Swiss-Prot, TrEMBL, and 131 complete proteomes (including 10 of eukaryotes) from the Proteome Analysis database at EBI. Solving the structures of proteins from the 5000 largest Pfam families would allow accurate fold assignment for approximately 68 percent of all prokaryotic proteins (covering 59 percent of residues) and 61 percent of eukaryotic proteins (40 percent of residues). More fine-grained coverage, which would allow accurate modeling of these proteins, would require an order of magnitude more targets. The Pfam5000 strategy may be modified in several ways, for example to focus on larger families, bacterial sequences, or eukaryotic sequences; as long as secondary consideration is given to large families within Pfam, coverage results vary only slightly. In contrast, focusing structural genomics on a single tractable genome would have only a limited impact on structural knowledge of other proteomes: a significant fraction (about 30-40 percent of the proteins, and 40-60 percent of the residues) of each proteome is classified in small

  11. Yeast 2.0-connecting the dots in the construction of the world's first functional synthetic eukaryotic genome.

    Science.gov (United States)

    Pretorius, I S; Boeke, J D

    2018-06-01

    biosafety, bioethics and regulatory aspects of their pioneering work. This article presents a shared vision of constructing a synthetic eukaryotic genome in a safe model organism by using novel concepts and advanced technologies. This multidisciplinary and collaborative project is conducted under a sound governance structure that does not only respect the scientific achievements and lessons from the past, but that is also focussed on leading the present and helping to secure a brighter future for all.

  12. Visualization of genome signatures of eukaryote genomes by batch-learning self-organizing map with a special emphasis on Drosophila genomes.

    Science.gov (United States)

    Abe, Takashi; Hamano, Yuta; Ikemura, Toshimichi

    2014-01-01

    A strategy of evolutionary studies that can compare vast numbers of genome sequences is becoming increasingly important with the remarkable progress of high-throughput DNA sequencing methods. We previously established a sequence alignment-free clustering method "BLSOM" for di-, tri-, and tetranucleotide compositions in genome sequences, which can characterize sequence characteristics (genome signatures) of a wide range of species. In the present study, we generated BLSOMs for tetra- and pentanucleotide compositions in approximately one million sequence fragments derived from 101 eukaryotes, for which almost complete genome sequences were available. BLSOM recognized phylotype-specific characteristics (e.g., key combinations of oligonucleotide frequencies) in the genome sequences, permitting phylotype-specific clustering of the sequences without any information regarding the species. In our detailed examination of 12 Drosophila species, the correlation between their phylogenetic classification and the classification on the BLSOMs was observed to visualize oligonucleotides diagnostic for species-specific clustering.
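
    The input to an alignment-free method of this kind is an oligonucleotide composition vector per sequence fragment. The sketch below shows only that first step for tetranucleotides; it does not implement BLSOM or any self-organizing map, and the function name and the choice to skip windows containing non-ACGT symbols are assumptions made for illustration.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(fragment, k=4):
    """Return relative frequencies of all 4**k oligonucleotides in a fragment.

    Hypothetical helper: windows containing symbols other than A, C, G, T
    (for example N) are left out of the total.
    """
    fragment = fragment.upper()
    # Count every overlapping window of length k.
    counts = Counter(fragment[i:i + k] for i in range(len(fragment) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[kmer] for kmer in kmers)
    if total == 0:
        return {kmer: 0.0 for kmer in kmers}
    return {kmer: counts[kmer] / total for kmer in kmers}
```

    Each fragment then becomes a 256-dimensional vector (1024-dimensional for pentanucleotides) that a clustering method can consume without any species labels.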

  13. The Persistent Contributions of RNA to Eukaryotic Gen(om)e Architecture and Cellular Function

    Science.gov (United States)

    Brosius, Jürgen

    2014-01-01

    Currently, the best scenario for earliest forms of life is based on RNA molecules as they have the proven ability to catalyze enzymatic reactions and harbor genetic information. Evolutionary principles valid today become apparent in such models already. Furthermore, many features of eukaryotic genome architecture might have their origins in an RNA or RNA/protein (RNP) world, including the onset of a further transition, when DNA replaced RNA as the genetic bookkeeper of the cell. Chromosome maintenance, splicing, and regulatory function via RNA may be deeply rooted in the RNA/RNP worlds. Mostly in eukaryotes, conversion from RNA to DNA is still ongoing, which greatly impacts the plasticity of extant genomes. Raw material for novel genes encoding protein or RNA, or parts of genes including regulatory elements that selection can act on, continues to enter the evolutionary lottery. PMID:25081515

  14. Patterns of prokaryotic lateral gene transfers affecting parasitic microbial eukaryotes

    DEFF Research Database (Denmark)

    Alsmark, Cecilia; Foster, Peter G; Sicheritz-Pontén, Thomas

    2013-01-01

    BACKGROUND: The influence of lateral gene transfer on gene origins and biology in eukaryotes is poorly understood compared with those of prokaryotes. A number of independent investigations focusing on specific genes, individual genomes, or specific functional categories from various eukaryotes have indicated that lateral gene transfer does indeed affect eukaryotic genomes. However, the lack of common methodology and criteria in these studies makes it difficult to assess the general importance and influence of lateral gene transfer on eukaryotic genome evolution. RESULTS: We used a phylogenomic approach to systematically investigate lateral gene transfer affecting the proteomes of thirteen, mainly parasitic, microbial eukaryotes, representing four of the six eukaryotic super-groups. All of the genomes investigated have been significantly affected by prokaryote-to-eukaryote lateral gene transfers...

  15. Insights into structural variations and genome rearrangements in prokaryotic genomes.

    Science.gov (United States)

    Periwal, Vinita; Scaria, Vinod

    2015-01-01

    Structural variations (SVs) are genomic rearrangements that affect fairly large fragments of DNA. Most of the SVs such as inversions, deletions and translocations have been largely studied in the context of genetic diseases in eukaryotes. However, recent studies demonstrate that genome rearrangements can also have profound impact on prokaryotic genomes, leading to altered cell phenotype. In contrast to single-nucleotide variations, SVs provide a much deeper insight into organization of bacterial genomes at a much better resolution. SVs can confer change in gene copy number, creation of new genes, altered gene expression and many other functional consequences. High-throughput technologies have now made it possible to explore SVs at a much refined resolution in bacterial genomes. Through this review, we aim to highlight the importance of the less explored field of SVs in prokaryotic genomes and their impact. We also discuss its potential applicability in the emerging fields of synthetic biology and genome engineering where targeted SVs could serve to create sophisticated and accurate genome editing.

  16. EuGI: a novel resource for studying genomic islands to facilitate horizontal gene transfer detection in eukaryotes.

    Science.gov (United States)

    Clasen, Frederick Johannes; Pierneef, Rian Ewald; Slippers, Bernard; Reva, Oleg

    2018-05-03

    Genomic islands (GIs) are inserts of foreign DNA that have potentially arisen through horizontal gene transfer (HGT). There is evidence that GIs can contribute significantly to the evolution of prokaryotes. The acquisition of GIs through HGT in eukaryotes has, however, been largely unexplored. In this study, the previously developed GI prediction tool, SeqWord Gene Island Sniffer (SWGIS), is modified to predict GIs in eukaryotic chromosomes. Artificial simulations are used to estimate the rates of false positive and false negative GI predictions by inserting GIs into different test chromosomes and running the SWGIS v2.0 algorithm. Using SWGIS v2.0, GIs are then identified in 36 fungal, 22 protozoan and 8 invertebrate genomes. SWGIS v2.0 predicts GIs in large eukaryotic chromosomes based on the atypical nucleotide composition of these regions. Average false negative and false positive rates were 20.1% and 11.01%, respectively. A total of 10,550 GIs were identified in 66 eukaryotic species, with 5299 of these GIs coding for at least one functional protein. The EuGI web-resource, freely accessible at http://eugi.bi.up.ac.za , was developed to allow browsing of the database created from the identified GIs and the genes within them through an interactive and visual interface. SWGIS v2.0, along with the EuGI database, which houses GIs identified in 66 different eukaryotic species, and the EuGI web-resource, provides the first comprehensive resource for studying HGT in eukaryotes.
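
    The general idea of flagging regions by atypical nucleotide composition can be sketched with a much simpler statistic than the one the tool uses. The example below is not SWGIS: it scores sliding windows by GC content only and flags those whose z-score against the whole sequence exceeds a cutoff. The function names, window size, step and cutoff are all assumptions chosen for illustration.

```python
from statistics import mean, pstdev


def gc_fraction(segment):
    """Fraction of G and C bases in a segment (0.0 for an empty segment)."""
    segment = segment.upper()
    if not segment:
        return 0.0
    return (segment.count("G") + segment.count("C")) / len(segment)


def atypical_windows(seq, window=1000, step=500, z_cutoff=2.0):
    """Return (start, end, gc) for windows with unusual GC content.

    Hypothetical helper: coordinates are 0-based, half-open.
    """
    starts = range(0, max(len(seq) - window + 1, 1), step)
    values = [(i, gc_fraction(seq[i:i + window])) for i in starts]
    gcs = [gc for _, gc in values]
    mu, sigma = mean(gcs), pstdev(gcs)
    if sigma == 0:
        # Perfectly uniform composition: nothing stands out.
        return []
    return [
        (i, i + window, gc)
        for i, gc in values
        if abs(gc - mu) / sigma >= z_cutoff
    ]
```

    A real predictor would compare oligonucleotide usage patterns rather than GC content alone and would merge adjacent flagged windows into candidate islands.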

  17. Consistent mutational paths predict eukaryotic thermostability

    Directory of Open Access Journals (Sweden)

    van Noort Vera

    2013-01-01

    Background: Proteomes of thermophilic prokaryotes have been instrumental in structural biology and successfully exploited in biotechnology; however, many proteins required for eukaryotic cell function are absent from bacteria or archaea. With Chaetomium thermophilum, Thielavia terrestris and Thielavia heterothallica, three genome sequences of thermophilic eukaryotes have been published. Results: Studying the genomes and proteomes of these thermophilic fungi, we found common strategies of thermal adaptation across the different kingdoms of life, including amino acid biases and a reduced genome size. A phylogenetics-guided comparison of thermophilic proteomes with those of other, mesophilic Sordariomycetes revealed consistent amino acid substitutions associated with thermophily that were also present in an independent lineage of thermophilic fungi. The most consistent pattern is the substitution of lysine by arginine, which we could find in almost all lineages but which has not been extensively used in protein stability engineering. By exploiting mutational paths towards the thermophiles, we could predict particular amino acid residues in individual proteins that contribute to thermostability and validated some of them experimentally. By determining the three-dimensional structure of an exemplar protein from C. thermophilum (Arx1), we could also characterise the molecular consequences of some of these mutations. Conclusions: The comparative analysis of these three genomes not only enhances our understanding of the evolution of thermophily, but also provides new ways to engineer protein stability.

  18. The genome of the polar eukaryotic microalga Coccomyxa subellipsoidea reveals traits of cold adaptation

    Energy Technology Data Exchange (ETDEWEB)

    Blanc, Guillaume; Agarkova, Irina; Grimwood, Jane; Kuo, Alan; Brueggeman, Andrew; Dunigan, David D.; Gurnon, James; Ladunga, Istvan; Lindquist, Erika; Lucas, Susan; Pangilinan, Jasmyn; Proschold, Thomas; Salamov, Asaf; Schmutz, Jeremy; Weeks, Donald; Tamada, Takashi; Lomsadze, Alexandre; Borodovsky, Mark; Claverie, Jean-Michel; Grigoriev, Igor V.; Van Etten, James L.

    2012-02-13

    Background: Little is known about the mechanisms of adaptation of life to the extreme environmental conditions encountered in polar regions. Here we present the genome sequence of a unicellular green alga from the division Chlorophyta, Coccomyxa subellipsoidea C-169, which we will hereafter refer to as C-169. This is the first eukaryotic microorganism from a polar environment to have its genome sequenced. Results: The 48.8 Mb genome contained in 20 chromosomes exhibits significant synteny conservation with the chromosomes of its relatives Chlorella variabilis and Chlamydomonas reinhardtii. The order of the genes is highly reshuffled within synteny blocks, suggesting that intra-chromosomal rearrangements were more prevalent than inter-chromosomal rearrangements. Remarkably, Zepp retrotransposons occur in clusters of nested elements, with strictly one cluster per chromosome, probably residing at the centromere. Several protein families overrepresented in C. subellipsoidea include proteins involved in lipid metabolism, transporters, cellulose synthases and short alcohol dehydrogenases. Conversely, C-169 lacks proteins that exist in all other sequenced chlorophytes, including components of the glycosyl phosphatidyl inositol anchoring system, pyruvate phosphate dikinase and the photosystem 1 reaction center subunit N (PsaN). Conclusions: We suggest that some of these gene losses and gains could have contributed to adaptation to low temperatures. Comparison of these genomic features with the adaptive strategies of psychrophilic microbes suggests that prokaryotes and eukaryotes followed comparable evolutionary routes to adapt to cold environments.

  19. Leucine-Rich repeat receptor kinases are sporadically distributed in eukaryotic genomes

    Directory of Open Access Journals (Sweden)

    Diévart Anne

    2011-12-01

    Background: Plant leucine-rich repeat receptor-like kinases (LRR-RLKs) are receptor kinases that contain LRRs in their extracellular domain. In the last 15 years, many research groups have demonstrated major roles played by LRR-RLKs in plants during almost all developmental processes throughout the life of the plant and in defense/resistance against a large range of pathogens. Recently, a breakthrough has been made in this field that challenges the dogma of the specificity of plant LRR-RLKs. Results: We analyzed ~1000 complete genomes and show that LRR-RK genes have now been identified in 8 non-plant genomes. We performed an exhaustive phylogenetic analysis of all of these receptors, revealing that all of the LRR-containing receptor subfamilies form lineage-specific clades. Our results suggest that the association of LRRs with RKs appeared independently at least four times in eukaryotic evolutionary history. Moreover, the molecular evolutionary history of the LRR-RKs found in oomycetes is reminiscent of the pattern observed in plants: expansion with amplification/deletion and evolution of the domain organization leading to the functional diversification of members of the gene family. Finally, the expression data suggest that oomycete LRR-RKs may play a role in several stages of the oomycete life cycle. Conclusions: In view of the key roles that LRR-RLKs play throughout the entire lifetime of plants and plant-environment interactions, the emergence and expansion of this type of receptor in several phyla along the evolution of eukaryotes, and particularly in oomycete genomes, questions their intrinsic functions in mimicry and/or in the coevolution of receptors between hosts and pathogens.

  20. High throughput platforms for structural genomics of integral membrane proteins.

    Science.gov (United States)

    Mancia, Filippo; Love, James

    2011-08-01

    Structural genomics approaches on integral membrane proteins have been postulated for over a decade, yet specific efforts are lagging years behind their soluble counterparts. Indeed, high throughput methodologies for production and characterization of prokaryotic integral membrane proteins are only now emerging, while large-scale efforts for eukaryotic ones are still in their infancy. Presented here is a review of recent literature on actively ongoing structural genomics initiatives for membrane proteins, with a focus on those implementing interesting techniques aimed at increasing our rate of success for this class of macromolecules.

  1. Structure and Mechanism of a Eukaryotic FMN Adenylyltransferase

    OpenAIRE

    Huerta, Carlos; Borek, Dominika; Machius, Mischa; Grishin, Nick V.; Zhang, Hong

    2009-01-01

    Flavin mononucleotide adenylyltransferase (FMNAT) catalyzes the formation of the essential flavocoenzyme FAD and plays an important role in flavocoenzyme homeostasis regulation. By sequence comparison, bacterial and eukaryotic FMNAT enzymes belong to two different protein superfamilies and apparently utilize different sets of active site residues to accomplish the same chemistry. Here we report the first structural characterization of a eukaryotic FMNAT from a pathogenic yeast Candida glabrata...

  2. Discrepancy variation of dinucleotide microsatellite repeats in eukaryotic genomes

    Directory of Open Access Journals (Sweden)

    HUAN GAO

    2009-01-01

    To address whether there are differences in variation among repeat motif types and among taxonomic groups, we present here an analysis of variation and correlation of dinucleotide microsatellite repeats in eukaryotic genomes. Ten taxonomic groups were compared: primates, mammalia (excluding primates and rodentia), rodentia, birds, fish, amphibians and reptiles, insects, molluscs, plants and fungi. The data used in the analysis are from the literature published in the Journal of Molecular Ecology Notes. Analysis of variation reveals that there are no significant differences between AC and AG repeat motif types. Moreover, the number of alleles correlates positively with the copy number in both AG and AC repeats. Similar conclusions can be obtained from each taxonomic group. These results strongly suggest that the increase of SSR variation is almost linear with the increase of the copy number of each repeat motif. In addition, the results suggest that the variability of SSRs in the genomes of low-ranking species seems to be greater than that of high-ranking species, excluding primates and fungi.

  3. Initiation of translation in bacteria by a structured eukaryotic IRES RNA.

    Science.gov (United States)

    Colussi, Timothy M; Costantino, David A; Zhu, Jianyu; Donohue, John Paul; Korostelev, Andrei A; Jaafar, Zane A; Plank, Terra-Dawn M; Noller, Harry F; Kieft, Jeffrey S

    2015-03-05

    The central dogma of gene expression (DNA to RNA to protein) is universal, but in different domains of life there are fundamental mechanistic differences within this pathway. For example, the canonical molecular signals used to initiate protein synthesis in bacteria and eukaryotes are mutually exclusive. However, the core structures and conformational dynamics of ribosomes that are responsible for the translation steps that take place after initiation are ancient and conserved across the domains of life. We wanted to explore whether an undiscovered RNA-based signal might be able to use these conserved features, bypassing mechanisms specific to each domain of life, and initiate protein synthesis in both bacteria and eukaryotes. Although structured internal ribosome entry site (IRES) RNAs can manipulate ribosomes to initiate translation in eukaryotic cells, an analogous RNA structure-based mechanism has not been observed in bacteria. Here we report our discovery that a eukaryotic viral IRES can initiate translation in live bacteria. We solved the crystal structure of this IRES bound to a bacterial ribosome to 3.8 Å resolution, revealing that despite differences between bacterial and eukaryotic ribosomes this IRES binds directly to both and occupies the space normally used by transfer RNAs. Initiation in both bacteria and eukaryotes depends on the structure of the IRES RNA, but in bacteria this RNA uses a different mechanism that includes a form of ribosome repositioning after initial recruitment. This IRES RNA bridges billions of years of evolutionary divergence and provides an example of an RNA structure-based translation initiation signal capable of operating in two domains of life.

  4. Universal internucleotide statistics in full genomes: a footprint of the DNA structure and packaging?

    Directory of Open Access Journals (Sweden)

    Mikhail I Bogachev

    Uncovering the fundamental laws that govern the complex DNA structural organization remains challenging and is largely based upon reconstructions from the primary nucleotide sequences. Here we investigate the distributions of the internucleotide intervals and their persistence properties in complete genomes of various organisms, from Archaea and Bacteria to H. sapiens, aiming to reveal the manifestation of the universal DNA architecture. We find that in all considered organisms the internucleotide interval distributions exhibit the same q-exponential form. While in prokaryotes a single q-exponential function makes the best fit, in eukaryotes the PDF additionally contains a second q-exponential, which in the human genome makes a perfect approximation over nearly 10 decades. We suggest that this functional form is a footprint of the heterogeneous DNA structure, where the first q-exponential reflects the universal helical pitch that appears in both pro- and eukaryotic DNA, while the second q-exponential is a specific marker of the large-scale eukaryotic DNA organization.
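
    To make the quantity in this abstract concrete, the following is a minimal sketch of how internucleotide intervals (distances between consecutive occurrences of the same nucleotide) and their empirical distribution might be computed. The function names and the toy sequence are illustrative assumptions and are not taken from the paper; fitting the functional form reported in the abstract is not attempted here.

```python
from collections import Counter


def internucleotide_intervals(seq, base):
    """Distances between consecutive occurrences of `base` in `seq`."""
    positions = [i for i, ch in enumerate(seq.upper()) if ch == base]
    return [b - a for a, b in zip(positions, positions[1:])]


def interval_distribution(seq, base):
    """Empirical probability of each interval length for `base`."""
    intervals = internucleotide_intervals(seq, base)
    if not intervals:
        return {}
    counts = Counter(intervals)
    total = len(intervals)
    return {length: n / total for length, n in sorted(counts.items())}


# Hypothetical toy sequence, for illustration only.
toy_sequence = "ATGCAATTGACA"
a_intervals = internucleotide_intervals(toy_sequence, "A")
a_distribution = interval_distribution(toy_sequence, "A")
```

    On a real genome the same counting would be applied per nucleotide over the full sequence, and the resulting histogram is what a distribution fit would be run against.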

  5. C-terminal motif prediction in eukaryotic proteomes using comparative genomics and statistical over-representation across protein families

    Directory of Open Access Journals (Sweden)

    Cutler Sean R

    2007-06-01

    Background: The carboxy termini of proteins are a frequent site of activity for a variety of biologically important functions, ranging from post-translational modification to protein targeting. Several short peptide motifs involved in protein sorting roles and dependent upon their proximity to the C-terminus for proper function have already been characterized. As a limited number of such motifs have been identified, the potential exists for genome-wide statistical analysis and comparative genomics to reveal novel peptide signatures functioning in a C-terminal dependent manner. We have applied a novel methodology to the prediction of C-terminal-anchored peptide motifs involving a simple z-statistic and several techniques for improving the signal-to-noise ratio. Results: We examined the statistical over-representation of position-specific C-terminal tripeptides in 7 eukaryotic proteomes. Sequence randomization models and simple-sequence masking were successfully applied to reduce background noise. Similarly, as C-terminal homology among members of large protein families may artificially inflate tripeptide counts in an irrelevant and obfuscating manner, gene-family clustering was performed prior to the analysis in order to assess tripeptide over-representation across protein families as opposed to across all proteins. Finally, comparative genomics was used to identify tripeptides significantly occurring in multiple species. This approach has been able to predict, to our knowledge, all C-terminally anchored targeting motifs present in the literature. These include the PTS1 peroxisomal targeting signal (SKL*), the ER-retention signal (K/HDEL*), the ER-retrieval signal for membrane-bound proteins (KKxx*), the prenylation signal (CC*) and the CaaX box prenylation motif. In addition to a high statistical over-representation of these known motifs, a collection of significant tripeptides with a high propensity for biological function exists
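
    The z-statistic idea in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the background probability of a tripeptide is taken to be its frequency over all sequence windows in the input (a simple binomial model), and the masking, randomization and gene-family clustering steps described by the authors are omitted. The function name and the toy sequences are hypothetical.

```python
import math
from collections import Counter


def cterm_tripeptide_z(proteins):
    """z-score of each C-terminal tripeptide against a whole-sequence background.

    Assumption: the expected C-terminal count is n * p, where p is the
    tripeptide's frequency over all windows of all input sequences.
    """
    usable = [p for p in proteins if len(p) >= 3]
    n = len(usable)
    observed = Counter(p[-3:] for p in usable)

    background = Counter()
    windows = 0
    for p in usable:
        for i in range(len(p) - 2):
            background[p[i:i + 3]] += 1
            windows += 1

    scores = {}
    for tri, obs in observed.items():
        p_bg = background[tri] / windows
        expected = n * p_bg
        sd = math.sqrt(n * p_bg * (1.0 - p_bg))
        # sd is zero only when every window is the same tripeptide.
        scores[tri] = (obs - expected) / sd if sd > 0 else 0.0
    return scores


# Hypothetical toy sequences; two of them end in the PTS1-like tripeptide SKL.
toy_proteins = ["MAAASKL", "MGGGSKL", "MKTAYIA", "MPLVTEQ"]
toy_scores = cterm_tripeptide_z(toy_proteins)
```

    In this toy input SKL scores higher than tripeptides seen at only one C-terminus, which is the kind of ranking the over-representation analysis relies on; real proteomes would need the noise-reduction steps the abstract describes.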

  6. Structural studies demonstrating a bacteriophage-like replication cycle of the eukaryote-infecting Paramecium bursaria chlorella virus-1.

    Directory of Open Access Journals (Sweden)

    Elad Milrot

    2017-08-01

    A fundamental stage in viral infection is the internalization of viral genomes in host cells. Although extensively studied, the mechanisms and factors responsible for the genome internalization process remain poorly understood. Here we report our observations, derived from diverse imaging methods, on genome internalization of the large dsDNA Paramecium bursaria chlorella virus-1 (PBCV-1). Our studies reveal that early infection stages of this eukaryote-infecting virus occur by a bacteriophage-like pathway, whereby PBCV-1 generates a hole in the host cell wall and ejects its dsDNA genome in a linear, base-pair-by-base-pair process, through a membrane tunnel generated by the fusion of the virus internal membrane with the host membrane. Furthermore, our results imply that PBCV-1 DNA condensation that occurs shortly after infection probably plays a role in genome internalization, as hypothesized for the infection of some bacteriophages. The subsequent perforation of the host photosynthetic membranes presumably enables trafficking of viral genomes towards host nuclei. Previous studies established that at late infection stages PBCV-1 generates cytoplasmic organelles, termed viral factories, where viral assembly takes place, a feature characteristic of many large dsDNA viruses that infect eukaryotic organisms. PBCV-1 thus appears to combine a bacteriophage-like mechanism during early infection stages with a eukaryotic-like infection pathway in its late replication cycle.

  7. Neural Network Prediction of Translation Initiation Sites in Eukaryotes: Perspectives for EST and Genome analysis

    DEFF Research Database (Denmark)

    Pedersen, Anders Gorm; Nielsen, Henrik

    1997-01-01

    Translation in eukaryotes does not always start at the first AUG in an mRNA, implying that context information also plays a role. This makes prediction of translation initiation sites a non-trivial task, especially when analysing EST and genome data where the entire mature mRNA sequence is not known...

  8. An HMM-based comparative genomic framework for detecting introgression in eukaryotes.

    Directory of Open Access Journals (Sweden)

    Kevin J Liu

    2014-06-01

    One outcome of interspecific hybridization and subsequent effects of evolutionary forces is introgression, which is the integration of genetic material from one species into the genome of an individual in another species. The evolution of several groups of eukaryotic species has involved hybridization, and cases of adaptation through introgression have already been established. In this work, we report on PhyloNet-HMM, a new comparative genomic framework for detecting introgression in genomes. PhyloNet-HMM combines phylogenetic networks with hidden Markov models (HMMs) to simultaneously capture the (potentially reticulate) evolutionary history of the genomes and dependencies within genomes. A novel aspect of our work is that it also accounts for incomplete lineage sorting and dependence across loci. Application of our model to variation data from chromosome 7 in the mouse (Mus musculus domesticus) genome detected a recently reported adaptive introgression event involving the rodent poison resistance gene Vkorc1, in addition to other newly detected introgressed genomic regions. Based on our analysis, it is estimated that about 9% of all sites within chromosome 7 are of introgressive origin (these cover about 13 Mbp of chromosome 7, and over 300 genes). Further, our model detected no introgression in a negative control data set. We also found that our model accurately detected introgression and other evolutionary processes from synthetic data sets simulated under the coalescent model with recombination, isolation, and migration. Our work provides a powerful framework for systematic analysis of introgression while simultaneously accounting for dependence across sites, point mutations, recombination, and ancestral polymorphism.
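
    As a rough illustration of how an HMM can label sites along a chromosome, the sketch below runs standard Viterbi decoding over a two-state model ("native" versus "introgressed"). It is not PhyloNet-HMM: there are no phylogenetic networks or gene trees here, and the states, observation symbols and all probabilities are invented for the example.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Most likely hidden-state path for `observations` (log-space Viterbi)."""
    if not observations:
        return []
    scores = [{s: math.log(start_p[s]) + math.log(emit_p[s][observations[0]])
               for s in states}]
    backpointers = []
    for obs in observations[1:]:
        current, pointers = {}, {}
        for s in states:
            # Best predecessor state for arriving in state s at this site.
            prev = max(states, key=lambda r: scores[-1][r] + math.log(trans_p[r][s]))
            current[s] = (scores[-1][prev] + math.log(trans_p[prev][s])
                          + math.log(emit_p[s][obs]))
            pointers[s] = prev
        scores.append(current)
        backpointers.append(pointers)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: scores[-1][s])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical model: each site either matches the donor species' allele or not.
STATES = ("native", "introgressed")
START = {"native": 0.5, "introgressed": 0.5}
TRANS = {"native": {"native": 0.9, "introgressed": 0.1},
         "introgressed": {"native": 0.1, "introgressed": 0.9}}
EMIT = {"native": {"match": 0.1, "mismatch": 0.9},
        "introgressed": {"match": 0.9, "mismatch": 0.1}}

sites = ["mismatch"] * 3 + ["match"] * 4 + ["mismatch"] * 3
decoded = viterbi(sites, STATES, START, TRANS, EMIT)
```

    The high self-transition probabilities encode dependence between neighbouring sites, so a run of donor-matching sites is decoded as one contiguous introgressed tract rather than as isolated calls.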

  9. Dormant origins as a built-in safeguard in eukaryotic DNA replication against genome instability and disease development.

    Science.gov (United States)

    Shima, Naoko; Pederson, Kayla D

    2017-08-01

    DNA replication is a prerequisite for cell proliferation, yet it can be increasingly challenging for a eukaryotic cell to faithfully duplicate its genome as its size and complexity expands. Dormant origins now emerge as a key component for cells to successfully accomplish such a demanding but essential task. In this perspective, we will first provide an overview of the fundamental processes eukaryotic cells have developed to regulate origin licensing and firing. With a special focus on mammalian systems, we will then highlight the role of dormant origins in preventing replication-associated genome instability and their functional interplay with proteins involved in the DNA damage repair response for tumor suppression. Lastly, deficiencies in the origin licensing machinery will be discussed in relation to their influence on stem cell maintenance and human diseases. Copyright © 2017 Elsevier B.V. All rights reserved.

  10. Structure of the prolyl-tRNA synthetase from the eukaryotic pathogen Giardia lamblia

    Energy Technology Data Exchange (ETDEWEB)

    Larson, Eric T.; Kim, Jessica E.; Napuli, Alberto J.; Verlinde, Christophe L. M. J.; Fan, Erkang; Zucker, Frank H.; Van Voorhis, Wesley C.; Buckner, Frederick S.; Hol, Wim G. J.; Merritt, Ethan A., E-mail: merritt@u.washington.edu [Medical Structural Genomics of Pathogenic Protozoa, (United States); University of Washington, Seattle, WA 98195 (United States)

    2012-09-01

    The structure of Giardia prolyl-tRNA synthetase cocrystallized with proline and ATP shows evidence for half-of-the-sites activity, leading to a corresponding mixture of reaction substrates and product (prolyl-AMP) in the two active sites of the dimer. The genome of the human intestinal parasite Giardia lamblia contains only a single aminoacyl-tRNA synthetase gene for each amino acid. The Giardia prolyl-tRNA synthetase gene product was originally misidentified as a dual-specificity Pro/Cys enzyme, in part owing to its unexpectedly high off-target activation of cysteine, but is now believed to be a normal representative of the class of archaeal/eukaryotic prolyl-tRNA synthetases. The 2.2 Å resolution crystal structure of the G. lamblia enzyme presented here is thus the first structure determination of a prolyl-tRNA synthetase from a eukaryote. The relative occupancies of substrate (proline) and product (prolyl-AMP) in the active site are consistent with half-of-the-sites reactivity, as is the observed biphasic thermal denaturation curve for the protein in the presence of proline and MgATP. However, no corresponding induced asymmetry is evident in the structure of the protein. No thermal stabilization is observed in the presence of cysteine and ATP. The implied low affinity for the off-target activation product cysteinyl-AMP suggests that translational fidelity in Giardia is aided by the rapid release of misactivated cysteine.

  11. Exploring repetitive DNA landscapes using REPCLASS, a tool that automates the classification of transposable elements in eukaryotic genomes.

    Science.gov (United States)

    Feschotte, Cédric; Keswani, Umeshkumar; Ranganathan, Nirmal; Guibotsy, Marcel L; Levine, David

    2009-07-23

    Eukaryotic genomes contain a large amount of repetitive DNA, most of which is derived from transposable elements (TEs). Progress has been made to develop computational tools for ab initio identification of repeat families, but there is an urgent need to develop tools to automate the annotation of TEs in genome sequences. Here we introduce REPCLASS, a tool that automates the classification of TE sequences. Using control repeat libraries, we show that the program can accurately classify virtually any known TE type. Combining REPCLASS with ab initio repeat finding in the genomes of Caenorhabditis elegans and Drosophila melanogaster allowed us to recover the contrasting TE landscape characteristic of these species. Unexpectedly, REPCLASS also uncovered several novel TE families in both genomes, augmenting the TE repertoire of these model species. When applied to the genomes of distant Caenorhabditis and Drosophila species, the approach revealed a remarkable conservation of TE composition profile within each genus, despite substantial interspecific covariations in genome size and in the number of TEs and TE families. Lastly, we applied REPCLASS to analyze 10 fungal genomes from a wide taxonomic range, most of which have not been analyzed for TE content previously. The results showed that TE diversity varies widely across the fungi "kingdom" and appears to positively correlate with genome size, in particular for DNA transposons. Together, these data validate REPCLASS as a powerful tool to explore the repetitive DNA landscapes of eukaryotes and to shed light onto the evolutionary forces shaping TE diversity and genome architecture.

  12. Strong eukaryotic IRESs have weak secondary structure.

    Directory of Open Access Journals (Sweden)

    Xuhua Xia

    BACKGROUND: The objective of this work was to investigate the hypothesis that eukaryotic Internal Ribosome Entry Sites (IRESs) lack secondary structure and to examine the generality of the hypothesis. METHODOLOGY/PRINCIPAL FINDINGS: IRESs of the yeast and the fruit fly are located in the 5'UTR immediately upstream of the initiation codon. The minimum folding energy (MFE) of 60 nt RNA segments immediately upstream of the initiation codons was calculated as a proxy of secondary structure stability. MFE of the reverse complements of these 60 nt segments was also calculated. The relationship between MFE and empirically determined IRES activity was investigated to test the hypothesis that strong IRES activity is associated with weak secondary structure. We show that IRES activity in the yeast and the fruit fly correlates strongly with the structural stability, with highest IRES activity found in RNA segments that exhibit the weakest secondary structure. CONCLUSIONS: We found that a subset of eukaryotic IRESs exhibits very low secondary structure in the 5'-UTR sequences immediately upstream of the initiation codon. The consistency in results between the yeast and the fruit fly suggests a possible shared mechanism of cap-independent translation initiation that relies on an unstructured RNA segment.

  13. [Structure and evolution of the eukaryotic FANCJ-like proteins].

    Science.gov (United States)

    Wuhe, Jike; Zefeng, Wu; Sanhong, Fan; Xuguang, Xi

    2015-02-01

    The FANCJ-like protein family is a class of ATP-dependent helicases that can catalytically unwind duplex DNA along the 5'-3' direction. It is involved in the processes of DNA damage repair, homologous recombination and G-quadruplex DNA unwinding, and plays a critical role in maintaining genome integrity. In this study, we systematically analyzed FANCJ-like proteins from 47 eukaryotic species and discussed their sequence diversity, origin and evolution, motif organization patterns and spatial structure differences. Four members of the FANCJ-like protein family, XPD, CHL1, RTEL1 and FANCJ, were found in eukaryotes, but some of them are missing from most fungi and some insects. For example, the Zygomycota fungi lost RTEL1, Basidiomycota and Ascomycota fungi lost RTEL1 and FANCJ, and Diptera insects lost FANCJ. FANCJ-like proteins contain the canonical motor domains HD1 and HD2, and the HD1 domain further integrates three unique domains: Fe-S, Arch and Extra-D. The Fe-S and Arch domains are relatively conserved in all members of the family, but the Extra-D domain is lost in XPD and differs among the remaining members. There are 7, 10 and 2 specific motifs found in the three unique domains, respectively, while 5 and 12 specific motifs are found in the HD1 and HD2 domains in addition to the conserved motifs reported previously. By analyzing the arrangement pattern of these specific motifs, we found that RTEL1 and FANCJ are closer to each other and share two specific motifs, Vb2 and Vc, in the HD2 domain, which are likely related to their G-quadruplex DNA unwinding activity. The evolutionary evidence showed that FANCJ-like proteins originated from a helicase that has an HD1 domain with inserted Fe-S and Arch domains. Through three successive gene duplication events followed by specialization, eukaryotes finally came to possess the current four members of the FANCJ-like protein family.

  14. Beyond Agrobacterium-Mediated Transformation: Horizontal Gene Transfer from Bacteria to Eukaryotes.

    Science.gov (United States)

    Lacroix, Benoît; Citovsky, Vitaly

    2018-03-03

    Besides the massive gene transfer from organelles to the nuclear genomes, which occurred during the early evolution of eukaryote lineages, the importance of horizontal gene transfer (HGT) in eukaryotes remains controversial. Yet, increasing amounts of genomic data reveal many cases of bacterium-to-eukaryote HGT that likely represent a significant force in adaptive evolution of eukaryotic species. However, DNA transfer involved in genetic transformation of plants by Agrobacterium species has traditionally been considered as the unique example of natural DNA transfer and integration into eukaryotic genomes. Recent discoveries indicate that the repertoire of donor bacterial species and of recipient eukaryotic hosts potentially are much wider than previously thought, including donor bacterial species, such as plant symbiotic nitrogen-fixing bacteria (e.g., Rhizobium etli) and animal bacterial pathogens (e.g., Bartonella henselae, Helicobacter pylori), and recipient species from virtually all eukaryotic clades. Here, we review the molecular pathways and potential mechanisms of these trans-kingdom HGT events and discuss their utilization in biotechnology and research.

  15. Comparative and functional genomics of Legionella identified eukaryotic like proteins as key players in host-pathogen interactions

    Directory of Open Access Journals (Sweden)

    Laura Gomez-Valero

    2011-10-01

    Full Text Available Although best known for its ability to cause severe pneumonia in people whose immune defenses are weakened, Legionella pneumophila and Legionella longbeachae are two species of a large genus of bacteria that are ubiquitous in nature, where they parasitize protozoa. Adaptation to the host environment and exploitation of host cell functions are critical for the success of these intracellular pathogens. The establishment and publication of the complete genome sequences of L. pneumophila and L. longbeachae isolates paved the way for major breakthroughs in understanding the biology of these organisms. In this review we present the knowledge gained from the analyses and comparison of the complete genome sequences of different L. pneumophila and L. longbeachae strains. Emphasis is given on putative virulence and Legionella life cycle related functions, such as the identification of an extended array of eukaryotic-like proteins, many of which have been shown to modulate host cell functions to the pathogen's advantage. Surprisingly, many of the eukaryotic domain proteins identified in L. pneumophila as well as many substrates of the Dot/Icm type IV secretion system essential for intracellular replication are different between these two species, although they cause the same disease. Finally, evolutionary aspects regarding the eukaryotic like proteins in Legionella are discussed.

  16. Eukaryotic ribonucleases P/MRP: the crystal structure of the P3 domain.

    Science.gov (United States)

    Perederina, Anna; Esakova, Olga; Quan, Chao; Khanova, Elena; Krasilnikov, Andrey S

    2010-02-17

    Ribonuclease (RNase) P is a site-specific endoribonuclease found in all kingdoms of life. Typical RNase P consists of a catalytic RNA component and a protein moiety. In the eukaryotes, the RNase P lineage has split into two, giving rise to a closely related enzyme, RNase MRP, which has similar components but has evolved to have different specificities. The eukaryotic RNases P/MRP have acquired an essential helix-loop-helix protein-binding RNA domain P3 that has an important function in eukaryotic enzymes and distinguishes them from bacterial and archaeal RNases P. Here, we present a crystal structure of the P3 RNA domain from Saccharomyces cerevisiae RNase MRP in a complex with RNase P/MRP proteins Pop6 and Pop7 solved to 2.7 Å. The structure suggests similar structural organization of the P3 RNA domains in RNases P/MRP and possible functions of the P3 domains and proteins bound to them in the stabilization of the holoenzymes' structures as well as in interactions with substrates. It provides the first insight into the structural organization of the eukaryotic enzymes of the RNase P/MRP family.

  17. Transfer of DNA from Bacteria to Eukaryotes

    Directory of Open Access Journals (Sweden)

    Benoît Lacroix

    2016-07-01

    Historically, the members of the Agrobacterium genus have been considered the only bacterial species naturally able to transfer and integrate DNA into the genomes of their eukaryotic hosts. Yet, increasing evidence suggests that this ability to genetically transform eukaryotic host cells might be more widespread in the bacterial world. Indeed, analyses of accumulating genomic data reveal cases of horizontal gene transfer from bacteria to eukaryotes and suggest that it represents a significant force in adaptive evolution of eukaryotic species. Specifically, recent reports indicate that bacteria other than Agrobacterium, such as Bartonella henselae (a zoonotic pathogen), Rhizobium etli (a plant-symbiotic bacterium related to Agrobacterium), or even Escherichia coli, have the ability to genetically transform their host cells under laboratory conditions. This DNA transfer relies on type IV secretion systems (T4SSs), the molecular machines that transport macromolecules during conjugative plasmid transfer and also during transport of proteins and/or DNA to the eukaryotic recipient cells. In this review article, we explore the extent of possible transfer of genetic information from bacteria to eukaryotic cells as well as the evolutionary implications and potential applications of this transfer.

  18. DNA to DNA transcription might exist in eukaryotic cells

    OpenAIRE

    Li, Gao-De

    2016-01-01

    Till now, in biological sciences, the term, transcription, mainly refers to DNA to RNA transcription. But our recently published experimental findings obtained from Plasmodium falciparum strongly suggest the existence of DNA to DNA transcription in the genome of eukaryotic cells, which could shed some light on the functions of certain noncoding DNA in the human and other eukaryotic genomes.

  19. Transcription factor IID in the Archaea: sequences in the Thermococcus celer genome would encode a product closely related to the TATA-binding protein of eukaryotes

    Science.gov (United States)

    Marsh, T. L.; Reich, C. I.; Whitelock, R. B.; Olsen, G. J.; Woese, C. R. (Principal Investigator)

    1994-01-01

    The first step in transcription initiation in eukaryotes is mediated by the TATA-binding protein, a subunit of the transcription factor IID complex. We have cloned and sequenced the gene for a presumptive homolog of this eukaryotic protein from Thermococcus celer, a member of the Archaea (formerly archaebacteria). The protein encoded by the archaeal gene is a tandem repeat of a conserved domain, corresponding to the repeated domain in its eukaryotic counterparts. Molecular phylogenetic analyses of the two halves of the repeat are consistent with the duplication occurring before the divergence of the archaeal and eukaryotic domains. In conjunction with previous observations of similarity in RNA polymerase subunit composition and sequences and the finding of a transcription factor IIB-like sequence in Pyrococcus woesei (a relative of T. celer), it appears that major features of the eukaryotic transcription apparatus were well-established before the origin of eukaryotic cellular organization. The divergence between the two halves of the archaeal protein is less than that between the halves of the individual eukaryotic sequences, indicating that the average rate of sequence change in the archaeal protein has been less than in its eukaryotic counterparts. To the extent that this lower rate applies to the genome as a whole, a clearer picture of the early genes (and gene families) that gave rise to present-day genomes is more apt to emerge from the study of sequences from the Archaea than from the corresponding sequences from eukaryotes.

  20. Origins and evolution of viruses of eukaryotes: The ultimate modularity

    International Nuclear Information System (INIS)

    Koonin, Eugene V.; Dolja, Valerian V.; Krupovic, Mart

    2015-01-01

    Viruses and other selfish genetic elements are dominant entities in the biosphere, with respect to both physical abundance and genetic diversity. Various selfish elements parasitize on all cellular life forms. The relative abundances of different classes of viruses are dramatically different between prokaryotes and eukaryotes. In prokaryotes, the great majority of viruses possess double-stranded (ds) DNA genomes, with a substantial minority of single-stranded (ss) DNA viruses and only limited presence of RNA viruses. In contrast, in eukaryotes, RNA viruses account for the majority of the virome diversity although ssDNA and dsDNA viruses are common as well. Phylogenomic analysis yields tangible clues for the origins of major classes of eukaryotic viruses and in particular their likely roots in prokaryotes. Specifically, the ancestral genome of positive-strand RNA viruses of eukaryotes might have been assembled de novo from genes derived from prokaryotic retroelements and bacteria although a primordial origin of this class of viruses cannot be ruled out. Different groups of double-stranded RNA viruses derive either from dsRNA bacteriophages or from positive-strand RNA viruses. The eukaryotic ssDNA viruses apparently evolved via a fusion of genes from prokaryotic rolling circle-replicating plasmids and positive-strand RNA viruses. Different families of eukaryotic dsDNA viruses appear to have originated from specific groups of bacteriophages on at least two independent occasions. Polintons, the largest known eukaryotic transposons, predicted to also form virus particles, most likely, were the evolutionary intermediates between bacterial tectiviruses and several groups of eukaryotic dsDNA viruses including the proposed order “Megavirales” that unites diverse families of large and giant viruses. Strikingly, evolution of all classes of eukaryotic viruses appears to have involved fusion between structural and replicative gene modules derived from different sources

  1. Origins and evolution of viruses of eukaryotes: The ultimate modularity

    Energy Technology Data Exchange (ETDEWEB)

    Koonin, Eugene V., E-mail: koonin@ncbi.nlm.nih.gov [National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894 (United States); Dolja, Valerian V., E-mail: doljav@science.oregonstate.edu [Department of Botany and Plant Pathology, Oregon State University, Corvallis, OR 97331 (United States); Krupovic, Mart, E-mail: krupovic@pasteur.fr [Institut Pasteur, Unité Biologie Moléculaire du Gène chez les Extrêmophiles, Department of Microbiology, Paris 75015 (France)

    2015-05-15

    Viruses and other selfish genetic elements are dominant entities in the biosphere, with respect to both physical abundance and genetic diversity. Various selfish elements parasitize on all cellular life forms. The relative abundances of different classes of viruses are dramatically different between prokaryotes and eukaryotes. In prokaryotes, the great majority of viruses possess double-stranded (ds) DNA genomes, with a substantial minority of single-stranded (ss) DNA viruses and only limited presence of RNA viruses. In contrast, in eukaryotes, RNA viruses account for the majority of the virome diversity although ssDNA and dsDNA viruses are common as well. Phylogenomic analysis yields tangible clues for the origins of major classes of eukaryotic viruses and in particular their likely roots in prokaryotes. Specifically, the ancestral genome of positive-strand RNA viruses of eukaryotes might have been assembled de novo from genes derived from prokaryotic retroelements and bacteria although a primordial origin of this class of viruses cannot be ruled out. Different groups of double-stranded RNA viruses derive either from dsRNA bacteriophages or from positive-strand RNA viruses. The eukaryotic ssDNA viruses apparently evolved via a fusion of genes from prokaryotic rolling circle-replicating plasmids and positive-strand RNA viruses. Different families of eukaryotic dsDNA viruses appear to have originated from specific groups of bacteriophages on at least two independent occasions. Polintons, the largest known eukaryotic transposons, predicted to also form virus particles, most likely, were the evolutionary intermediates between bacterial tectiviruses and several groups of eukaryotic dsDNA viruses including the proposed order “Megavirales” that unites diverse families of large and giant viruses. Strikingly, evolution of all classes of eukaryotic viruses appears to have involved fusion between structural and replicative gene modules derived from different sources

  2. Automatic generation of gene finders for eukaryotic species

    DEFF Research Database (Denmark)

    Terkelsen, Kasper Munch; Krogh, A.

    2006-01-01

    Background The number of sequenced eukaryotic genomes is rapidly increasing. This means that over time it will be hard to keep supplying customised gene finders for each genome. This calls for procedures to automatically generate species-specific gene finders and to re-train them as the quantity and quality of reliable gene annotation grows. Results We present a procedure, Agene, that automatically generates a species-specific gene predictor from a set of reliable mRNA sequences and a genome. We apply a Hidden Markov model (HMM) that implements explicit length distribution modelling for all gene structure blocks using acyclic discrete phase type distributions. The state structure of each HMM is generated dynamically from an array of sub-models to include only gene features represented in the training set. Conclusion Acyclic discrete phase type distributions are well suited to model sequence...

  3. Origin and evolution of the self-organizing cytoskeleton in the network of eukaryotic organelles.

    Science.gov (United States)

    Jékely, Gáspár

    2014-09-02

    The eukaryotic cytoskeleton evolved from prokaryotic cytomotive filaments. Prokaryotic filament systems show bewildering structural and dynamic complexity and, in many aspects, prefigure the self-organizing properties of the eukaryotic cytoskeleton. Here, the dynamic properties of the prokaryotic and eukaryotic cytoskeleton are compared, and how these relate to function and evolution of organellar networks is discussed. The evolution of new aspects of filament dynamics in eukaryotes, including severing and branching, and the advent of molecular motors converted the eukaryotic cytoskeleton into a self-organizing "active gel," the dynamics of which can only be described with computational models. Advances in modeling and comparative genomics hold promise of a better understanding of the evolution of the self-organizing cytoskeleton in early eukaryotes, and its role in the evolution of novel eukaryotic functions, such as amoeboid motility, mitosis, and ciliary swimming. Copyright © 2014 Cold Spring Harbor Laboratory Press; all rights reserved.

  4. An alternative method for cDNA cloning from surrogate eukaryotic cells transfected with the corresponding genomic DNA.

    Science.gov (United States)

    Hu, Lin-Yong; Cui, Chen-Chen; Song, Yu-Jie; Wang, Xiang-Guo; Jin, Ya-Ping; Wang, Ai-Hua; Zhang, Yong

    2012-07-01

    cDNA is widely used in gene function elucidation and/or transgenics research but often suitable tissues or cells from which to isolate mRNA for reverse transcription are unavailable. Here, an alternative method for cDNA cloning is described and tested by cloning the cDNA of human LALBA (human alpha-lactalbumin) from genomic DNA. First, genomic DNA containing all of the coding exons was cloned from human peripheral blood and inserted into a eukaryotic expression vector. Next, by delivering the plasmids into either 293T or fibroblast cells, surrogate cells were constructed. Finally, the total RNA was extracted from the surrogate cells and cDNA was obtained by RT-PCR. The human LALBA cDNA that was obtained was compared with the corresponding mRNA published in GenBank. The comparison showed that the two sequences were identical. The novel method for cDNA cloning from surrogate eukaryotic cells described here uses well-established techniques that are feasible and simple to use. We anticipate that this alternative method will have widespread applications.

  5. Enzymes involved in organellar DNA replication in photosynthetic eukaryotes.

    Science.gov (United States)

    Moriyama, Takashi; Sato, Naoki

    2014-01-01

    Plastids and mitochondria possess their own genomes. Although the replication mechanisms of these organellar genomes remain unclear in photosynthetic eukaryotes, several organelle-localized enzymes related to genome replication, including DNA polymerase, DNA primase, DNA helicase, DNA topoisomerase, single-stranded DNA maintenance protein, DNA ligase, primer removal enzyme, and several DNA recombination-related enzymes, have been identified. In the reference Eudicot plant Arabidopsis thaliana, the replication-related enzymes of plastids and mitochondria are similar because many of them are dual targeted to both organelles, whereas in the red alga Cyanidioschyzon merolae, plastids and mitochondria contain different replication machinery components. The enzymes involved in organellar genome replication in green plants and red algae were derived from different origins, including proteobacterial, cyanobacterial, and eukaryotic lineages. In the present review, we summarize the available data for enzymes related to organellar genome replication in green plants and red algae. In addition, based on the type and distribution of replication enzymes in photosynthetic eukaryotes, we discuss the transitional history of replication enzymes in the organelles of plants.

  6. RNase MRP and the RNA processing cascade in the eukaryotic ancestor.

    Science.gov (United States)

    Woodhams, Michael D; Stadler, Peter F; Penny, David; Collins, Lesley J

    2007-02-08

    Within eukaryotes there is a complex cascade of RNA-based macromolecules that process other RNA molecules, especially mRNA, tRNA and rRNA. An example is RNase MRP processing ribosomal RNA (rRNA) in ribosome biogenesis. One hypothesis is that this complexity was present early in eukaryotic evolution; an alternative is that an initial simpler network later gained complexity by gene duplication in lineages that led to animals, fungi and plants. Recently there has been a rapid increase in support for the complexity-early theory because the vast majority of these RNA-processing reactions are found throughout eukaryotes, and thus were likely to be present in the last common ancestor of living eukaryotes, herein called the Eukaryotic Ancestor. We present an overview of the RNA processing cascade in the Eukaryotic Ancestor and investigate, in particular, RNase MRP, which was previously thought to have evolved later in eukaryotes because of its apparently limited distribution in fungi, animals and plants. Recent publications, as well as our own genomic searches, find previously unknown RNase MRP RNAs, indicating that RNase MRP has a wide distribution in eukaryotes. Combining secondary structure and promoter region analysis of RNAs for RNase MRP, along with analysis of the target substrate (rRNA), allows us to discuss this distribution in the light of eukaryotic evolution. We conclude that RNase MRP can now be placed in the RNA-processing cascade of the Eukaryotic Ancestor, highlighting the complexity of RNA-processing in early eukaryotes. Promoter analyses of MRP-RNA suggest that regulation of the critical processes of rRNA cleavage can vary, showing that even these key cellular processes (for which we expect high conservation) show some species-specific variability. We present our consensus MRP-RNA secondary structure as a useful model for further searches.

  7. Evolutionary Inference across Eukaryotes Identifies Specific Pressures Favoring Mitochondrial Gene Retention.

    Science.gov (United States)

    Johnston, Iain G; Williams, Ben P

    2016-02-24

    Since their endosymbiotic origin, mitochondria have lost most of their genes. Although many selective mechanisms underlying the evolution of mitochondrial genomes have been proposed, a data-driven exploration of these hypotheses is lacking, and a quantitatively supported consensus remains absent. We developed HyperTraPS, a methodology coupling stochastic modeling with Bayesian inference, to identify the ordering of evolutionary events and suggest their causes. Using 2015 complete mitochondrial genomes, we inferred evolutionary trajectories of mtDNA gene loss across the eukaryotic tree of life. We find that proteins comprising the structural cores of the electron transport chain are preferentially encoded within mitochondrial genomes across eukaryotes. A combination of high GC content and high protein hydrophobicity is required to explain patterns of mtDNA gene retention; a model that accounts for these selective pressures can also predict the success of artificial gene transfer experiments in vivo. This work provides a general method for data-driven inference of the ordering of evolutionary and progressive events, here identifying the distinct features shaping mitochondrial genomes of present-day species. Copyright © 2016 Elsevier Inc. All rights reserved.

  8. Megabase replication domains along the human genome: relation to chromatin structure and genome organisation.

    Science.gov (United States)

    Audit, Benjamin; Zaghloul, Lamia; Baker, Antoine; Arneodo, Alain; Chen, Chun-Long; d'Aubenton-Carafa, Yves; Thermes, Claude

    2013-01-01

    In higher eukaryotes, the absence of specific sequence motifs marking the origins of replication has been a serious hindrance to the understanding of (i) the mechanisms that regulate the spatio-temporal replication program, and (ii) the links between origin activation, chromatin structure and transcription. In this chapter, we review the partitioning of the human genome into megabase-sized replication domains delineated as N-shaped motifs in the strand compositional asymmetry profiles. They collectively span 28.3% of the genome and are bordered by more than 1,000 putative replication origins. We recapitulate the comparison of this partition of the human genome with high-resolution experimental data that confirms that replication domain borders are likely to be preferential replication initiation zones in the germline. In addition, we highlight the specific distribution of experimental and numerical chromatin marks along replication domains. Domain borders correspond to particular open chromatin regions, possibly encoded in the DNA sequence, and around which replication and transcription are highly coordinated. These regions also present a high evolutionary breakpoint density, suggesting that susceptibility to breakage might be linked to local open chromatin fiber state. Altogether, this chapter presents a compartmentalization of the human genome into replication domains that are landmarks of the human genome organization and are likely to play a key role in genome dynamics during evolution and in pathological situations.
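
    As a rough illustration of a strand compositional asymmetry profile, the sketch below computes a windowed skew along a sequence. The abstract does not give the formula, so the definition used here, S = (T - A)/(T + A) + (G - C)/(G + C), is an assumption based on a commonly used form, and `skew_profile` and its `window` parameter are hypothetical names.

```python
def skew_profile(seq, window=1000):
    """Return one skew value per non-overlapping window of `seq`.

    Assumed definition: S = (T - A)/(T + A) + (G - C)/(G + C).
    A term is treated as 0.0 when its denominator is zero.
    """
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        a, c, g, t = (chunk.count(base) for base in "ACGT")
        ta = (t - a) / (t + a) if (t + a) else 0.0
        gc = (g - c) / (g + c) if (g + c) else 0.0
        profile.append(ta + gc)
    return profile


if __name__ == "__main__":
    # Toy input only; real profiles use windows of about a kilobase or more.
    print(skew_profile("GGGGTTTTACGTACGT", window=8))
```

    Plotting such a profile along a chromosome is one way to look for the N-shaped patterns described above; detecting them automatically needs additional steps that this sketch does not attempt.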

  9. How natural a kind is "eukaryote?".

    Science.gov (United States)

    Doolittle, W Ford

    2014-06-02

    Systematics balances uneasily between realism and nominalism, uncommitted as to whether biological taxa are discoveries or inventions. If the former, they might be taken as natural kinds. I briefly review some philosophers' concepts of natural kinds and then argue that several of these apply well enough to "eukaryote." Although there are some sticky issues around genomic chimerism and when eukaryotes first appeared, if we allow for degrees in the naturalness of kinds, existing eukaryotes rank highly, higher than prokaryotes. Most biologists feel this intuitively: All I attempt to do here is provide some conceptual justification. Copyright © 2014 Cold Spring Harbor Laboratory Press; all rights reserved.

  10. Phylogenetic distribution of large-scale genome patchiness

    Directory of Open Access Journals (Sweden)

    Hackenberg Michael

    2008-04-01

    Background: The phylogenetic distribution of large-scale genome structure (i.e. mosaic compositional patchiness) has been explored mainly by analytical ultracentrifugation of bulk DNA. However, with the availability of large, good-quality chromosome sequences, and the recently developed computational methods to directly analyze patchiness on the genome sequence, an evolutionary comparative analysis can be carried out at the sequence level. Results: The local variations in the scaling exponent of the Detrended Fluctuation Analysis are used here to analyze large-scale genome structure and directly uncover the characteristic scales present in genome sequences. Furthermore, through shuffling experiments of selected genome regions, computationally-identified, isochore-like regions were identified as the biological source for the uncovered large-scale genome structure. The phylogenetic distribution of short- and large-scale patchiness was determined in the best-sequenced genome assemblies from eleven eukaryotic genomes: mammals (Homo sapiens, Pan troglodytes, Mus musculus, Rattus norvegicus, and Canis familiaris), birds (Gallus gallus), fishes (Danio rerio), invertebrates (Drosophila melanogaster and Caenorhabditis elegans), plants (Arabidopsis thaliana) and yeasts (Saccharomyces cerevisiae). We found large-scale patchiness of genome structure, associated with in silico determined, isochore-like regions, throughout this wide phylogenetic range. Conclusion: Large-scale genome structure is detected by directly analyzing DNA sequences in a wide range of eukaryotic chromosome sequences, from human to yeast. In all these genomes, large-scale patchiness can be associated with the isochore-like regions, as directly detected in silico at the sequence level.
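
    For readers unfamiliar with Detrended Fluctuation Analysis, the following is a minimal first-order DFA sketch in plain Python. It estimates a single global scaling exponent, whereas the study analyses local variations of that exponent. The mapping of bases to +1 (G/C) and -1 (A/T) in `gc_walk` and all function names are assumptions for illustration, not the authors' implementation.

```python
import math


def linear_fit(xs, ys):
    """Least-squares line through (xs, ys); returns (slope, intercept)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def gc_walk(seq):
    """Assumed numeric mapping: +1 for G or C, -1 for anything else."""
    return [1.0 if base in "GC" else -1.0 for base in seq.upper()]


def dfa_fluctuation(values, n):
    """RMS fluctuation F(n) around per-window linear trends.

    Requires 2 <= n <= len(values).
    """
    mean = sum(values) / len(values)
    profile, total = [], 0.0
    for v in values:  # integrate the mean-centred series
        total += v - mean
        profile.append(total)
    xs = list(range(n))
    squared, count = 0.0, 0
    for start in range(0, len(profile) - n + 1, n):
        segment = profile[start:start + n]
        slope, intercept = linear_fit(xs, segment)
        squared += sum((y - (slope * x + intercept)) ** 2
                       for x, y in zip(xs, segment))
        count += n
    return math.sqrt(squared / count)


def dfa_exponent(values, scales):
    """Slope of log F(n) against log n over the given window sizes."""
    log_n = [math.log(n) for n in scales]
    log_f = [math.log(dfa_fluctuation(values, n)) for n in scales]
    slope, _ = linear_fit(log_n, log_f)
    return slope
```

    An exponent near 0.5 indicates an uncorrelated sequence, while larger values point to long-range correlations of the kind associated with compositional patchiness. A call such as `dfa_exponent(gc_walk(sequence), [8, 16, 32, 64])` would be a typical starting point.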

  11. Eukaryotic snoRNAs: a paradigm for gene expression flexibility.

    Science.gov (United States)

    Dieci, Giorgio; Preti, Milena; Montanini, Barbara

    2009-08-01

    Small nucleolar RNAs (snoRNAs) are one of the most ancient and numerous families of non-protein-coding RNAs (ncRNAs). The main function of snoRNAs - to guide site-specific rRNA modification - is the same in Archaea and all eukaryotic lineages. In contrast, as revealed by recent genomic and RNomic studies, their genomic organization and expression strategies are the most varied. Seemingly snoRNA coding units have adopted, in the course of evolution, all the possible ways of being transcribed, thus providing a unique paradigm of gene expression flexibility. By focusing on representative fungal, plant and animal genomes, we review here all the documented types of snoRNA gene organization and expression, and we provide a comprehensive account of snoRNA expressional freedom by precisely estimating the frequency, in each genome, of each type of genomic organization. We finally discuss the relevance of snoRNA genomic studies for our general understanding of ncRNA family evolution and expression in eukaryotes.

  12. An Interactive Exercise To Learn Eukaryotic Cell Structure and Organelle Function.

    Science.gov (United States)

    Klionsky, Daniel J.; Tomashek, John J.

    1999-01-01

    Describes a cooperative, interactive problem-solving exercise for studying eukaryotic cell structure and function. Highlights the dynamic aspects of movement through the cell. Contains 15 references. (WRM)

  13. Genome-wide computational identification of microRNAs and their targets in the deep-branching eukaryote Giardia lamblia.

    Science.gov (United States)

    Zhang, Yan-Qiong; Chen, Dong-Liang; Tian, Hai-Feng; Zhang, Bao-Hong; Wen, Jian-Fan

    2009-10-01

    Using a combined computational program, we identified 50 potential microRNAs (miRNAs) in Giardia lamblia, one of the most primitive unicellular eukaryotes. These miRNAs are unique to G. lamblia and no homologues have been found in other organisms; miRNAs, currently known in other species, were not found in G. lamblia. This suggests that miRNA biogenesis and miRNA-mediated gene regulation pathway may evolve independently, especially in evolutionarily distant lineages. A majority (43) of the predicted miRNAs are located at one single locus; however, some miRNAs have two or more copies in the genome. Among the 58 miRNA genes, 28 are located in the intergenic regions whereas 30 are present in the anti-sense strands of the protein-coding sequences. Five predicted miRNAs are expressed in G. lamblia trophozoite cells evidenced by expressed sequence tags or RT-PCR. Thirty-seven identified miRNAs may target 50 protein-coding genes, including seven variant-specific surface proteins (VSPs). Our findings provide a clue that miRNA-mediated gene regulation may exist in the early stage of eukaryotic evolution, suggesting that it is an important regulation system ubiquitous in eukaryotes.

  15. Genome-wide Purification of Extrachromosomal Circular DNA from Eukaryotic Cells

    DEFF Research Database (Denmark)

    Møller, Henrik D.; Bojsen, Rasmus Kenneth; Tachibana, Chris

    2016-01-01

    Extrachromosomal circular DNAs (eccDNAs) are common genetic elements in Saccharomyces cerevisiae and are reported in other eukaryotes as well. EccDNAs contribute to genetic variation among somatic cells in multicellular organisms and to evolution of unicellular eukaryotes. Sensitive methods for d...

  16. Genomic evolution of 11 type strains within family Planctomycetaceae.

    Directory of Open Access Journals (Sweden)

    Min Guo

    The species in family Planctomycetaceae are ideal groups for investigating the origin of eukaryotes. Their cells are divided by a lipidic intracytoplasmic membrane and they share a number of eukaryote-like molecular characteristics. However, their genomic structures, potential abilities, and evolutionary status are still unknown. In this study, we searched for common protein families and a core genome/pan genome based on 11 sequenced species in family Planctomycetaceae. Then, we constructed a phylogenetic tree based on their 832 common protein families. We also annotated the 11 genomes using the Clusters of Orthologous Groups database. Moreover, we predicted and reconstructed their core/pan metabolic pathways using the KEGG (Kyoto Encyclopedia of Genes and Genomes) orthology system. Subsequently, we identified genomic islands (GIs) and structural variations (SVs) among the five complete genomes and we specifically investigated the integration of two Planctomycetaceae plasmids in all 11 genomes. The results indicate that Planctomycetaceae species share diverse genomic variations and unique genomic characteristics, as well as have huge potential for human applications.
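
    The core genome/pan genome idea can be sketched with set operations. This assumes that protein families have already been assigned to each genome by some clustering step; the function name `core_and_pan` and the toy family identifiers are hypothetical.

```python
def core_and_pan(families_by_genome):
    """Return (core, pan) sets of protein families.

    core: families present in every genome (intersection).
    pan:  families present in at least one genome (union).
    """
    sets = [set(families) for families in families_by_genome.values()]
    if not sets:
        return set(), set()
    return set.intersection(*sets), set.union(*sets)


if __name__ == "__main__":
    # Toy data: three genomes with made-up family identifiers.
    toy = {
        "genome_A": ["fam1", "fam2", "fam3"],
        "genome_B": ["fam1", "fam2", "fam4"],
        "genome_C": ["fam1", "fam2"],
    }
    core, pan = core_and_pan(toy)
    print(sorted(core), sorted(pan))
```

    In practice the difficult part is the family assignment itself (orthology inference), which this sketch takes as given.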

  17. Maximum likelihood phylogenetic reconstruction from high-resolution whole-genome data and a tree of 68 eukaryotes.

    Science.gov (United States)

    Lin, Yu; Hu, Fei; Tang, Jijun; Moret, Bernard M E

    2013-01-01

    The rapid accumulation of whole-genome data has renewed interest in the study of the evolution of genomic architecture, under such events as rearrangements, duplications, and losses. Comparative genomics, evolutionary biology, and cancer research all require tools to elucidate the mechanisms, history, and consequences of those evolutionary events, while phylogenetics could use whole-genome data to enhance its picture of the Tree of Life. Current approaches in the area of phylogenetic analysis are limited to very small collections of closely related genomes using low-resolution data (typically a few hundred syntenic blocks); moreover, these approaches typically do not include duplication and loss events. We describe a maximum likelihood (ML) approach for phylogenetic analysis that takes into account genome rearrangements as well as duplications, insertions, and losses. Our approach can handle high-resolution genomes (with 40,000 or more markers) and can use in the same analysis genomes with very different numbers of markers. Because our approach uses a standard ML reconstruction program (RAxML), it scales up to large trees. We present the results of extensive testing on both simulated and real data showing that our approach returns very accurate results very quickly. In particular, we analyze a dataset of 68 high-resolution eukaryotic genomes, containing between 3,000 and 42,000 genes, from the eGOB database; the analysis, including bootstrapping, takes just 3 hours on a desktop system and returns a tree in agreement with all well supported branches, while also suggesting resolutions for some disputed placements.

  18. The candidate phylum Poribacteria by single-cell genomics: new insights into phylogeny, cell-compartmentation, eukaryote-like repeat proteins, and other genomic features.

    Directory of Open Access Journals (Sweden)

    Janine Kamke

    The candidate phylum Poribacteria is one of the most dominant and widespread members of the microbial communities residing within marine sponges. Cell compartmentalization had been postulated along with their discovery about a decade ago and their phylogenetic association to the Planctomycetes, Verrucomicrobia, Chlamydiae superphylum was proposed soon thereafter. In the present study we revised these features based on genomic data obtained from six poribacterial single cells. We propose that Poribacteria form a distinct monophyletic phylum contiguous to the PVC superphylum together with other candidate phyla. Our genomic analyses supported the possibility of cell compartmentalization in the form of bacterial microcompartments. Further analyses of eukaryote-like protein domains stressed the importance of such proteins with features including tetratricopeptide repeats, leucine-rich repeats as well as low-density lipoprotein receptor repeats, the latter of which are reported here for the first time from a sponge symbiont. Finally, examining the most abundant protein domain family on poribacterial genomes revealed diverse phyH family proteins, some of which may be related to dissolved organic phosphorus uptake.

  19. Unicellular eukaryotes as models in cell and molecular biology: critical appraisal of their past and future value.

    Science.gov (United States)

    Simon, Martin; Plattner, Helmut

    2014-01-01

    Unicellular eukaryotes have been appreciated as model systems for the analysis of crucial questions in cell and molecular biology. These include Dictyostelium (chemotaxis, amoeboid movement, phagocytosis), Tetrahymena (telomere structure, telomerase function), Paramecium (variant surface antigens, exocytosis, phagocytosis cycle) or both ciliates (ciliary beat regulation, surface pattern formation), Chlamydomonas (flagellar biogenesis and beat), and yeast (S. cerevisiae) for innumerable aspects. Nowadays many problems may be tackled with "higher" eukaryotic/metazoan cells for which full genomic information as well as domain databases, etc., were available long before protozoa. Established molecular tools, commercial antibodies, and established pharmacology are additional advantages available for higher eukaryotic cells. Moreover, an increasing number of inherited genetic disturbances in humans have become elucidated and can serve as new models. Among lower eukaryotes, yeast will remain a standard model because of its peculiarities, including its reduced genome and availability in the haploid form. But do protists still have a future as models? This touches not only the basic understanding of biology but also practical aspects of research, such as fund raising. As we try to scrutinize, due to specific advantages some protozoa should and will remain favorable models for analyzing novel genes or specific aspects of cell structure and function. Outstanding examples are epigenetic phenomena, a field of rising interest. © 2014 Elsevier Inc. All rights reserved.

  20. Genomic minimalism in the early diverging intestinal parasite Giardia lamblia.

    Science.gov (United States)

    Morrison, Hilary G; McArthur, Andrew G; Gillin, Frances D; Aley, Stephen B; Adam, Rodney D; Olsen, Gary J; Best, Aaron A; Cande, W Zacheus; Chen, Feng; Cipriano, Michael J; Davids, Barbara J; Dawson, Scott C; Elmendorf, Heidi G; Hehl, Adrian B; Holder, Michael E; Huse, Susan M; Kim, Ulandt U; Lasek-Nesselquist, Erica; Manning, Gerard; Nigam, Anuranjini; Nixon, Julie E J; Palm, Daniel; Passamaneck, Nora E; Prabhu, Anjali; Reich, Claudia I; Reiner, David S; Samuelson, John; Svard, Staffan G; Sogin, Mitchell L

    2007-09-28

    The genome of the eukaryotic protist Giardia lamblia, an important human intestinal parasite, is compact in structure and content, contains few introns or mitochondrial relics, and has simplified machinery for DNA replication, transcription, RNA processing, and most metabolic pathways. Protein kinases comprise the single largest protein class and reflect Giardia's requirement for a complex signal transduction network for coordinating differentiation. Lateral gene transfer from bacterial and archaeal donors has shaped Giardia's genome, and previously unknown gene families, for example, cysteine-rich structural proteins, have been discovered. Unexpectedly, the genome shows little evidence of heterozygosity, supporting recent speculations that this organism is sexual. This genome sequence will not only be valuable for investigating the evolution of eukaryotes, but will also be applied to the search for new therapeutics for this parasite.

  1. Positive selection for unpreferred codon usage in eukaryotic genomes

    Directory of Open Access Journals (Sweden)

    Galagan James E

    2007-07-01

    Background: Natural selection has traditionally been understood as a force responsible for pushing genes to states of higher translational efficiency, whereas lower translational efficiency has been explained by neutral mutation and genetic drift. We looked for evidence of directional selection resulting in increased unpreferred codon usage (and presumably reduced translational efficiency) in three divergent clusters of eukaryotic genomes using a simple optimal-codon-based metric (Kp/Ku). Results: Here we show that for some genes natural selection is indeed responsible for causing accelerated unpreferred codon substitution, and document the scope of this selection. In Cryptococcus and to a lesser extent Drosophila, we find many genes showing a statistically significant signal of selection for unpreferred codon usage in one or more lineages. We did not find evidence for this type of selection in Saccharomyces. The signal of positive selection observed from unpreferred synonymous codon substitutions is coincident in Cryptococcus and Drosophila with the distribution of upstream open reading frames (uORFs), another genic feature known to reduce translational efficiency. Functional enrichment analysis of genes exhibiting low Kp/Ku ratios reveals that genes in regulatory roles are particularly subject to this type of selection. Conclusion: Through genome-wide scans, we find recent selection for unpreferred codon usage at approximately 1% of genetic loci in Cryptococcus and several genes in Drosophila. Unpreferred codons can impede translation efficiency, and we find that genes with translation-impeding uORFs are enriched for this selection signal. We find that regulatory genes are particularly likely to be subject to selection for unpreferred codon usage. Given that expression noise can propagate through regulatory cascades, and that low translational efficiency can reduce expression noise, this finding supports the hypothesis that translational...

  2. Structural basis for the initiation of eukaryotic transcription-coupled DNA repair.

    Science.gov (United States)

    Xu, Jun; Lahiri, Indrajit; Wang, Wei; Wier, Adam; Cianfrocco, Michael A; Chong, Jenny; Hare, Alissa A; Dervan, Peter B; DiMaio, Frank; Leschziner, Andres E; Wang, Dong

    2017-11-30

    Eukaryotic transcription-coupled repair (TCR) is an important and well-conserved sub-pathway of nucleotide excision repair that preferentially removes DNA lesions from the template strand that block translocation of RNA polymerase II (Pol II). Cockayne syndrome group B (CSB, also known as ERCC6) protein in humans (or its yeast orthologues, Rad26 in Saccharomyces cerevisiae and Rhp26 in Schizosaccharomyces pombe) is among the first proteins to be recruited to the lesion-arrested Pol II during the initiation of eukaryotic TCR. Mutations in CSB are associated with the autosomal-recessive neurological disorder Cockayne syndrome, which is characterized by progeroid features, growth failure and photosensitivity. The molecular mechanism of eukaryotic TCR initiation remains unclear, with several long-standing unanswered questions. How cells distinguish DNA lesion-arrested Pol II from other forms of arrested Pol II, the role of CSB in TCR initiation, and how CSB interacts with the arrested Pol II complex are all unknown. The lack of structures of CSB or the Pol II-CSB complex has hindered our ability to address these questions. Here we report the structure of the S. cerevisiae Pol II-Rad26 complex solved by cryo-electron microscopy. The structure reveals that Rad26 binds to the DNA upstream of Pol II, where it markedly alters its path. Our structural and functional data suggest that the conserved Swi2/Snf2-family core ATPase domain promotes the forward movement of Pol II, and elucidate key roles for Rad26 in both TCR and transcription elongation.

  3. Intermediary metabolism in protists: a sequence-based view of facultative anaerobic metabolism in evolutionarily diverse eukaryotes.

    Science.gov (United States)

    Ginger, Michael L; Fritz-Laylin, Lillian K; Fulton, Chandler; Cande, W Zacheus; Dawson, Scott C

    2010-12-01

    Protists account for the bulk of eukaryotic diversity. Through studies of gene and especially genome sequences the molecular basis for this diversity can be determined. Evident from genome sequencing are examples of versatile metabolism that go far beyond the canonical pathways described for eukaryotes in textbooks. In the last 2-3 years, genome sequencing and transcript profiling have unveiled several examples of heterotrophic and phototrophic protists that are unexpectedly well-equipped for ATP production using a facultative anaerobic metabolism, including some protists that can (Chlamydomonas reinhardtii) or are predicted (Naegleria gruberi, Acanthamoeba castellanii, Amoebidium parasiticum) to produce H2 in their metabolism. It is possible that some enzymes of anaerobic metabolism were acquired and distributed among eukaryotes by lateral transfer, but it is also likely that the common ancestor of eukaryotes already had far more metabolic versatility than was widely thought a few years ago. The discussion of core energy metabolism in unicellular eukaryotes is the subject of this review. Since genomic sequencing has so far only touched the surface of protist diversity, it is anticipated that sequences of additional protists may reveal an even wider range of metabolic capabilities, while simultaneously enriching our understanding of the early evolution of eukaryotes. Copyright © 2010 Elsevier GmbH. All rights reserved.

  4. Camps 2.0: exploring the sequence and structure space of prokaryotic, eukaryotic, and viral membrane proteins.

    Science.gov (United States)

    Neumann, Sindy; Hartmann, Holger; Martin-Galiano, Antonio J; Fuchs, Angelika; Frishman, Dmitrij

    2012-03-01

    Structural bioinformatics of membrane proteins is still in its infancy, and the picture of their fold space is only beginning to emerge. Because only a handful of three-dimensional structures are available, sequence comparison and structure prediction remain the main tools for investigating sequence-structure relationships in membrane protein families. Here we present a comprehensive analysis of the structural families corresponding to α-helical membrane proteins with at least three transmembrane helices. The new version of our CAMPS database (CAMPS 2.0) covers nearly 1300 eukaryotic, prokaryotic, and viral genomes. Using an advanced classification procedure, which is based on high-order hidden Markov models and considers both sequence similarity as well as the number of transmembrane helices and loop lengths, we identified 1353 structurally homogeneous clusters roughly corresponding to membrane protein folds. Only 53 clusters are associated with experimentally determined three-dimensional structures, and for these clusters CAMPS is in reasonable agreement with structure-based classification approaches such as SCOP and CATH. We therefore estimate that ∼1300 structures would need to be determined to provide a sufficient structural coverage of polytopic membrane proteins. CAMPS 2.0 is available at http://webclu.bio.wzw.tum.de/CAMPS2.0/. Copyright © 2011 Wiley Periodicals, Inc.

  5. Extensive gene content variation in the Brachypodium distachyon pan-genome correlates with population structure.

    Science.gov (United States)

    Gordon, Sean P; Contreras-Moreira, Bruno; Woods, Daniel P; Des Marais, David L; Burgess, Diane; Shu, Shengqiang; Stritt, Christoph; Roulin, Anne C; Schackwitz, Wendy; Tyler, Ludmila; Martin, Joel; Lipzen, Anna; Dochy, Niklas; Phillips, Jeremy; Barry, Kerrie; Geuten, Koen; Budak, Hikmet; Juenger, Thomas E; Amasino, Richard; Caicedo, Ana L; Goodstein, David; Davidson, Patrick; Mur, Luis A J; Figueroa, Melania; Freeling, Michael; Catalan, Pilar; Vogel, John P

    2017-12-19

    While prokaryotic pan-genomes have been shown to contain many more genes than any individual organism, the prevalence and functional significance of differentially present genes in eukaryotes remains poorly understood. Whole-genome de novo assembly and annotation of 54 lines of the grass Brachypodium distachyon yield a pan-genome containing nearly twice the number of genes found in any individual genome. Genes present in all lines are enriched for essential biological functions, while genes present in only some lines are enriched for conditionally beneficial functions (e.g., defense and development), display faster evolutionary rates, lie closer to transposable elements and are less likely to be syntenic with orthologous genes in other grasses. Our data suggest that differentially present genes contribute substantially to phenotypic variation within a eukaryote species, these genes have a major influence in population genetics, and transposable elements play a key role in pan-genome evolution.

  6. Three-dimensional structural analysis of eukaryotic flagella/cilia by electron cryo-tomography

    International Nuclear Information System (INIS)

    Bui, Khanh Huy; Pigino, Gaia; Ishikawa, Takashi

    2011-01-01

    Based on the molecular architecture revealed by electron cryo-tomography, the mechanism of the bending motion of eukaryotic flagella/cilia is discussed. Electron cryo-tomography is a potential approach to analyzing the three-dimensional conformation of frozen hydrated biological macromolecules using electron microscopy. Since projections of each individual object illuminated from different orientations are merged, electron tomography is capable of structural analysis of such heterogeneous environments as in vivo or with polymorphism, although radiation damage and the missing wedge are severe problems. Here, recent results on the structure of eukaryotic flagella, which are ATP-driven bending organelles, from the green alga Chlamydomonas are presented. Tomographic analysis reveals asymmetric molecular arrangements, especially that of the dynein motor proteins, in flagella, giving insight into the mechanism of planar asymmetric bending motion. Methodological challenges to obtaining higher-resolution structures from this technique are also discussed.

  7. Eukaryotic acquisition of a bacterial operon

    Science.gov (United States)

    The yeast Saccharomyces cerevisiae is one of the champions of basic biomedical research due to its compact eukaryotic genome and ease of experimental manipulation. Despite these immense strengths, its impact on understanding the genetic basis of natural phenotypic variation has been limited by strai...

  8. Insights into the red algae and eukaryotic evolution from the genome of Porphyra umbilicalis (Bangiophyceae, Rhodophyta).

    Science.gov (United States)

    Brawley, Susan H; Blouin, Nicolas A; Ficko-Blean, Elizabeth; Wheeler, Glen L; Lohr, Martin; Goodson, Holly V; Jenkins, Jerry W; Blaby-Haas, Crysten E; Helliwell, Katherine E; Chan, Cheong Xin; Marriage, Tara N; Bhattacharya, Debashish; Klein, Anita S; Badis, Yacine; Brodie, Juliet; Cao, Yuanyu; Collén, Jonas; Dittami, Simon M; Gachon, Claire M M; Green, Beverley R; Karpowicz, Steven J; Kim, Jay W; Kudahl, Ulrich Johan; Lin, Senjie; Michel, Gurvan; Mittag, Maria; Olson, Bradley J S C; Pangilinan, Jasmyn L; Peng, Yi; Qiu, Huan; Shu, Shengqiang; Singer, John T; Smith, Alison G; Sprecher, Brittany N; Wagner, Volker; Wang, Wenfei; Wang, Zhi-Yong; Yan, Juying; Yarish, Charles; Zäuner-Riek, Simone; Zhuang, Yunyun; Zou, Yong; Lindquist, Erika A; Grimwood, Jane; Barry, Kerrie W; Rokhsar, Daniel S; Schmutz, Jeremy; Stiller, John W; Grossman, Arthur R; Prochnik, Simon E

    2017-08-01

    Porphyra umbilicalis (laver) belongs to an ancient group of red algae (Bangiophyceae), is harvested for human food, and thrives in the harsh conditions of the upper intertidal zone. Here we present the 87.7-Mbp haploid Porphyra genome (65.8% G + C content, 13,125 gene loci) and elucidate traits that inform our understanding of the biology of red algae as one of the few multicellular eukaryotic lineages. Novel features of the Porphyra genome shared by other red algae relate to the cytoskeleton, calcium signaling, the cell cycle, and stress-tolerance mechanisms including photoprotection. Cytoskeletal motor proteins in Porphyra are restricted to a small set of kinesins that appear to be the only universal cytoskeletal motors within the red algae. Dynein motors are absent, and most red algae, including Porphyra, lack myosin. This surprisingly minimal cytoskeleton offers a potential explanation for why red algal cells and multicellular structures are more limited in size than in most multicellular lineages. Additional discoveries further relating to the stress tolerance of bangiophytes include ancestral enzymes for sulfation of the hydrophilic galactan-rich cell wall, evidence for mannan synthesis that originated before the divergence of green and red algae, and a high capacity for nutrient uptake. Our analyses provide a comprehensive understanding of the red algae, which are both commercially important and have played a major role in the evolution of other algal groups through secondary endosymbioses.

  9. Anaerobic energy metabolism in unicellular photosynthetic eukaryotes.

    Science.gov (United States)

    Atteia, Ariane; van Lis, Robert; Tielens, Aloysius G M; Martin, William F

    2013-02-01

    Anaerobic metabolic pathways allow unicellular organisms to tolerate or colonize anoxic environments. Over the past ten years, genome sequencing projects have brought a new light on the extent of anaerobic metabolism in eukaryotes. A surprising development has been that free-living unicellular algae capable of photoautotrophic lifestyle are, in terms of their enzymatic repertoire, among the best equipped eukaryotes known when it comes to anaerobic energy metabolism. Some of these algae are marine organisms, common in the oceans, others are more typically soil inhabitants. All these species are important from the ecological (O2/CO2 budget), biotechnological, and evolutionary perspectives. In the unicellular algae surveyed here, mixed-acid type fermentations are widespread while anaerobic respiration, which is more typical of eukaryotic heterotrophs, appears to be rare. The presence of a core anaerobic metabolism among the algae provides insights into its evolutionary origin, which traces to the eukaryote common ancestor. The predicted fermentative enzymes often exhibit an amino acid extension at the N-terminus, suggesting that these proteins might be compartmentalized in the cell, likely in the chloroplast or the mitochondrion. The green algae Chlamydomonas reinhardtii and Chlorella NC64 have the most extended set of fermentative enzymes reported so far. Among the eukaryotes with secondary plastids, the diatom Thalassiosira pseudonana has the most pronounced anaerobic capabilities as yet. From the standpoints of genomic, transcriptomic, and biochemical studies, anaerobic energy metabolism in C. reinhardtii remains the best characterized among photosynthetic protists. This article is part of a Special Issue entitled: The evolutionary aspects of bioenergetic systems. Copyright © 2012 Elsevier B.V. All rights reserved.

  10. Eukaryotic systematics: a user's guide for cell biologists and parasitologists.

    Science.gov (United States)

    Walker, Giselle; Dorrell, Richard G; Schlacht, Alexander; Dacks, Joel B

    2011-11-01

    Single-celled parasites like Entamoeba, Trypanosoma, Phytophthora and Plasmodium wreak untold havoc on human habitat and health. Understanding the position of the various protistan pathogens in the larger context of eukaryotic diversity informs our study of how these parasites operate on a cellular level, as well as how they have evolved. Here, we review the literature that has brought our understanding of eukaryotic relationships from an idea of parasites as primitive cells to a crystallized view of diversity that encompasses 6 major divisions, or supergroups, of eukaryotes. We provide an updated taxonomic scheme (for 2011), based on extensive genomic, ultrastructural and phylogenetic evidence, with three differing levels of taxonomic detail for ease of referencing and accessibility (see supplementary material at Cambridge Journals On-line). Two of the most pressing issues in cellular evolution, the root of the eukaryotic tree and the evolution of photosynthesis in complex algae, are also discussed along with ideas about what the new generation of genome sequencing technologies may contribute to the field of eukaryotic systematics. We hope that, armed with this user's guide, cell biologists and parasitologists will be encouraged about taking an increasingly evolutionary point of view in the battle against parasites representing real dangers to our livelihoods and lives.

  11. Remarkable interkingdom conservation of intron positions and massive, lineage-specific intron loss and gain in eukaryotic evolution.

    Science.gov (United States)

    Rogozin, Igor B; Wolf, Yuri I; Sorokin, Alexander V; Mirkin, Boris G; Koonin, Eugene V

    2003-09-02

    Sequencing of eukaryotic genomes allows one to address major evolutionary problems, such as the evolution of gene structure. We compared the intron positions in 684 orthologous gene sets from 8 complete genomes of animals, plants, fungi, and protists and constructed parsimonious scenarios of evolution of the exon-intron structure for the respective genes. Approximately one-third of the introns in the malaria parasite Plasmodium falciparum are shared with at least one crown group eukaryote; this number indicates that these introns have been conserved through >1.5 billion years of evolution that separate Plasmodium from the crown group. Paradoxically, humans share many more introns with the plant Arabidopsis thaliana than with the fly or nematode. The inferred evolutionary scenario holds that the common ancestor of Plasmodium and the crown group and, especially, the common ancestor of animals, plants, and fungi had numerous introns. Most of these ancestral introns, which are retained in the genomes of vertebrates and plants, have been lost in fungi, nematodes, arthropods, and probably Plasmodium. In addition, numerous introns have been inserted into vertebrate and plant genes, whereas, in other lineages, intron gain was much less prominent.

  12. Horizontal transfer of a eukaryotic plastid-targeted protein gene to cyanobacteria

    Directory of Open Access Journals (Sweden)

    Keeling Patrick J

    2007-06-01

    Background: Horizontal or lateral transfer of genetic material between distantly related prokaryotes has been shown to play a major role in the evolution of bacterial and archaeal genomes, but exchange of genes between prokaryotes and eukaryotes is not as well understood. In particular, gene flow from eukaryotes to prokaryotes is rarely documented with strong support, which is unusual since prokaryotic genomes appear to readily accept foreign genes. Results: Here, we show that abundant marine cyanobacteria in the related genera Synechococcus and Prochlorococcus acquired a key Calvin cycle/glycolytic enzyme from a eukaryote. Two non-homologous forms of fructose bisphosphate aldolase (FBA) are characteristic of eukaryotes and prokaryotes respectively. However, a eukaryotic gene has been inserted immediately upstream of the ancestral prokaryotic gene in several strains (ecotypes) of Synechococcus and Prochlorococcus. In one lineage this new gene has replaced the ancestral gene altogether. The eukaryotic gene is most closely related to the plastid-targeted FBA from red algae. This eukaryotic-type FBA once replaced the plastid/cyanobacterial type in photosynthetic eukaryotes, hinting at a possible functional advantage in Calvin cycle reactions. The strains that now possess this eukaryotic FBA are scattered across the tree of Synechococcus and Prochlorococcus, perhaps because the gene has been transferred multiple times among cyanobacteria, or more likely because it has been selectively retained only in certain lineages. Conclusion: A gene for plastid-targeted FBA has been transferred from red algae to cyanobacteria, where it has inserted itself beside its non-homologous, functional analogue. Its current distribution in Prochlorococcus and Synechococcus is punctate, suggesting a complex history since its introduction to this group.

  13. Multi-scale coding of genomic information: From DNA sequence to genome structure and function

    International Nuclear Information System (INIS)

    Arneodo, Alain; Vaillant, Cedric; Audit, Benjamin; Argoul, Francoise; D'Aubenton-Carafa, Yves; Thermes, Claude

    2011-01-01

    Understanding how chromatin is spatially and dynamically organized in the nucleus of eukaryotic cells and how this affects genome functions is one of the main challenges of cell biology. Since the different orders of packaging in the hierarchical organization of DNA condition the accessibility of DNA sequence elements to trans-acting factors that control the transcription and replication processes, there is actually a wealth of structural and dynamical information to learn in the primary DNA sequence. In this review, we show that when using concepts, methodologies, numerical and experimental techniques coming from statistical mechanics and nonlinear physics combined with wavelet-based multi-scale signal processing, we are able to decipher the multi-scale sequence encoding of chromatin condensation-decondensation mechanisms that play a fundamental role in regulating many molecular processes involved in nuclear functions.

  14. INVESTIGATIONS INTO MOLECULAR PATHWAYS IN THE POST GENOME ERA: CROSS SPECIES COMPARATIVE GENOMICS APPROACH

    Science.gov (United States)

    Genome sequencing efforts in the past decade were aimed at generating draft sequences of many prokaryotic and eukaryotic model organisms. Successful completion of unicellular eukaryotes, worm, fly and human genome have opened up the new field of molecular biology and function...

  15. Multiple roles of genome-attached bacteriophage terminal proteins

    International Nuclear Information System (INIS)

    Redrejo-Rodríguez, Modesto; Salas, Margarita

    2014-01-01

    Protein-primed replication constitutes a generalized mechanism to initiate DNA or RNA synthesis in linear genomes, including viruses, gram-positive bacteria, linear plasmids and mobile elements. By this mechanism a specific amino acid primes replication and becomes covalently linked to the genome ends. Despite the fact that TPs lack sequence homology, they share a similar structural arrangement, with the priming residue in the C-terminal half of the protein and an accumulation of positively charged residues at the N-terminal end. In addition, various bacteriophage TPs have been shown to have DNA-binding capacity that targets TPs and their attached genomes to the host nucleoid. Furthermore, a number of bacteriophage TPs from different viral families and with diverse hosts also contain putative nuclear localization signals and localize in the eukaryotic nucleus, which could lead to the transport of the attached DNA. This suggests a possible role of bacteriophage TPs in prokaryote-to-eukaryote horizontal gene transfer. - Highlights: • Protein-primed genome replication constitutes a strategy to initiate DNA or RNA synthesis in linear genomes. • Bacteriophage terminal proteins (TPs) are covalently attached to viral genomes by their primary function priming DNA replication. • TPs are also DNA-binding proteins and target phage genomes to the host nucleoid. • TPs can also localize in the eukaryotic nucleus and may have a role in phage-mediated interkingdom gene transfer

  16. Multiple roles of genome-attached bacteriophage terminal proteins

    Energy Technology Data Exchange (ETDEWEB)

    Redrejo-Rodríguez, Modesto; Salas, Margarita, E-mail: msalas@cbm.csic.es

    2014-11-15

    Protein-primed replication constitutes a generalized mechanism to initiate DNA or RNA synthesis in linear genomes, including viruses, gram-positive bacteria, linear plasmids and mobile elements. By this mechanism a specific amino acid primes replication and becomes covalently linked to the genome ends. Despite the fact that TPs lack sequence homology, they share a similar structural arrangement, with the priming residue in the C-terminal half of the protein and an accumulation of positively charged residues at the N-terminal end. In addition, various bacteriophage TPs have been shown to have DNA-binding capacity that targets TPs and their attached genomes to the host nucleoid. Furthermore, a number of bacteriophage TPs from different viral families and with diverse hosts also contain putative nuclear localization signals and localize in the eukaryotic nucleus, which could lead to the transport of the attached DNA. This suggests a possible role of bacteriophage TPs in prokaryote-to-eukaryote horizontal gene transfer. - Highlights: • Protein-primed genome replication constitutes a strategy to initiate DNA or RNA synthesis in linear genomes. • Bacteriophage terminal proteins (TPs) are covalently attached to viral genomes by their primary function priming DNA replication. • TPs are also DNA-binding proteins and target phage genomes to the host nucleoid. • TPs can also localize in the eukaryotic nucleus and may have a role in phage-mediated interkingdom gene transfer.

  17. SINEs, evolution and genome structure in the opossum.

    Science.gov (United States)

    Gu, Wanjun; Ray, David A; Walker, Jerilyn A; Barnes, Erin W; Gentles, Andrew J; Samollow, Paul B; Jurka, Jerzy; Batzer, Mark A; Pollock, David D

    2007-07-01

    Short INterspersed Elements (SINEs) are non-autonomous retrotransposons, usually between 100 and 500 base pairs (bp) in length, which are ubiquitous components of eukaryotic genomes. Their activity, distribution, and evolution can be highly informative on genomic structure and evolutionary processes. To determine recent activity, we amplified more than one hundred SINE1 loci in a panel of 43 M. domestica individuals derived from five diverse geographic locations. The SINE1 family has expanded recently enough that many loci were polymorphic, and the SINE1 insertion-based genetic distances among populations reflected geographic distance. Genome-wide comparisons of SINE1 densities and GC content revealed that high SINE1 density is associated with high GC content in a few long and many short spans. Young SINE1s, whether fixed or polymorphic, showed an unbiased GC content preference for insertion, indicating that the GC preference accumulates over long time periods, possibly in periodic bursts. SINE1 evolution is thus broadly similar to human Alu evolution, although it has an independent origin. High GC content adjacent to SINE1s is strongly correlated with bias towards higher AT to GC substitutions and lower GC to AT substitutions. This is consistent with biased gene conversion, and also indicates that like chickens, but unlike eutherian mammals, GC content heterogeneity (isochore structure) is reinforced by substitution processes in the M. domestica genome. Nevertheless, both high and low GC content regions are apparently headed towards lower GC content equilibria, possibly due to a relative shift to lower recombination rates in the recent Monodelphis ancestral lineage. Like eutherians, metatherian (marsupial) mammals have evolved high CpG substitution rates, but this is apparently a convergence in process rather than a shared ancestral state.

  18. Lateral gene transfer between prokaryotes and multicellular eukaryotes: ongoing and significant?

    NARCIS (Netherlands)

    Ros, V.I.D.; Hurst, G.D.D.

    2009-01-01

    The expansion of genome sequencing projects has produced accumulating evidence for lateral transfer of genes between prokaryotic and eukaryotic genomes. However, it remains controversial whether these genes are of functional importance in their recipient host. Nikoh and Nakabachi, in a recent paper

  19. GOBASE: an organelle genome database

    OpenAIRE

    O'Brien, Emmet A.; Zhang, Yue; Wang, Eric; Marie, Veronique; Badejoko, Wole; Lang, B. Franz; Burger, Gertraud

    2008-01-01

    The organelle genome database GOBASE, now in its 21st release (June 2008), contains all published mitochondrion-encoded sequences (~913 000) and chloroplast-encoded sequences (~250 000) from a wide range of eukaryotic taxa. For all sequences, information on related genes, exons, introns, gene products and taxonomy is available, as well as selected genome maps and RNA secondary structures. Recent major enhancements to database functionality include: (i) addition of an interface for RNA editing...

  20. The Eukaryotic Pathogen Databases: a functional genomic resource integrating data from human and veterinary parasites.

    Science.gov (United States)

    Harb, Omar S; Roos, David S

    2015-01-01

    Over the past 20 years, advances in high-throughput biological techniques and the availability of computational resources including fast Internet access have resulted in an explosion of large genome-scale data sets "big data." While such data are readily available for download and personal use and analysis from a variety of repositories, often such analysis requires access to seldom-available computational skills. As a result a number of databases have emerged to provide scientists with online tools enabling the interrogation of data without the need for sophisticated computational skills beyond basic knowledge of Internet browser utility. This chapter focuses on the Eukaryotic Pathogen Databases (EuPathDB: http://eupathdb.org) Bioinformatic Resource Center (BRC) and illustrates some of the available tools and methods.

  1. Evidence of ancient genome reduction in red algae (Rhodophyta).

    Science.gov (United States)

    Qiu, Huan; Price, Dana C; Yang, Eun Chan; Yoon, Hwan Su; Bhattacharya, Debashish

    2015-08-01

    Red algae (Rhodophyta) comprise a monophyletic eukaryotic lineage of ~6,500 species with a fossil record that extends back 1.2 billion years. A surprising aspect of red algal evolution is that sequenced genomes encode a relatively limited gene inventory (~5-10 thousand genes) when compared with other free-living algae or to other eukaryotes. This suggests that the common ancestor of red algae may have undergone extensive genome reduction, which can result from lineage specialization to a symbiotic or parasitic lifestyle or adaptation to an extreme or oligotrophic environment. We gathered genome and transcriptome data from a total of 14 red algal genera that represent the major branches of this phylum to study genome evolution in Rhodophyta. Analysis of orthologous gene gains and losses identifies two putative major phases of genome reduction: (i) in the stem lineage leading to all red algae resulting in the loss of major functions such as flagellae and basal bodies, the glycosyl-phosphatidylinositol anchor biosynthesis pathway, and the autophagy regulation pathway; and (ii) in the common ancestor of the extremophilic Cyanidiophytina. Red algal genomes are also characterized by the recruitment of hundreds of bacterial genes through horizontal gene transfer that have taken on multiple functions in shared pathways and have replaced eukaryotic gene homologs. Our results suggest that Rhodophyta may trace their origin to a gene depauperate ancestor. Unlike plants, it appears that a limited gene inventory is sufficient to support the diversification of a major eukaryote lineage that possesses sophisticated multicellular reproductive structures and an elaborate triphasic sexual cycle. © 2015 Phycological Society of America.

  2. Uncoupling of Sister Replisomes during Eukaryotic DNA Replication

    NARCIS (Netherlands)

    Yardimci, Hasan; Loveland, Anna B.; Habuchi, Satoshi; van Oijen, Antoine M.; Walter, Johannes C.

    2010-01-01

    The duplication of eukaryotic genomes involves the replication of DNA from multiple origins of replication. In S phase, two sister replisomes assemble at each active origin, and they replicate DNA in opposite directions. Little is known about the functional relationship between sister replisomes.

  3. Archaeal Genome Guardians Give Insights into Eukaryotic DNA Replication and Damage Response Proteins

    Directory of Open Access Journals (Sweden)

    David S. Shin

    2014-01-01

    As the third domain of life, archaea, like the eukarya and bacteria, must have robust DNA replication and repair complexes to ensure genome fidelity. Archaea moreover display a breadth of unique habitats and characteristics, and structural biologists increasingly appreciate these features. As archaea include extremophiles that can withstand diverse environmental stresses, they provide fundamental systems for understanding enzymes and pathways critical to genome integrity and stress responses. Such archaeal extremophiles provide critical data on the periodic table for life as well as on the biochemical, geochemical, and physical limitations to adaptive strategies allowing organisms to thrive under environmental stress relevant to determining the boundaries for life as we know it. Specifically, archaeal enzyme structures have informed the architecture and mechanisms of key DNA repair proteins and complexes. With added abilities to temperature-trap flexible complexes and reveal core domains of transient and dynamic complexes, these structures provide insights into mechanisms of maintaining genome integrity despite extreme environmental stress. The DNA damage response protein structures noted in this review therefore inform the basis for genome integrity in the face of environmental stress, with implications for all domains of life as well as for biomanufacturing, astrobiology, and medicine.

  4. Radiotaxons and reliability of a genome

    International Nuclear Information System (INIS)

    Korogodin, V.I.

    1982-01-01

    Radiosensitivity of cells ($D_0$) is considered with regard to the structural organization of the genome. The following terms are introduced: "karyotaxon", organisms with identical structural organization of the genome, and "specific genome stability" $K = D_0 C$, where $C$ is the quantity of DNA in the cell nucleus; $K$ is the amount of energy (eV) the sorption of which in DNA is necessary and sufficient for one elementary damage to occur. It was shown that $K_i = \mathrm{const}$ within every karyotaxon $i$. $K_1 = 100$ eV for viruses, and $K_4 = 61000$ eV for the highest level of genome organization (diploid eukaryotes including man). Potential mechanisms of increasing $K_i$ with increasing level of genome organization and the role of this factor in evolution are discussed.

  5. Atypical mitochondrial inheritance patterns in eukaryotes.

    Science.gov (United States)

    Breton, Sophie; Stewart, Donald T

    2015-10-01

    Mitochondrial DNA (mtDNA) is predominantly maternally inherited in eukaryotes. Diverse molecular mechanisms underlying the phenomenon of strict maternal inheritance (SMI) of mtDNA have been described, but the evolutionary forces responsible for its predominance in eukaryotes remain to be elucidated. Exceptions to SMI have been reported in diverse eukaryotic taxa, leading to the prediction that several distinct molecular mechanisms controlling mtDNA transmission are present among the eukaryotes. We propose that these mechanisms will be better understood by studying the deviations from the predominating pattern of SMI. This minireview summarizes studies on eukaryote species with unusual or rare mitochondrial inheritance patterns, i.e., other than the predominant SMI pattern, such as maternal inheritance of stable heteroplasmy, paternal leakage of mtDNA, biparental and strictly paternal inheritance, and doubly uniparental inheritance of mtDNA. The potential genes and mechanisms involved in controlling mitochondrial inheritance in these organisms are discussed. The linkage between mitochondrial inheritance and sex determination is also discussed, given that the atypical systems of mtDNA inheritance examined in this minireview are frequently found in organisms with uncommon sexual systems such as gynodioecy, monoecy, or andromonoecy. The potential of deviations from SMI for facilitating a better understanding of a number of fundamental questions in biology, such as the evolution of mtDNA inheritance, the coevolution of nuclear and mitochondrial genomes, and, perhaps, the role of mitochondria in sex determination, is considerable.

  6. The Chlamydomonas Genome Reveals the Evolution of Key Animal and Plant Functions

    Energy Technology Data Exchange (ETDEWEB)

    Merchant, Sabeeha S

    2007-04-09

    Chlamydomonas reinhardtii is a unicellular green alga whose lineage diverged from land plants over 1 billion years ago. It is a model system for studying chloroplast-based photosynthesis, as well as the structure, assembly, and function of eukaryotic flagella (cilia), which were inherited from the common ancestor of plants and animals, but lost in land plants. We sequenced the 120-megabase nuclear genome of Chlamydomonas and performed comparative phylogenomic analyses, identifying genes encoding uncharacterized proteins that are likely associated with the function and biogenesis of chloroplasts or eukaryotic flagella. Analyses of the Chlamydomonas genome advance our understanding of the ancestral eukaryotic cell, reveal previously unknown genes associated with photosynthetic and flagellar functions, and establish links between ciliopathy and the composition and function of flagella.

  7. Insight into structure and assembly of the nuclear pore complex by utilizing the genome of a eukaryotic thermophile

    DEFF Research Database (Denmark)

    Amlacher, Stefan; Sarges, Phillip; Flemming, Dirk

    2011-01-01

    is composed of two large Nups, Nup192 and Nup170, which are flexibly bridged by short linear motifs made up of linker Nups, Nic96 and Nup53. This assembly illustrates how Nup interactions can generate structural plasticity within the NPC scaffold. Our findings therefore demonstrate the utility of the genome...

  8. Pervasive, Genome-Wide Transcription in the Organelle Genomes of Diverse Plastid-Bearing Protists

    Directory of Open Access Journals (Sweden)

    Matheus Sanitá Lima

    2017-11-01

    Organelle genomes are among the most sequenced kinds of chromosome. This is largely because they are small and widely used in molecular studies, but also because next-generation sequencing technologies made sequencing easier, faster, and cheaper. However, studies of organelle RNA have not kept pace with those of DNA, despite huge amounts of freely available eukaryotic RNA-sequencing (RNA-seq) data. Little is known about organelle transcription in nonmodel species, and most of the available eukaryotic RNA-seq data have not been mined for organelle transcripts. Here, we use publicly available RNA-seq experiments to investigate organelle transcription in 30 diverse plastid-bearing protists with varying organelle genomic architectures. Mapping RNA-seq data to organelle genomes revealed pervasive, genome-wide transcription, regardless of the taxonomic grouping, gene organization, or noncoding content. For every species analyzed, transcripts covered ≥85% of the mitochondrial and/or plastid genomes (all of which were ≤105 kb), indicating that most of the organelle DNA, coding and noncoding, is transcriptionally active. These results follow earlier studies of model species showing that organellar transcription is coupled and ubiquitous across the genome, requiring significant downstream processing of polycistronic transcripts. Our findings suggest that noncoding organelle DNA can be transcriptionally active, raising questions about the underlying function of these transcripts and underscoring the utility of publicly available RNA-seq data for recovering complete genome sequences. If pervasive transcription is also found in bigger organelle genomes (>105 kb) and across a broader range of eukaryotes, this could indicate that noncoding organelle RNAs are regulating fundamental processes within eukaryotic cells.

  9. Structural-Functional Organization of the Eukaryotic Cell Nucleus and Transcription Regulation: Introduction to This Special Issue of Biochemistry (Moscow).

    Science.gov (United States)

    Razin, S V

    2018-04-01

    This issue of Biochemistry (Moscow) is devoted to the cell nucleus and mechanisms of transcription regulation. Over the years, biochemical processes in the cell nucleus have been studied in isolation, outside the context of their spatial organization. Now it is clear that segregation of functional processes within a compartmentalized cell nucleus is very important for the implementation of basic genetic processes. The functional compartmentalization of the cell nucleus is closely related to the spatial organization of the genome, which in turn plays a key role in the operation of epigenetic mechanisms. In this issue of Biochemistry (Moscow), we present a selection of review articles covering the functional architecture of the eukaryotic cell nucleus, the mechanisms of genome folding, the role of stochastic processes in establishing 3D architecture of the genome, and the impact of genome spatial organization on transcription regulation.

  10. Phylogenetic analysis of P5 P-type ATPases, a eukaryotic lineage of secretory pathway pumps

    DEFF Research Database (Denmark)

    Møller, Annette; Asp, Torben; Holm, Preben Bach

    2008-01-01

    Eukaryotes encompass a remarkable variety of organisms and unresolved lineages. Different phylogenetic analyses have led to conflicting conclusions as to the origin and associations between lineages and species. In this work, we investigated the evolutionary relationships of a family of cation pumps exclusive to the secretory pathway of eukaryotes by combining the identification of lineage-specific genes with phylogenetic evolution of common genes. Sequences of P5 ATPases, which are regarded to be cation pumps in the endoplasmic reticulum (ER), were identified in all eukaryotic lineages but not in any prokaryotic genome. Based on a protein alignment we could group the P5 ATPases into two subfamilies, P5A and P5B, that, based on the number of negative charges in conserved trans-membrane segment 4, are likely to have different ion specificities. P5A ATPases are present in all eukaryotic genomes sequenced so...

  11. The ARTT motif and a unified structural understanding of substrate recognition in ADP ribosylating bacterial toxins and eukaryotic ADP-ribosyltransferases

    Energy Technology Data Exchange (ETDEWEB)

    Han, S.; Tainer, J.A.

    2001-08-01

    ADP-ribosylation is a widely occurring and biologically critical covalent chemical modification process in pathogenic mechanisms, intracellular signaling systems, DNA repair, and cell division. The reaction is catalyzed by ADP-ribosyltransferases, which transfer the ADP-ribose moiety of NAD to a target protein with nicotinamide release. A family of bacterial toxins and eukaryotic enzymes has been termed the mono-ADP-ribosyltransferases, in distinction to the poly-ADP-ribosyltransferases, which catalyze the addition of multiple ADP-ribose groups to the carboxyl terminus of eukaryotic nucleoproteins. Despite the limited primary sequence homology among the different ADP-ribosyltransferases, a central cleft bearing an NAD-binding pocket formed by the two perpendicular β-sheet core has been remarkably conserved between bacterial toxins and eukaryotic mono- and poly-ADP-ribosyltransferases. The majority of bacterial toxins and eukaryotic mono-ADP-ribosyltransferases are characterized by conserved His and catalytic Glu residues. In contrast, Diphtheria toxin, Pseudomonas exotoxin A, and eukaryotic poly-ADP-ribosyltransferases are characterized by conserved Arg and catalytic Glu residues. The NAD-binding core of a binary toxin and a C3-like toxin family identified an ARTT motif (ADP-ribosylating turn-turn motif) that is implicated in substrate specificity and recognition by structural and mutagenic studies. Here we apply structure-based sequence alignment and comparative structural analyses of all known structures of ADP-ribosyltransferases to suggest that this ARTT motif is functionally important in many ADP-ribosylating enzymes that bear an NAD-binding cleft as characterized by conserved Arg and catalytic Glu residues. Overall, structure-based sequence analysis reveals common core structures and conserved active sites of ADP-ribosyltransferases to support similar NAD binding mechanisms but differing mechanisms of target protein binding via sequence variations within the ARTT

  12. Genome Improvement at JGI-HAGSC

    Energy Technology Data Exchange (ETDEWEB)

    Grimwood, Jane; Schmutz, Jeremy J.; Myers, Richard M.

    2012-03-03

    Since the completion of the sequencing of the human genome, the Joint Genome Institute (JGI) has rapidly expanded its scientific goals in several DOE mission-relevant areas. At the JGI-HAGSC, we have kept pace with this rapid expansion of projects with our focus on assessing, assembling, improving and finishing eukaryotic whole genome shotgun (WGS) projects for which the shotgun sequence is generated at the Production Genomic Facility (JGI-PGF). We follow this by combining the draft WGS with genomic resources generated at JGI-HAGSC or in collaborator laboratories (including BAC end sequences, genetic maps and FLcDNA sequences) to produce an improved draft sequence. For eukaryotic genomes important to the DOE mission, we then add further information from directed experiments to produce reference genomic sequences that are publicly available for any scientific researcher. Also, we have continued our program for producing BAC-based finished sequence, both for adding information to JGI genome projects and for small BAC-based sequencing projects proposed through any of the JGI sequencing programs. We have now built our computational expertise in WGS assembly and analysis and have moved eukaryotic genome assembly from the JGI-PGF to JGI-HAGSC. We have concentrated our assembly development work on large plant genomes and complex fungal and algal genomes.

  13. Diversity of Eukaryotic Translational Initiation Factor eIF4E in Protists.

    Science.gov (United States)

    Jagus, Rosemary; Bachvaroff, Tsvetan R; Joshi, Bhavesh; Place, Allen R

    2012-01-01

    The greatest diversity of eukaryotic species is within the microbial eukaryotes, the protists, with plants and fungi/metazoa representing just two of the estimated seventy five lineages of eukaryotes. Protists are a diverse group characterized by unusual genome features and a wide range of genome sizes from 8.2 Mb in the apicomplexan parasite Babesia bovis to 112,000-220,050 Mb in the dinoflagellate Prorocentrum micans. Protists possess numerous cellular, molecular and biochemical traits not observed in "text-book" model organisms. These features challenge some of the concepts and assumptions about the regulation of gene expression in eukaryotes. Like multicellular eukaryotes, many protists encode multiple eIF4Es, but few functional studies have been undertaken except in parasitic species. An earlier phylogenetic analysis of protist eIF4Es indicated that they cannot be grouped within the three classes that describe eIF4E family members from multicellular organisms. Many more protist sequences are now available from which three clades can be recognized that are distinct from the plant/fungi/metazoan classes. Understanding of the protist eIF4Es will be facilitated as more sequences become available particularly for the under-represented opisthokonts and amoebozoa. Similarly, a better understanding of eIF4Es within each clade will develop as more functional studies of protist eIF4Es are completed.

  14. On the Diversification of the Translation Apparatus across Eukaryotes

    Directory of Open Access Journals (Sweden)

    Greco Hernández

    2012-01-01

    Diversity is one of the most remarkable features of living organisms. Current assessments of eukaryote biodiversity reach 1.5 million species, but the true figure could be several times that number. Diversity is ingrained in all stages and echelons of life, namely, the occupancy of ecological niches, behavioral patterns, body plans and organismal complexity, as well as metabolic needs and genetics. In this review, we discuss how diversity also exists in a key biochemical process, translation, across eukaryotes. Translation is a fundamental process for all forms of life, and the basic components and mechanisms of translation in eukaryotes have been largely established upon the study of traditional, so-called model organisms. By using modern genome-wide, high-throughput technologies, recent studies of many nonmodel eukaryotes have unveiled a surprising diversity in the configuration of the translation apparatus across eukaryotes, showing that this apparatus is far from being evolutionarily static. For some of the components of this machinery, functional differences between different species have also been found. The recent research reviewed in this article highlights the molecular and functional diversification the translational machinery has undergone during eukaryotic evolution. A better understanding of all aspects of organismal diversity is key to a more profound knowledge of life.

  15. Pervasive, Genome-Wide Transcription in the Organelle Genomes of Diverse Plastid-Bearing Protists.

    Science.gov (United States)

    Sanitá Lima, Matheus; Smith, David Roy

    2017-11-06

    Organelle genomes are among the most sequenced kinds of chromosome. This is largely because they are small and widely used in molecular studies, but also because next-generation sequencing technologies made sequencing easier, faster, and cheaper. However, studies of organelle RNA have not kept pace with those of DNA, despite huge amounts of freely available eukaryotic RNA-sequencing (RNA-seq) data. Little is known about organelle transcription in nonmodel species, and most of the available eukaryotic RNA-seq data have not been mined for organelle transcripts. Here, we use publicly available RNA-seq experiments to investigate organelle transcription in 30 diverse plastid-bearing protists with varying organelle genomic architectures. Mapping RNA-seq data to organelle genomes revealed pervasive, genome-wide transcription, regardless of the taxonomic grouping, gene organization, or noncoding content. For every species analyzed, transcripts covered ≥85% of the mitochondrial and/or plastid genomes (all of which were ≤105 kb), indicating that most of the organelle DNA (coding and noncoding) is transcriptionally active. These results follow earlier studies of model species showing that organellar transcription is coupled and ubiquitous across the genome, requiring significant downstream processing of polycistronic transcripts. Our findings suggest that noncoding organelle DNA can be transcriptionally active, raising questions about the underlying function of these transcripts and underscoring the utility of publicly available RNA-seq data for recovering complete genome sequences. If pervasive transcription is also found in bigger organelle genomes (>105 kb) and across a broader range of eukaryotes, this could indicate that noncoding organelle RNAs are regulating fundamental processes within eukaryotic cells. Copyright © 2017 Sanitá Lima and Smith.
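
    The "transcripts covered ≥85% of the genome" figure in this abstract is a breadth-of-coverage statistic. The following is a minimal sketch of how such a fraction could be computed from mapped read intervals; it is not the authors' pipeline. The function name `breadth_of_coverage`, the 0-based end-exclusive interval convention, and the toy numbers are assumptions made for illustration.

```python
def breadth_of_coverage(genome_length, alignments):
    """Fraction of genome positions covered by at least one aligned read.

    alignments: iterable of (start, end) pairs, 0-based and end-exclusive
    (an assumed convention; real mappers report coordinates in other forms).
    """
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")

    # Clip every interval to the genome, then sort so overlaps merge in one pass.
    clipped = sorted(
        (max(0, start), min(genome_length, end)) for start, end in alignments
    )

    covered = 0
    cur_start = cur_end = None
    for start, end in clipped:
        if end <= start:
            continue  # empty or fully outside the genome after clipping
        if cur_end is None or start > cur_end:
            # A gap: close the previous merged block and open a new one.
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent: extend the current merged block.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start

    return covered / genome_length


# Toy example: a hypothetical 100-bp genome with two overlapping reads.
fraction = breadth_of_coverage(100, [(0, 50), (40, 90)])
is_pervasive = fraction >= 0.85  # threshold taken from the abstract's >=85% figure
```

    Merging sorted intervals keeps the calculation independent of read depth, so a region counted once is covered regardless of how many reads map to it.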

  16. An algorithm for detecting eukaryotic sequences in metagenomic ...

    Indian Academy of Sciences (India)

    species but also from accidental contamination from the genome of eukaryotic host cells. The latter scenario generally occurs in the case of host-associated metagenomes, e.g. microbes living in human gut. In such cases, one needs to identify and remove contaminating host DNA sequences, since the latter sequences will ...

  17. Solution structure of an archaeal DNA binding protein with an eukaryotic zinc finger fold.

    Directory of Open Access Journals (Sweden)

    Florence Guillière

    While the basal transcription machinery in archaea is eukaryal-like, transcription factors in archaea and their viruses are usually related to bacterial transcription factors. Nevertheless, some of these organisms show predicted classical zinc finger motifs of the C2H2 type, which are almost exclusively found in proteins of eukaryotes and most often associated with transcription regulators. In this work, we focused on the protein AFV1p06 from the hyperthermophilic archaeal virus AFV1. The sequence of the protein consists of the classical eukaryotic C2H2 motif with the fourth histidine coordinating zinc missing, as well as of N- and C-terminal extensions. We showed that the protein AFV1p06 binds zinc and solved its solution structure by NMR. AFV1p06 displays a zinc finger fold with a novel structure extension and disordered N- and C-termini. Structure calculations show that a glutamic acid residue that coordinates zinc replaces the fourth histidine of the C2H2 motif. Electromobility gel shift assays indicate that the protein binds to DNA with different affinities depending on the DNA sequence. AFV1p06 is the first experimentally characterised archaeal zinc finger protein with a DNA binding activity. The AFV1p06 protein family has homologues in diverse viruses of hyperthermophilic archaea. A phylogenetic analysis points to a common origin of archaeal and eukaryotic C2H2 zinc fingers.

  18. xGDBvm: A Web GUI-Driven Workflow for Annotating Eukaryotic Genomes in the Cloud.

    Science.gov (United States)

    Duvick, Jon; Standage, Daniel S; Merchant, Nirav; Brendel, Volker P

    2016-04-01

    Genome-wide annotation of gene structure requires the integration of numerous computational steps. Currently, annotation is arguably best accomplished through collaboration of bioinformatics and domain experts, with broad community involvement. However, such a collaborative approach is not scalable at today's pace of sequence generation. To address this problem, we developed the xGDBvm software, which uses an intuitive graphical user interface to access a number of common genome analysis and gene structure tools, preconfigured in a self-contained virtual machine image. Once their virtual machine instance is deployed through iPlant's Atmosphere cloud services, users access the xGDBvm workflow via a unified Web interface to manage inputs, set program parameters, configure links to high-performance computing (HPC) resources, view and manage output, apply analysis and editing tools, or access contextual help. The xGDBvm workflow will mask the genome, compute spliced alignments from transcript and/or protein inputs (locally or on a remote HPC cluster), predict gene structures and gene structure quality, and display output in a public or private genome browser complete with accessory tools. Problematic gene predictions are flagged and can be reannotated using the integrated yrGATE annotation tool. xGDBvm can also be configured to append or replace existing data or load precomputed data. Multiple genomes can be annotated and displayed, and outputs can be archived for sharing or backup. xGDBvm can be adapted to a variety of use cases including de novo genome annotation, reannotation, comparison of different annotations, and training or teaching. © 2016 American Society of Plant Biologists. All rights reserved.

  19. National Academy of Sciences and Academy of Sciences of the USSR workshop on structure of the eucaryotic genome and regulation of its expression

    Energy Technology Data Exchange (ETDEWEB)

    1990-01-01

    This report provides a brief overview of the Workshop on Structure of the Eukaryotic Genome and Regulation of its Expression held in Tbilisi, Georgia, USSR. The report describes the presentations made at the meeting but also goes on to describe the state of molecular biology and genetics research in the Soviet Union and makes recommendations on how to improve future such meetings.

  20. In silico ionomics segregates parasitic from free-living eukaryotes.

    Science.gov (United States)

    Greganova, Eva; Steinmann, Michael; Mäser, Pascal; Fankhauser, Niklaus

    2013-01-01

    Ion transporters are fundamental to life. Due to their ancient origin and conservation in sequence, ion transporters are also particularly well suited for comparative genomics of distantly related species. Here, we perform genome-wide ion transporter profiling as a basis for comparative genomics of eukaryotes. From a given predicted proteome, we identify all bona fide ion channels, ion porters, and ion pumps. Concentrating on unicellular eukaryotes (n = 37), we demonstrate that clustering of species according to their repertoire of ion transporters segregates obligate endoparasites (n = 23) on the one hand, from free-living species and facultative parasites (n = 14) on the other hand. This surprising finding indicates strong convergent evolution of the parasites regarding the acquisition and homeostasis of inorganic ions. Random forest classification identifies transporters of ammonia, plus transporters of iron and other transition metals, as the most informative for distinguishing the obligate parasites. Thus, in silico ionomics further underscores the importance of iron in infection biology and suggests access to host sources of nitrogen and transition metals to be selective forces in the evolution of parasitism. This finding is in agreement with the phenomenon of iron withholding as a primordial antimicrobial strategy of infected mammals.
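
    Clustering species by their ion transporter repertoire starts from a pairwise dissimilarity between repertoires. The abstract does not state which measure or clustering method was used, so the sketch below simply assumes Jaccard distance on presence/absence sets. The species labels and transporter family names are hypothetical placeholders, not data from the study.

```python
def jaccard_distance(a, b):
    """Jaccard distance between two collections treated as sets (0 = identical)."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # two empty repertoires are treated as identical
    return 1.0 - len(a & b) / len(union)


def distance_matrix(repertoires):
    """Return (sorted species names, square matrix of pairwise Jaccard distances)."""
    names = sorted(repertoires)
    matrix = [
        [jaccard_distance(repertoires[x], repertoires[y]) for y in names]
        for x in names
    ]
    return names, matrix


# Hypothetical presence/absence repertoires (family names are placeholders).
repertoires = {
    "parasite_A": {"iron_transporter", "K_channel"},
    "parasite_B": {"iron_transporter", "K_channel", "Ca_pump"},
    "free_living_C": {"ammonia_transporter", "K_channel", "Ca_pump", "Na_porter"},
}
names, matrix = distance_matrix(repertoires)
```

    The resulting matrix could be passed to any hierarchical clustering routine; which routine fits the published analysis is not specified in the abstract.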

  1. A Taste of Algal Genomes from the Joint Genome Institute

    Energy Technology Data Exchange (ETDEWEB)

    Kuo, Alan; Grigoriev, Igor

    2012-06-17

    Algae play profound roles in aquatic food chains and the carbon cycle, can impose health and economic costs through toxic blooms, provide models for the study of symbiosis, photosynthesis, and eukaryotic evolution, and are candidate sources for bio-fuels; all of these research areas are part of the mission of DOE's Joint Genome Institute (JGI). To date JGI has sequenced, assembled, annotated, and released to the public the genomes of 18 species and strains of algae, sampling almost all of the major clades of photosynthetic eukaryotes. With more algal genomes currently undergoing analysis, JGI continues its commitment to driving forward basic and applied algal science. Among these ongoing projects are the pan-genome of the dominant coccolithophore Emiliania huxleyi, the interrelationships between the 4 genomes in the nucleomorph-containing Bigelowiella natans and Guillardia theta, and the search for symbiosis genes of lichens.

  2. Functional Coverage of the Human Genome by Existing Structures, Structural Genomics Targets, and Homology Models.

    Directory of Open Access Journals (Sweden)

    2005-08-01

    The bias in protein structure and function space resulting from experimental limitations and targeting of particular functional classes of proteins by structural biologists has long been recognized, but never continuously quantified. Using the Enzyme Commission and the Gene Ontology classifications as a reference frame, and integrating structure data from the Protein Data Bank (PDB), target sequences from the structural genomics projects, structure homology derived from the SUPERFAMILY database, and genome annotations from Ensembl and NCBI, we provide a quantified view, both at the domain and whole-protein levels, of the current and projected coverage of protein structure and function space relative to the human genome. Protein structures currently provide at least one domain that covers 37% of the functional classes identified in the genome; whole structure coverage exists for 25% of the genome. If all the structural genomics targets were solved (twice the current number of structures in the PDB), it is estimated that structures of one domain would cover 69% of the functional classes identified and complete structure coverage would be 44%. Homology models from existing experimental structures extend the 37% coverage to 56% of the genome as single domains and 25% to 31% for complete structures. Coverage from homology models is not evenly distributed by protein family, reflecting differing degrees of sequence and structure divergence within families. While these data provide coverage, conversely, they also systematically highlight functional classes of proteins for which structures should be determined. Current key functional families without structure representation are highlighted here; updated information on the "most wanted list" that should be solved is available on a weekly basis from http://function.rcsb.org:8080/pdb/function_distribution/index.html.

  3. Complete Sequence and Analysis of the Mitochondrial Genome of Hemiselmis andersenii CCMP644 (Cryptophyceae)

    Directory of Open Access Journals (Sweden)

    Bowman Sharen

    2008-05-01

    Background: Cryptophytes are an enigmatic group of unicellular eukaryotes with plastids derived by secondary (i.e., eukaryote-eukaryote) endosymbiosis. Cryptophytes are unusual in that they possess four genomes: a host cell-derived nuclear and mitochondrial genome and an endosymbiont-derived plastid and 'nucleomorph' genome. The evolutionary origins of the host and endosymbiont components of cryptophyte algae are at present poorly understood. Thus far, a single complete mitochondrial genome sequence has been determined for the cryptophyte Rhodomonas salina. Here, the second complete mitochondrial genome of the cryptophyte alga Hemiselmis andersenii CCMP644 is presented. Results: The H. andersenii mtDNA is 60,553 bp in size and encodes 30 structural RNAs and 36 protein-coding genes, all located on the same strand. A prominent feature of the genome is the presence of a ~20 Kbp long intergenic region comprised of numerous tandem and dispersed repeat units of between 22–336 bp. Adjacent to these repeats are 27 copies of palindromic sequences predicted to form stable DNA stem-loop structures. One such stem-loop is located near a GC-rich and GC-poor region and may have a regulatory function in replication or transcription. The H. andersenii mtDNA shares a number of features in common with the genome of the cryptophyte Rhodomonas salina, including general architecture, gene content, and the presence of a large repeat region. However, the H. andersenii mtDNA is devoid of inverted repeats and introns, which are present in R. salina. Comparative analyses of the suite of tRNAs encoded in the two genomes reveal that the H. andersenii mtDNA has lost or converted its original trnK(uuu) gene and possesses a trnS-derived 'trnK(uuu)', which appears unable to produce a functional tRNA. Mitochondrial protein coding gene phylogenies strongly support a variety of previously established eukaryotic groups, but fail to resolve the relationships among higher

  4. On the Archaeal Origins of Eukaryotes and the Challenges of Inferring Phenotype from Genotype.

    Science.gov (United States)

    Dey, Gautam; Thattai, Mukund; Baum, Buzz

    2016-07-01

    If eukaryotes arose through a merger between archaea and bacteria, what did the first true eukaryotic cell look like? A major step toward an answer came with the discovery of Lokiarchaeum, an archaeon whose genome encodes small GTPases related to those used by eukaryotes to regulate membrane traffic. Although 'Loki' cells have yet to be seen, their existence has prompted the suggestion that the archaeal ancestor of eukaryotes engulfed the future mitochondrion by phagocytosis. We propose instead that the archaeal ancestor was a relatively simple cell, and that eukaryotic cellular organization arose as the result of a gradual transfer of bacterial genes and membranes driven by an ever-closer symbiotic partnership between a bacterium and an archaeon. Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.

  5. Patterns of intron gain and conservation in eukaryotic genes

    Directory of Open Access Journals (Sweden)

    Wolf Yuri I

    2007-10-01

    Background: The presence of introns in protein-coding genes is a universal feature of eukaryotic genome organization, and the genes of multicellular eukaryotes, typically, contain multiple introns, a substantial fraction of which share position in distant taxa, such as plants and animals. Depending on the methods and data sets used, researchers have reached opposite conclusions on the causes of the high fraction of shared introns in orthologous genes from distant eukaryotes. Some studies conclude that shared intron positions reflect, almost entirely, a remarkable evolutionary conservation, whereas others attribute it to parallel gain of introns. To resolve these contradictions, it is crucial to analyze the evolution of introns by using a model that minimally relies on arbitrary assumptions. Results: We developed a probabilistic model of evolution that allows for variability of intron gain and loss rates over branches of the phylogenetic tree, individual genes, and individual sites. Applying this model to an extended set of conserved eukaryotic genes, we find that parallel gain, on average, accounts for only ~8% of the shared intron positions. However, the distribution of parallel gains over the phylogenetic tree of eukaryotes is highly non-uniform. There are, practically, no parallel gains in closely related lineages, whereas for distant lineages, such as animals and plants, parallel gains appear to contribute up to 20% of the shared intron positions. In accord with these findings, we estimated that ancestral introns have a high probability to be retained in extant genomes, and conversely, that a substantial fraction of extant introns have retained their positions since the early stages of eukaryotic evolution. In addition, the density of sites that are available for intron insertion is estimated to be, approximately, one in seven basepairs. Conclusion: We obtained robust estimates of the contribution of parallel gain to the observed

  6. The Jigsaw Puzzle of mRNA Translation Initiation in Eukaryotes: A Decade of Structures Unraveling the Mechanics of the Process.

    Science.gov (United States)

    Hashem, Yaser; Frank, Joachim

    2018-03-01

    Translation initiation in eukaryotes is a highly regulated and rate-limiting process. It results in the assembly and disassembly of numerous transient and intermediate complexes involving over a dozen eukaryotic initiation factors (eIFs). This process culminates in the accommodation of a start codon marking the beginning of an open reading frame at the appropriate ribosomal site. Although this process has been extensively studied by hundreds of groups for nearly half a century, it has been only recently, especially during the last decade, that we have gained deeper insight into the mechanics of the eukaryotic translation initiation process. This advance in knowledge is due in part to the contributions of structural biology, which have shed light on the molecular mechanics underlying the different functions of various eukaryotic initiation factors. In this review, we focus exclusively on the contribution of structural biology to the understanding of the eukaryotic initiation process, a long-standing jigsaw puzzle that is just starting to yield the bigger picture. Expected final online publication date for the Annual Review of Biophysics Volume 47 is May 20, 2018. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.

  7. Genome-wide analysis of the phosphoinositide kinome from two ciliates reveals novel evolutionary links for phosphoinositide kinases in eukaryotic cells.

    Directory of Open Access Journals (Sweden)

    George Leondaritis

    BACKGROUND: The complexity of phosphoinositide signaling in higher eukaryotes is partly due to expansion of specific families and types of phosphoinositide kinases (PIKs) that can generate all phosphoinositides via multiple routes. This is particularly evident in the PI3Ks and PIPKs, and it is considered an evolutionary trait associated with metazoan diversification. Yet, there are limited comprehensive studies on the PIK repertoire of free-living unicellular organisms. METHODOLOGY/PRINCIPAL FINDINGS: We undertook a genome-wide analysis of putative PIK genes in two free-living ciliated cells, Tetrahymena and Paramecium. The Tetrahymena thermophila and Paramecium tetraurelia genomes were probed with representative kinases from all families and types. Putative homologs were verified by EST, microarray and deep RNA sequencing database searches and further characterized for domain structure, catalytic efficiency, expression patterns and phylogenetic relationships. In total, we identified and characterized 22 genes in the Tetrahymena thermophila genome and 62 highly homologous genes in Paramecium tetraurelia, suggesting a tight evolutionary conservation in the ciliate lineage. Comparison to the kinome of fungi reveals a significant expansion of PIK genes in ciliates. CONCLUSIONS/SIGNIFICANCE: Our study highlights four important aspects concerning ciliate and other unicellular PIKs. First, ciliate-specific expansion of PI4KIII-like genes. Second, presence of class I PI3Ks which, at least in Tetrahymena, are associated with a metazoan-type machinery for PIP3 signaling. Third, expansion of divergent PIPK enzymes such as the recently described type IV transmembrane PIPKs. Fourth, presence of possible type II PIPKs and presumably inactive PIKs (hence, pseudo-PIKs) not previously described. Taken together, our results provide a solid framework for future investigation of the roles of PIKs in ciliates and indicate that novel functions and novel regulatory

  8. National Academy of Sciences and Academy of Sciences of the USSR workshop on structure of the eucaryotic genome and regulation of its expression. Final report

    Energy Technology Data Exchange (ETDEWEB)

    1990-12-31

    This report provides a brief overview of the Workshop on Structure of the Eukaryotic Genome and Regulation of its Expression held in Tbilisi, Georgia, USSR. The report describes the presentations made at the meeting but also goes on to describe the state of molecular biology and genetics research in the Soviet Union and makes recommendations on how to improve future such meetings.

  9. Local chromatin structure of heterochromatin regulates repeated DNA stability, nucleolus structure, and genome integrity

    Energy Technology Data Exchange (ETDEWEB)

    Peng, Jamy C. [Univ. of California, Berkeley, CA (United States)]

    2007-01-01

    Heterochromatin constitutes a significant portion of the genome in higher eukaryotes; approximately 30% in Drosophila and human. Heterochromatin contains a high repeat DNA content and a low density of protein-encoding genes. In contrast, euchromatin is composed mostly of unique sequences and contains the majority of single-copy genes. Genetic and cytological studies demonstrated that heterochromatin exhibits regulatory roles in chromosome organization, centromere function and telomere protection. As an epigenetically regulated structure, heterochromatin formation is not defined by any DNA sequence consensus. Heterochromatin is characterized by its association with nucleosomes containing methylated-lysine 9 of histone H3 (H3K9me), heterochromatin protein 1 (HP1) that binds H3K9me, and Su(var)3-9, which methylates H3K9 and binds HP1. Heterochromatin formation and functions are influenced by HP1, Su(var)3-9, and the RNA interference (RNAi) pathway. My thesis project investigates how heterochromatin formation and function impact nuclear architecture, repeated DNA organization, and genome stability in Drosophila melanogaster. H3K9me-based chromatin reduces extrachromosomal DNA formation; most likely by restricting the access of repair machineries to repeated DNAs. Reducing extrachromosomal ribosomal DNA stabilizes rDNA repeats and the nucleolus structure. H3K9me-based chromatin also inhibits DNA damage in heterochromatin. Cells with compromised heterochromatin structure, due to Su(var)3-9 or dcr-2 (a component of the RNAi pathway) mutations, display severe DNA damage in heterochromatin compared to wild type. In these mutant cells, accumulated DNA damage leads to chromosomal defects such as translocations, defective DNA repair response, and activation of the G2-M DNA repair and mitotic checkpoints that ensure cellular and animal viability. My thesis research suggests that DNA replication, repair, and recombination mechanisms in heterochromatin differ from those in

  10. xGDBvm: A Web GUI-Driven Workflow for Annotating Eukaryotic Genomes in the Cloud[OPEN]

    Science.gov (United States)

    Merchant, Nirav

    2016-01-01

    Genome-wide annotation of gene structure requires the integration of numerous computational steps. Currently, annotation is arguably best accomplished through collaboration of bioinformatics and domain experts, with broad community involvement. However, such a collaborative approach is not scalable at today’s pace of sequence generation. To address this problem, we developed the xGDBvm software, which uses an intuitive graphical user interface to access a number of common genome analysis and gene structure tools, preconfigured in a self-contained virtual machine image. Once their virtual machine instance is deployed through iPlant’s Atmosphere cloud services, users access the xGDBvm workflow via a unified Web interface to manage inputs, set program parameters, configure links to high-performance computing (HPC) resources, view and manage output, apply analysis and editing tools, or access contextual help. The xGDBvm workflow will mask the genome, compute spliced alignments from transcript and/or protein inputs (locally or on a remote HPC cluster), predict gene structures and gene structure quality, and display output in a public or private genome browser complete with accessory tools. Problematic gene predictions are flagged and can be reannotated using the integrated yrGATE annotation tool. xGDBvm can also be configured to append or replace existing data or load precomputed data. Multiple genomes can be annotated and displayed, and outputs can be archived for sharing or backup. xGDBvm can be adapted to a variety of use cases including de novo genome annotation, reannotation, comparison of different annotations, and training or teaching. PMID:27020957

  11. Macronuclear genome sequence of the ciliate Tetrahymena thermophila, a model eukaryote.

    Directory of Open Access Journals (Sweden)

    Jonathan A Eisen

    2006-09-01

    The ciliate Tetrahymena thermophila is a model organism for molecular and cellular biology. Like other ciliates, this species has separate germline and soma functions that are embodied by distinct nuclei within a single cell. The germline-like micronucleus (MIC) has its genome held in reserve for sexual reproduction. The soma-like macronucleus (MAC), which possesses a genome processed from that of the MIC, is the center of gene expression and does not directly contribute DNA to sexual progeny. We report here the shotgun sequencing, assembly, and analysis of the MAC genome of T. thermophila, which is approximately 104 Mb in length and composed of approximately 225 chromosomes. Overall, the gene set is robust, with more than 27,000 predicted protein-coding genes, 15,000 of which have strong matches to genes in other organisms. The functional diversity encoded by these genes is substantial and reflects the complexity of processes required for a free-living, predatory, single-celled organism. This is highlighted by the abundance of lineage-specific duplications of genes with predicted roles in sensing and responding to environmental conditions (e.g., kinases), using diverse resources (e.g., proteases and transporters), and generating structural complexity (e.g., kinesins and dyneins). In contrast to the other lineages of alveolates (apicomplexans and dinoflagellates), no compelling evidence could be found for plastid-derived genes in the genome. UGA, the only T. thermophila stop codon, is used in some genes to encode selenocysteine, thus making this organism the first known with the potential to translate all 64 codons in nuclear genes into amino acids. We present genomic evidence supporting the hypothesis that the excision of DNA from the MIC to generate the MAC specifically targets foreign DNA as a form of genome self-defense. The combination of the genome sequence, the functional diversity encoded therein, and the presence of some pathways missing from

  12. The origin of the eukaryotic cell

    Science.gov (United States)

    Hartman, H.

    1984-01-01

    The endosymbiotic hypothesis for the origin of the eukaryotic cell has been applied to the origin of the mitochondria and chloroplasts. However, as was pointed out by Mereschkowsky in 1905, it should also be applied to the nucleus. If the nucleus, mitochondria and chloroplasts are endosymbionts, then it is likely that the organism that did the engulfing was not a DNA-based organism. In fact, it is useful to postulate that this organism was a primitive RNA-based organism. This hypothesis would explain the preponderance of RNA viruses found in eukaryotic cells. The centriole and basal body do not have a double membrane or DNA. Like all MTOCs (microtubule organising centres), they have a structural or morphic RNA implicated in their formation. This would argue for their origin in the early RNA-based organism rather than in an endosymbiotic event involving bacteria. Finally, the eukaryotic cell uses RNA in ways quite unlike bacteria, thus pointing to a greater emphasis on RNA in both control and structure in the cell. The origin of the eukaryotic cell may tell us why it, rather than its prokaryotic relative, evolved into the metazoans who are reading this paper.

  13. Archaeal “Dark Matter” and the Origin of Eukaryotes

    Science.gov (United States)

    Williams, Tom A.; Embley, T. Martin

    2014-01-01

    Current hypotheses about the history of cellular life are mainly based on analyses of cultivated organisms, but these represent only a small fraction of extant biodiversity. The sequencing of new environmental lineages therefore provides an opportunity to test, revise, or reject existing ideas about the tree of life and the origin of eukaryotes. According to the textbook three domains hypothesis, the eukaryotes emerge as the sister group to a monophyletic Archaea. However, recent analyses incorporating better phylogenetic models and an improved sampling of the archaeal domain have generally supported the competing eocyte hypothesis, in which core genes of eukaryotic cells originated from within the Archaea, with important implications for eukaryogenesis. Given this trend, it was surprising that a recent analysis incorporating new genomes from uncultivated Archaea recovered a strongly supported three domains tree. Here, we show that this result was due in part to the use of a poorly fitting phylogenetic model and also to the inclusion by an automated pipeline of genes of putative bacterial origin rather than nucleocytosolic versions for some of the eukaryotes analyzed. When these issues were resolved, analyses including the new archaeal lineages placed core eukaryotic genes within the Archaea. These results are consistent with a number of recent studies in which improved archaeal sampling and better phylogenetic models agree in supporting the eocyte tree over the three domains hypothesis. PMID:24532674

  14. Organizational heterogeneity of vertebrate genomes.

    Science.gov (United States)

    Frenkel, Svetlana; Kirzhner, Valery; Korol, Abraham

    2012-01-01

    Genomes of higher eukaryotes are mosaics of segments with various structural, functional, and evolutionary properties. The availability of whole-genome sequences allows the investigation of their structure as "texts" using different statistical and computational methods. One such method, referred to as Compositional Spectra (CS) analysis, is based on scoring the occurrences of fixed-length oligonucleotides (k-mers) in the target DNA sequence. CS analysis allows generating species- or region-specific characteristics of the genome, regardless of their length and the presence of coding DNA. In this study, we consider the heterogeneity of vertebrate genomes as a joint effect of regional variation in sequence organization superimposed on the differences in nucleotide composition. We estimated compositional and organizational heterogeneity of genome and chromosome sequences separately and found that both heterogeneity types vary widely among genomes as well as among chromosomes in all investigated taxonomic groups. The high correspondence of heterogeneity scores obtained on three genome fractions, coding, repetitive, and the remaining part of the noncoding DNA (the genome dark matter--GDM) allows the assumption that CS-heterogeneity may have functional relevance to genome regulation. Of special interest for such interpretation is the fact that natural GDM sequences display the highest deviation from the corresponding reshuffled sequences.
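    The k-mer scoring idea behind CS analysis can be illustrated with a minimal sketch. It assumes the simplest possible variant: exact-match counting of every k-mer over the alphabet ACGT, normalised to relative frequencies, with an L1 distance to compare two spectra. The function names `kmer_spectrum` and `spectrum_distance` are hypothetical, and the published CS method may use a different word set, mismatch tolerance, and distance measure.

```python
from collections import Counter
from itertools import product


def kmer_spectrum(seq: str, k: int) -> dict[str, float]:
    """Relative frequency of each of the 4**k DNA k-mers in seq (exact matches only)."""
    seq = seq.upper()
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= set("ACGT")  # skip windows containing N or other symbols
    )
    total = sum(counts.values())
    words = ("".join(p) for p in product("ACGT", repeat=k))
    # Counter returns 0 for k-mers that never occur, so every word gets an entry.
    return {w: (counts[w] / total if total else 0.0) for w in words}


def spectrum_distance(a: dict[str, float], b: dict[str, float]) -> float:
    """Manhattan (L1) distance between two spectra built with the same k."""
    return sum(abs(a[w] - b[w]) for w in a)
```

    Computing such spectra for windows along a chromosome and comparing them with each other would give one crude measure of regional heterogeneity; the choice of k and window size is left open here.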

  15. Organizational heterogeneity of vertebrate genomes.

    Directory of Open Access Journals (Sweden)

    Svetlana Frenkel

    Genomes of higher eukaryotes are mosaics of segments with various structural, functional, and evolutionary properties. The availability of whole-genome sequences allows the investigation of their structure as "texts" using different statistical and computational methods. One such method, referred to as Compositional Spectra (CS) analysis, is based on scoring the occurrences of fixed-length oligonucleotides (k-mers) in the target DNA sequence. CS analysis allows generating species- or region-specific characteristics of the genome, regardless of their length and the presence of coding DNA. In this study, we consider the heterogeneity of vertebrate genomes as a joint effect of regional variation in sequence organization superimposed on the differences in nucleotide composition. We estimated compositional and organizational heterogeneity of genome and chromosome sequences separately and found that both heterogeneity types vary widely among genomes as well as among chromosomes in all investigated taxonomic groups. The high correspondence of heterogeneity scores obtained on three genome fractions, coding, repetitive, and the remaining part of the noncoding DNA (the genome dark matter--GDM) allows the assumption that CS-heterogeneity may have functional relevance to genome regulation. Of special interest for such interpretation is the fact that natural GDM sequences display the highest deviation from the corresponding reshuffled sequences.

  16. pico-PLAZA, a genome database of microbial photosynthetic eukaryotes.

    Science.gov (United States)

    Vandepoele, Klaas; Van Bel, Michiel; Richard, Guilhem; Van Landeghem, Sofie; Verhelst, Bram; Moreau, Hervé; Van de Peer, Yves; Grimsley, Nigel; Piganeau, Gwenael

    2013-08-01

    With the advent of next generation genome sequencing, the number of sequenced algal genomes and transcriptomes is rapidly growing. Although a few genome portals exist to browse individual genome sequences, exploring complete genome information from multiple species for the analysis of user-defined sequences or gene lists remains a major challenge. pico-PLAZA is a web-based resource (http://bioinformatics.psb.ugent.be/pico-plaza/) for algal genomics that combines different data types with intuitive tools to explore genomic diversity, perform integrative evolutionary sequence analysis and study gene functions. Apart from homologous gene families, multiple sequence alignments, phylogenetic trees, Gene Ontology, InterPro and text-mining functional annotations, different interactive viewers are available to study genome organization using gene collinearity and synteny information. Different search functions, documentation pages, export functions and an extensive glossary are available to guide non-expert scientists. To illustrate the versatility of the platform, different case studies are presented demonstrating how pico-PLAZA can be used to functionally characterize large-scale EST/RNA-Seq data sets and to perform environmental genomics. Functional enrichment analysis of 16 Phaeodactylum tricornutum transcriptome libraries offers a molecular view on diatom adaptation to different environments of ecological relevance. Furthermore, we show how complementary genomic data sources can easily be combined to identify marker genes to study the diversity and distribution of algal species, for example in metagenomes, or to quantify intraspecific diversity from environmental strains. © 2013 John Wiley & Sons Ltd and Society for Applied Microbiology.

  17. Evolution of glutamate dehydrogenase genes: evidence for lateral gene transfer within and between prokaryotes and eukaryotes

    Directory of Open Access Journals (Sweden)

    Roger Andrew J

    2003-06-01

    Background: Lateral gene transfer can introduce genes with novel functions into genomes or replace genes with functionally similar orthologs or paralogs. Here we present a study of the occurrence of the latter gene replacement phenomenon in the four gene families encoding different classes of glutamate dehydrogenase (GDH), to evaluate and compare the patterns and rates of lateral gene transfer (LGT) in prokaryotes and eukaryotes. Results: We extend the taxon sampling of gdh genes with nine new eukaryotic sequences and examine the phylogenetic distribution pattern of the various GDH classes in combination with maximum likelihood phylogenetic analyses. The distribution pattern analyses indicate that LGT has played a significant role in the evolution of the four gdh gene families. Indeed, a number of gene transfer events are identified by phylogenetic analyses, including numerous prokaryotic intra-domain transfers, some prokaryotic inter-domain transfers and several inter-domain transfers between prokaryotes and microbial eukaryotes (protists). Conclusion: LGT has apparently affected eukaryotes and prokaryotes to a similar extent within the gdh gene families. In the absence of indications that the evolution of the gdh gene families is radically different from other families, these results suggest that gene transfer might be an important evolutionary mechanism in microbial eukaryote genome evolution.

  18. Comparative Genomic Analysis Reveals a Diverse Repertoire of Genes Involved in Prokaryote-Eukaryote Interactions within the Pseudovibrio Genus.

    Science.gov (United States)

    Romano, Stefano; Fernàndez-Guerra, Antonio; Reen, F Jerry; Glöckner, Frank O; Crowley, Susan P; O'Sullivan, Orla; Cotter, Paul D; Adams, Claire; Dobson, Alan D W; O'Gara, Fergal

    2016-01-01

    Strains of the Pseudovibrio genus have been detected worldwide, mainly as part of bacterial communities associated with marine invertebrates, particularly sponges. This recurrent association has been considered as an indication of a symbiotic relationship between these microbes and their host. Until recently, the availability of only two genomes, belonging to closely related strains, has limited the knowledge on the genomic and physiological features of the genus to a single phylogenetic lineage. Here we present 10 newly sequenced genomes of Pseudovibrio strains isolated from marine sponges from the west coast of Ireland, and including the other two publicly available genomes we performed an extensive comparative genomic analysis. Homogeneity was apparent in terms of both the orthologous genes and the metabolic features shared amongst the 12 strains. At the genomic level, a key physiological difference observed amongst the isolates was the presence only in strain P. axinellae AD2 of genes encoding proteins involved in assimilatory nitrate reduction, which was then proved experimentally. We then focused on studying those systems known to be involved in the interactions with eukaryotic and prokaryotic cells. This analysis revealed that the genus harbors a large diversity of toxin-like proteins, secretion systems and their potential effectors. Their distribution in the genus was not always consistent with the phylogenetic relationship of the strains. Finally, our analyses identified new genomic islands encoding potential toxin-immunity systems, previously unknown in the genus. Our analyses shed new light on the Pseudovibrio genus, indicating a large diversity of both metabolic features and systems for interacting with the host. The diversity in both distribution and abundance of these systems amongst the strains underlines how metabolically and phylogenetically similar bacteria may use different strategies to interact with the host and find a niche within its

  19. Structural similarities between prokaryotic and eukaryotic 5S ribosomal RNAs

    International Nuclear Information System (INIS)

    Welfle, H.; Boehm, S.; Damaschun, G.; Fabian, H.; Gast, K.; Misselwitz, R.; Mueller, J.J.; Zirwer, D.; Filimonov, V.V.; Venyaminov, S.Yu.; Zalkova, T.N.

    1986-01-01

    5S RNAs from rat liver and E. coli have been studied by diffuse X-ray and dynamic light scattering and by infrared and Raman spectroscopy. Identical structures at a resolution of 1 nm can be deduced from the comparison of the experimental X-ray scattering curves and electron distance distribution functions and from the agreement of the shape parameters. A flat shape model with a compact central region and two protruding arms was derived. Double helical stems are eleven-fold helices with a mean base pair distance of 0.28 nm. The number of base pairs (26 GC, 9 AU for E. coli; 27 GC, 9 AU for rat liver) and the degree of base stacking are the same within the experimental error. A very high regularity in the ribophosphate backbone is indicated for both 5S RNAs. The observed structural similarity and the consensus secondary structure pattern derived from comparative sequence analyses suggest the conclusion that prokaryotic and eukaryotic 5S RNAs are in general very similar with respect to their fundamental structural features. (author)

  20. The reduced kinome of Ostreococcus tauri: core eukaryotic signalling components in a tractable model species.

    Science.gov (United States)

    Hindle, Matthew M; Martin, Sarah F; Noordally, Zeenat B; van Ooijen, Gerben; Barrios-Llerena, Martin E; Simpson, T Ian; Le Bihan, Thierry; Millar, Andrew J

    2014-08-02

    The current knowledge of eukaryote signalling originates from phenotypically diverse organisms. There is a pressing need to identify conserved signalling components among eukaryotes, which will lead to the transfer of knowledge across kingdoms. Two useful properties of a eukaryote model for signalling are (1) reduced signalling complexity, and (2) conservation of signalling components. The alga Ostreococcus tauri is described as the smallest free-living eukaryote. With fewer than 8,000 genes, it represents a highly constrained genomic palette. Our survey revealed 133 protein kinases and 34 protein phosphatases (1.7% and 0.4% of the proteome). We conducted phosphoproteomic experiments and constructed domain structures and phylogenies for the catalytic protein-kinases. For each of the major kinase families we review the completeness and divergence of O. tauri representatives in comparison to the well-studied kinomes of the laboratory models Arabidopsis thaliana and Saccharomyces cerevisiae, and of Homo sapiens. Many kinase clades in O. tauri were reduced to a single member, in preference to the loss of family diversity, whereas TKL and ABC1 clades were expanded. We also identified kinases that have been lost in A. thaliana but retained in O. tauri. For three contrasting eukaryotic pathways (TOR, MAPK, and the circadian clock) we established the subset of conserved components and demonstrate conserved sites of substrate phosphorylation and kinase motifs. We conclude that O. tauri satisfies our two central requirements. Several of its kinases are more closely related to H. sapiens orthologs than S. cerevisiae is to H. sapiens. The greatly reduced kinome of O. tauri is therefore a suitable model for signalling in free-living eukaryotes.

  1. topIb, a phylogenetic hallmark gene of Thaumarchaeota encodes a functional eukaryote-like topoisomerase IB.

    Science.gov (United States)

    Dahmane, Narimane; Gadelle, Danièle; Delmas, Stéphane; Criscuolo, Alexis; Eberhard, Stephan; Desnoues, Nicole; Collin, Sylvie; Zhang, Hongliang; Pommier, Yves; Forterre, Patrick; Sezonov, Guennadi

    2016-04-07

    Type IB DNA topoisomerases can eliminate torsional stresses produced during replication and transcription. These enzymes are found in all eukaryotes and a short version is present in some bacteria and viruses. Among prokaryotes, the long eukaryotic version is only observed in archaea of the phylum Thaumarchaeota. However, the activities and the roles of these topoisomerases have remained an open question. Here, we demonstrate that all available thaumarchaeal genomes contain a topoisomerase IB gene that defines a monophyletic group closely related to the eukaryotic enzymes. We show that the topIB gene is expressed in the model thaumarchaeon Nitrososphaera viennensis and we purified the recombinant enzyme from the uncultivated thaumarchaeon Candidatus Caldiarchaeum subterraneum. This enzyme is active in vitro at high temperature, making it the first thermophilic topoisomerase IB characterized so far. We have compared this archaeal type IB enzyme to its human mitochondrial and nuclear counterparts. The archaeal enzyme relaxes both negatively and positively supercoiled DNA like the eukaryotic enzymes. However, its pattern of DNA cleavage specificity is different and it is resistant to camptothecins (CPTs) and non-CPT Top1 inhibitors, LMP744 and lamellarin D. This newly described thermostable topoisomerase IB should be a promising new model for evolutionary, mechanistic and structural studies. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  2. De novo assembly of a 40 Mb eukaryotic genome from short sequence reads: Sordaria macrospora, a model organism for fungal morphogenesis.

    Science.gov (United States)

    Nowrousian, Minou; Stajich, Jason E; Chu, Meiling; Engh, Ines; Espagne, Eric; Halliday, Karen; Kamerewerd, Jens; Kempken, Frank; Knab, Birgit; Kuo, Hsiao-Che; Osiewacz, Heinz D; Pöggeler, Stefanie; Read, Nick D; Seiler, Stephan; Smith, Kristina M; Zickler, Denise; Kück, Ulrich; Freitag, Michael

    2010-04-08

    Filamentous fungi are of great importance in ecology, agriculture, medicine, and biotechnology. Thus, it is not surprising that genomes for more than 100 filamentous fungi have been sequenced, most of them by Sanger sequencing. While next-generation sequencing techniques have revolutionized genome resequencing, e.g. for strain comparisons, genetic mapping, or transcriptome and ChIP analyses, de novo assembly of eukaryotic genomes still presents significant hurdles, because of their large size and stretches of repetitive sequences. Filamentous fungi contain few repetitive regions in their 30-90 Mb genomes and thus are suitable candidates to test de novo genome assembly from short sequence reads. Here, we present a high-quality draft sequence of the Sordaria macrospora genome that was obtained by a combination of Illumina/Solexa and Roche/454 sequencing. Paired-end Solexa sequencing of genomic DNA to 85-fold coverage and an additional 10-fold coverage by single-end 454 sequencing resulted in approximately 4 Gb of DNA sequence. Reads were assembled to a 40 Mb draft version (N50 of 117 kb) with the Velvet assembler. Comparative analysis with Neurospora genomes increased the N50 to 498 kb. The S. macrospora genome contains even fewer repeat regions than its closest sequenced relative, Neurospora crassa. Comparison with genomes of other fungi showed that S. macrospora, a model organism for morphogenesis and meiosis, harbors duplications of several genes involved in self/nonself-recognition. Furthermore, S. macrospora contains more polyketide biosynthesis genes than N. crassa. Phylogenetic analyses suggest that some of these genes may have been acquired by horizontal gene transfer from a distantly related ascomycete group. Our study shows that, for typical filamentous fungi, de novo assembly of genomes from short sequence reads alone is feasible, that a mixture of Solexa and 454 sequencing substantially improves the assembly, and that the resulting data can be used for
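    The two assembly figures quoted in this record, sequencing depth and N50, can be made concrete with a short sketch. It assumes the usual definition of N50 (the contig length at which the cumulative sum of contigs, sorted from longest to shortest, first reaches half of the total assembly length); the function names `n50` and `expected_sequence_bases` and the toy contig lengths are illustrative, not taken from the study.

```python
def n50(lengths):
    """Length L such that contigs of length >= L contain at least half of all assembled bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no contigs given")


def expected_sequence_bases(genome_size_bp, fold_coverages):
    """Total bases sequenced when a genome is covered at each of the given depths."""
    return genome_size_bp * sum(fold_coverages)


# Toy assembly: total 290, half is 145; 80 + 70 = 150 reaches it, so N50 is 70.
toy_contigs = [80, 70, 50, 40, 30, 20]
print(n50(toy_contigs))

# 40 Mb genome at 85-fold plus 10-fold coverage gives 3.8 Gb,
# in line with the "approximately 4 Gb" stated in the abstract.
print(expected_sequence_bases(40_000_000, [85, 10]))
```

    Because N50 depends only on the length distribution, merging contigs into longer scaffolds raises it even though no new sequence is added.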

  3. De novo assembly of a 40 Mb eukaryotic genome from short sequence reads: Sordaria macrospora, a model organism for fungal morphogenesis.

    Directory of Open Access Journals (Sweden)

    Minou Nowrousian

    2010-04-01

    Filamentous fungi are of great importance in ecology, agriculture, medicine, and biotechnology. Thus, it is not surprising that genomes for more than 100 filamentous fungi have been sequenced, most of them by Sanger sequencing. While next-generation sequencing techniques have revolutionized genome resequencing, e.g. for strain comparisons, genetic mapping, or transcriptome and ChIP analyses, de novo assembly of eukaryotic genomes still presents significant hurdles, because of their large size and stretches of repetitive sequences. Filamentous fungi contain few repetitive regions in their 30-90 Mb genomes and thus are suitable candidates to test de novo genome assembly from short sequence reads. Here, we present a high-quality draft sequence of the Sordaria macrospora genome that was obtained by a combination of Illumina/Solexa and Roche/454 sequencing. Paired-end Solexa sequencing of genomic DNA to 85-fold coverage and an additional 10-fold coverage by single-end 454 sequencing resulted in approximately 4 Gb of DNA sequence. Reads were assembled to a 40 Mb draft version (N50 of 117 kb) with the Velvet assembler. Comparative analysis with Neurospora genomes increased the N50 to 498 kb. The S. macrospora genome contains even fewer repeat regions than its closest sequenced relative, Neurospora crassa. Comparison with genomes of other fungi showed that S. macrospora, a model organism for morphogenesis and meiosis, harbors duplications of several genes involved in self/nonself-recognition. Furthermore, S. macrospora contains more polyketide biosynthesis genes than N. crassa. Phylogenetic analyses suggest that some of these genes may have been acquired by horizontal gene transfer from a distantly related ascomycete group. Our study shows that, for typical filamentous fungi, de novo assembly of genomes from short sequence reads alone is feasible, that a mixture of Solexa and 454 sequencing substantially improves the assembly, and that the resulting data

  4. Phylogenetic analysis of the core histone doublet and DNA topo II genes of Marseilleviridae: evidence of proto-eukaryotic provenance.

    Science.gov (United States)

    Erives, Albert J

    2017-11-28

    While the genomes of eukaryotes and Archaea both encode the histone-fold domain, only eukaryotes encode the core histone paralogs H2A, H2B, H3, and H4. With DNA, these core histones assemble into the nucleosomal octamer underlying eukaryotic chromatin. Importantly, core histones for H2A and H3 are maintained as neofunctionalized paralogs adapted for general bulk chromatin (canonical H2 and H3) or specialized chromatin (H2A.Z enriched at gene promoters and cenH3s enriched at centromeres). In this context, the identification of core histone-like "doublets" in the cytoplasmic replication factories of the Marseilleviridae (MV) is a novel finding with possible relevance to understanding the origin of eukaryotic chromatin. Here, we analyze and compare the core histone doublet genes from all known MV genomes as well as other MV genes relevant to the origin of the eukaryotic replisome. Using different phylogenetic approaches, we show that MV histone domains encode obligate H2B-H2A and H4-H3 dimers of possible proto-eukaryotic origin. MV core histone moieties form sister clades to each of the four eukaryotic clades of canonical and variant core histones. This suggests that MV core histone moieties diverged prior to eukaryotic neofunctionalizations associated with paired linear chromosomes and variant histone octamer assembly. We also show that MV genomes encode a proto-eukaryotic DNA topoisomerase II enzyme that forms a sister clade to eukaryotes. This is a relevant finding given that DNA topo II influences histone deposition and chromatin compaction and is the second most abundant nuclear protein after histones. The combined domain architecture and phylogenomic analyses presented here suggest that a primitive origin for MV histone genes is a more parsimonious explanation than horizontal gene transfers + gene fusions + sufficient divergence to eliminate relatedness to eukaryotic neofunctionalizations within the H2A and H3 clades without loss of relatedness to each of

  5. Roadmap for annotating transposable elements in eukaryote genomes.

    Science.gov (United States)

    Permal, Emmanuelle; Flutre, Timothée; Quesneville, Hadi

    2012-01-01

    Current high-throughput techniques have made it feasible to sequence even the genomes of non-model organisms. However, the annotation process now represents a bottleneck to genome analysis, especially when dealing with transposable elements (TE). Combined approaches, using both de novo and knowledge-based methods to detect TEs, are likely to produce reasonably comprehensive and sensitive results. This chapter provides a roadmap for researchers involved in genome projects to address this issue. At each step of the TE annotation process, from the identification of TE families to the annotation of TE copies, we outline the tools and good practices to be used.

  6. Molecular fossils in modern genomes provide physiological and geochemical insights to the ancient earth (Invited)

    Science.gov (United States)

    Dupont, C.; Caetano-Anolles, G.

    2010-12-01

    The genomes of extant organisms are ultimately derived from ancient life, thus theoretically contain insight to ancient physiology, ecology, and environments. In particular, metalloenzymes may be particularly insightful. The fundamental chemistry of trace elements dictates the molecular speciation and reactivity both within cells and the environment at large. Using protein structure and comparative genomics, we elucidate several major influences this chemistry has had upon biology. All of life exhibits the same proteome size-dependent scaling for the number of metal-binding proteins within a proteome. This fundamental evolutionary constant shows that the selection of one element occurs at the exclusion of another, with the eschewal of Fe for Zn and Ca being a defining feature of eukaryotic proteomes. Early life lacked both the structures required to control intracellular metal concentrations and the metal-binding proteins that catalyze electron transport and redox transformations. The development of protein structures for metal homeostasis coincided with the emergence of metal-specific structures, which predominantly bound metals abundant in the Archean ocean. Potentially, this promoted the diversification of emerging lineages of Archaea and Bacteria through the establishment of biogeochemical cycles. In contrast, structures binding Cu and Zn evolved much later, providing further evidence that environmental availability influenced the selection of the elements. The late evolving Zn-binding proteins are fundamental to eukaryotic cellular biology, and Zn bioavailability may have been a limiting factor in eukaryotic evolution. The results presented here provide an evolutionary timeline based on genomic characteristics, and key hypotheses can be tested by alternative geochemical methods.

  7. A novel component of the mitochondrial genome segregation machinery in trypanosomes

    Directory of Open Access Journals (Sweden)

    Anneliese Hoffmann

    2016-07-01

    We recently described a new component (TAC102) of the mitochondrial genome segregation machinery (mtGSM) in the protozoan parasite Trypanosoma brucei. T. brucei belongs to a group of organisms that contain a single mitochondrial organelle with a single mitochondrial genome (mt-genome) per cell. The mt-genome consists of 5000 minicircles (1 kb) and 25 maxicircles (23 kb) that are catenated into a large network. After replication of the network, its segregation is driven by the separating basal bodies, which are homologous structures to the centrioles organizing the spindle apparatus in many eukaryotes. The structure connecting the basal body to the mt-genome was named the Tripartite Attachment Complex (TAC), owing to its distribution across three areas of the cell, including the two mitochondrial membranes.

  8. Massive expansion of the calpain gene family in unicellular eukaryotes

    Directory of Open Access Journals (Sweden)

    Zhao Sen

    2012-09-01

    Background: Calpains are Ca2+-dependent cysteine proteases that participate in a range of crucial cellular processes. Dysfunction of these enzymes may cause, for instance, life-threatening diseases in humans, the loss of sex determination in nematodes and embryo lethality in plants. Although the calpain family is well characterized in animal and plant model organisms, there is a great lack of knowledge about these genes in unicellular eukaryote species (i.e. protists). Here, we study the distribution and evolution of calpain genes in a wide range of eukaryote genomes from major branches in the tree of life. Results: Our investigations reveal 24 types of protein domains that are combined with the calpain-specific catalytic domain CysPc. In total we identify 41 different calpain domain architectures; 28 of these domain combinations have not been previously described. Based on our phylogenetic inferences, we propose that at least four calpain variants were established in the early evolution of eukaryotes, most likely before the radiation of all the major supergroups of eukaryotes. Many domains associated with eukaryotic calpain genes can be found among eubacteria or archaebacteria but never in combination with the CysPc domain. Conclusions: The analyses presented here show that ancient modules present in prokaryotes, and a few de novo eukaryote domains, have been assembled into many novel domain combinations along the evolutionary history of eukaryotes. Some of the new calpain genes show a narrow distribution in a few branches in the tree of life, likely representing lineage-specific innovations. Hence, the functionally important classical calpain genes found among humans and vertebrates make up only a tiny fraction of the calpain family. In fact, a massive expansion of the calpain family occurred by domain shuffling among unicellular eukaryotes and contributed to a wealth of functionally different genes.

  9. Massive expansion of the calpain gene family in unicellular eukaryotes.

    Science.gov (United States)

    Zhao, Sen; Liang, Zhe; Demko, Viktor; Wilson, Robert; Johansen, Wenche; Olsen, Odd-Arne; Shalchian-Tabrizi, Kamran

    2012-09-29

    Calpains are Ca2+-dependent cysteine proteases that participate in a range of crucial cellular processes. Dysfunction of these enzymes may cause, for instance, life-threatening diseases in humans, the loss of sex determination in nematodes and embryo lethality in plants. Although the calpain family is well characterized in animal and plant model organisms, there is a great lack of knowledge about these genes in unicellular eukaryote species (i.e. protists). Here, we study the distribution and evolution of calpain genes in a wide range of eukaryote genomes from major branches in the tree of life. Our investigations reveal 24 types of protein domains that are combined with the calpain-specific catalytic domain CysPc. In total we identify 41 different calpain domain architectures; 28 of these domain combinations have not been previously described. Based on our phylogenetic inferences, we propose that at least four calpain variants were established in the early evolution of eukaryotes, most likely before the radiation of all the major supergroups of eukaryotes. Many domains associated with eukaryotic calpain genes can be found among eubacteria or archaebacteria but never in combination with the CysPc domain. The analyses presented here show that ancient modules present in prokaryotes, and a few de novo eukaryote domains, have been assembled into many novel domain combinations along the evolutionary history of eukaryotes. Some of the new calpain genes show a narrow distribution in a few branches in the tree of life, likely representing lineage-specific innovations. Hence, the functionally important classical calpain genes found among humans and vertebrates make up only a tiny fraction of the calpain family. In fact, a massive expansion of the calpain family occurred by domain shuffling among unicellular eukaryotes and contributed to a wealth of functionally different genes.

  10. Ensembl Genomes 2013: scaling up access to genome-wide data.

    Science.gov (United States)

    Kersey, Paul Julian; Allen, James E; Christensen, Mikkel; Davis, Paul; Falin, Lee J; Grabmueller, Christoph; Hughes, Daniel Seth Toney; Humphrey, Jay; Kerhornou, Arnaud; Khobova, Julia; Langridge, Nicholas; McDowall, Mark D; Maheswari, Uma; Maslen, Gareth; Nuhn, Michael; Ong, Chuang Kee; Paulini, Michael; Pedro, Helder; Toneva, Iliana; Tuli, Mary Ann; Walts, Brandon; Williams, Gareth; Wilson, Derek; Youens-Clark, Ken; Monaco, Marcela K; Stein, Joshua; Wei, Xuehong; Ware, Doreen; Bolser, Daniel M; Howe, Kevin Lee; Kulesha, Eugene; Lawson, Daniel; Staines, Daniel Michael

    2014-01-01

    Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species. The project exploits and extends technologies for genome annotation, analysis and dissemination, developed in the context of the vertebrate-focused Ensembl project, and provides a complementary set of resources for non-vertebrate species through a consistent set of programmatic and interactive interfaces. These provide access to data including reference sequence, gene models, transcriptional data, polymorphisms and comparative analysis. This article provides an update to the previous publications about the resource, with a focus on recent developments. These include the addition of important new genomes (and related data sets) including crop plants, vectors of human disease and eukaryotic pathogens. In addition, the resource has scaled up its representation of bacterial genomes, and now includes the genomes of over 9000 bacteria. Specific extensions to the web and programmatic interfaces have been developed to support users in navigating these large data sets. Looking forward, analytic tools to allow targeted selection of data for visualization and download are likely to become increasingly important in future as the number of available genomes increases within all domains of life, and some of the challenges faced in representing bacterial data are likely to become commonplace for eukaryotes in future.

  11. Impact of nuclear organization and chromatin structure on DNA repair and genome stability

    International Nuclear Information System (INIS)

    Batte, Amandine

    2016-01-01

    The non-random organization of the eukaryotic cell nucleus and the folding of the genome into more or less condensed chromatin can influence many functions related to DNA metabolism, including genome stability. Double-strand breaks (DSBs) are the most deleterious form of DNA damage for cells. To preserve genome integrity, eukaryotic cells have thus developed DSB repair mechanisms conserved from yeast to human, among them homologous recombination (HR), which uses an intact homologous sequence to repair a broken chromosome. HR can be separated into two sub-pathways: Gene Conversion (GC) transfers genetic information from one molecule to its homolog, and Break Induced Replication (BIR) establishes a replication fork that can proceed until the chromosome end. My doctoral work focused on the contribution of the chromatin context and 3D genome organization to DSB repair. In S. cerevisiae, nuclear organization and heterochromatin spreading at sub-telomeres can be modified through the overexpression of the Sir3 or sir3A2Q mutant proteins. We demonstrated that reducing the physical distance between homologous sequences increased GC rates, reinforcing the notion that homology search is a limiting step for recombination. We also showed that heterochromatinization of the DSB site fine-tunes DSB resection, limiting the loss of the DSB ends required to perform homology search and complete HR. Finally, we noticed that the presence of heterochromatin at the donor locus decreased both GC and BIR efficiencies, probably by affecting strand invasion. This work highlights new regulatory pathways of DNA repair. (author)

  12. Genetic exchange in eukaryotes through horizontal transfer: connected by the mobilome.

    Science.gov (United States)

    Wallau, Gabriel Luz; Vieira, Cristina; Loreto, Élgion Lúcio Silva

    2018-01-01

    All living species contain genetic information that was once shared by their common ancestor. DNA is inherited through generations by vertical transmission (VT) from parents to offspring and from ancestor to descendant species. This process was considered the sole pathway by which biological entities exchange inheritable information. However, Horizontal Transfer (HT), the exchange of genetic information by means other than from parents to offspring, was discovered in prokaryotes along with strong evidence showing that it is a very important process by which prokaryotes acquire new genes. For some time, the scientific consensus has been that HT events were rare and not relevant for the evolution of eukaryotic species, but there is growing evidence that HT is an important and frequent phenomenon in eukaryotes as well. Here, we will discuss the latest findings regarding HT among eukaryotes, mainly HT of transposons (HTT), establishing HTT once and for all as an important phenomenon that should be taken into consideration to fully understand eukaryotic genome evolution. In addition, we will discuss the latest developments in methods to detect such events on a broader scale and highlight the new approaches which should be pursued by researchers to fill the knowledge gaps regarding HTT among eukaryotes.

  13. Potential and pitfalls of eukaryotic metagenome skimming: a test case for lichens.

    Science.gov (United States)

    Greshake, Bastian; Zehr, Simonida; Dal Grande, Francesco; Meiser, Anjuli; Schmitt, Imke; Ebersberger, Ingo

    2016-03-01

    Whole-genome shotgun sequencing of multispecies communities using only a single library layout is commonly used to assess taxonomic and functional diversity of microbial assemblages. Here, we investigate to what extent such metagenome skimming approaches are applicable for in-depth genomic characterizations of eukaryotic communities, for example lichens. We address how best to assemble a particular eukaryotic metagenome skimming data set, what pitfalls can occur, and what genome quality can be expected from these data. To facilitate project-specific benchmarking, we introduce the concept of twin sets, simulated data resembling the outcome of a particular metagenome sequencing study. We show that the quality of genome reconstructions depends essentially on assembler choice. Individual tools, including the metagenome assemblers Omega and MetaVelvet, are surprisingly sensitive to low and uneven coverages. In combination with the routine of assembly parameter choice to optimize the assembly N50 size, these tools can preclude an entire genome from the assembly. In contrast, MIRA, an all-purpose overlap assembler, and SPAdes, a multisized de Bruijn graph assembler, facilitate a comprehensive view on the individual genomes across a wide range of coverage ratios. Testing assemblers on a real-world metagenome skimming data set from the lichen Lasallia pustulata demonstrates the applicability of twin sets for guiding method selection. Furthermore, it reveals that the assembly outcome for the photobiont Trebouxia sp. falls behind the a priori expectation given the simulations. Although the underlying reasons remain unclear, this highlights that further studies on this organism require special attention during sequence data generation and downstream analysis. © 2015 John Wiley & Sons Ltd.
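
    The role of the assembly N50 size in the abstract above can be illustrated with a minimal sketch. It assumes the usual definition of N50: the length of the contig at which the cumulative length of contigs, sorted from longest to shortest, first reaches half of the total assembly length. The function name and the contig lengths are hypothetical toy values, not data from the study.

```python
def n50(contig_lengths):
    """Return the N50 of a list of contig lengths (hypothetical helper)."""
    if not contig_lengths:
        raise ValueError("no contigs given")
    total = sum(contig_lengths)
    running = 0
    # Walk from the longest contig down until half the assembly is covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy assembly: two long contigs plus three short ones, the short ones
# standing in for a poorly covered genome (made-up numbers).
full_assembly = [100, 80, 60, 40, 20]
long_contigs_only = [100, 80]

print(n50(full_assembly))
print(n50(long_contigs_only))
```

    In this toy case the N50 is higher when the short contigs are left out, which is one plausible way in which tuning parameters for N50 alone could favor an assembly that omits a low-coverage genome.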

  14. Prokaryotic and eukaryotic features observed on the secondary structures of Giardia SSU rRNAs and its phylogenetic implications.

    Science.gov (United States)

    Hwang, Ui Wook

    2007-04-01

    The phylogenetic position of the diplomonad protist Giardia, a principal cause of diarrhea, among eukaryotes has been vigorously debated. Through comparisons of the primary and secondary structures of SSU rRNAs of G. intestinalis, G. microti, G. ardeae, and G. muris, I found two major indel regions (a 6-nt indel and a 22-26-nt indel), which correspond to helix 10 of the V2 region and helices E23-8 to E23-9 of the V4 region, respectively. As generally shown in eukaryotes, G. intestinalis and G. microti both have a relatively longer helix 10 (a 7-bp stem and a 4-nt loop), and also the eukaryote-specific helices E23-6 to E23-9. On the other hand, G. muris and G. ardeae have a shorter helix 10: a 2-bp stem and a 6-nt loop in G. ardeae and a 3-bp stem and a 6-nt loop in G. muris. In the V4, they have a single long helix (like the P23-1 helix in prokaryotes) instead of the helices E23-6 to E23-9. Among the four Giardia species, co-appearance of prokaryote- and eukaryote-typical features might be significant evidence to suggest that Giardia (Archezoa) is a living fossil showing an "intermediate stage" during the evolution from prokaryotes to eukaryotes.

  15. Classification and Lineage Tracing of SH2 Domains Throughout Eukaryotes.

    Science.gov (United States)

    Liu, Bernard A

    2017-01-01

    Today there exists a rapidly expanding number of sequenced genomes. Cataloging protein interaction domains such as the Src Homology 2 (SH2) domain across these various genomes can be accomplished with ease due to existing algorithms and prediction models. An evolutionary analysis of SH2 domains provides a step towards understanding how SH2 proteins integrated with existing signaling networks to position phosphotyrosine signaling as a crucial driver of robust cellular communication networks in metazoans. However, organizing and tracing SH2 domains across organisms and understanding their evolutionary trajectory remains a challenge. This chapter describes several methodologies for analyzing the evolutionary trajectory of SH2 domains, including a global SH2 domain classification system, which facilitates annotation of new SH2 sequences essential for tracing the lineage of SH2 domains throughout eukaryote evolution. This classification utilizes a combination of sequence homology, protein domain architecture and the boundary positions between introns and exons within the SH2 domain or genes encoding these domains. Discrete SH2 families can then be traced across various genomes to provide insight into their origins. Furthermore, additional methods for examining potential mechanisms for divergence of SH2 domains, from structural changes to alterations in the protein domain content and genome duplication, will be discussed. A better understanding of SH2 domain evolution may therefore enhance our insight into the emergence of phosphotyrosine signaling and the expansion of protein interaction domains.

  16. The three-dimensional genome organization of Drosophila melanogaster through data integration.

    Science.gov (United States)

    Li, Qingjiao; Tjong, Harianto; Li, Xiao; Gong, Ke; Zhou, Xianghong Jasmine; Chiolo, Irene; Alber, Frank

    2017-07-31

    Genome structures are dynamic and non-randomly organized in the nucleus of higher eukaryotes. To maximize the accuracy and coverage of three-dimensional genome structural models, it is important to integrate all available sources of experimental information about a genome's organization. It remains a major challenge to integrate such data from various complementary experimental methods. Here, we present an approach for data integration to determine a population of complete three-dimensional genome structures that are statistically consistent with data from both genome-wide chromosome conformation capture (Hi-C) and lamina-DamID experiments. Our structures resolve the genome at the resolution of topological domains, and reproduce simultaneously both sets of experimental data. Importantly, this data deconvolution framework allows for structural heterogeneity between cells, and hence accounts for the expected plasticity of genome structures. As a case study we choose Drosophila melanogaster embryonic cells, for which both data types are available. Our three-dimensional genome structures have strong predictive power for structural features not directly visible in the initial data sets, and reproduce experimental hallmarks of the D. melanogaster genome organization from independent and our own imaging experiments. Also they reveal a number of new insights about genome organization and its functional relevance, including the preferred locations of heterochromatic satellites of different chromosomes, and observations about homologous pairing that cannot be directly observed in the original Hi-C or lamina-DamID data. Our approach allows systematic integration of Hi-C and lamina-DamID data for complete three-dimensional genome structure calculation, while also explicitly considering genome structural variability.

  17. Genome-wide analyses and functional classification of proline repeat-rich proteins: potential role of eIF5A in eukaryotic evolution.

    Directory of Open Access Journals (Sweden)

    Ajeet Mandal

    The eukaryotic translation factor eIF5A has recently been reported as a sequence-specific elongation factor that facilitates peptide bond formation at consecutive prolines in Saccharomyces cerevisiae, as its ortholog elongation factor P (EF-P) does in bacteria. We have searched the genome databases of 35 representative organisms from six kingdoms of life for PPP (Pro-Pro-Pro) and/or PPG (Pro-Pro-Gly)-encoding genes whose expression is expected to depend on eIF5A. We have made detailed analyses of proteome data of 5 selected species, Escherichia coli, Saccharomyces cerevisiae, Drosophila melanogaster, Mus musculus and Homo sapiens. The frequencies of the PPP and PPG motifs are low in the prokaryotic proteomes. However, their frequencies markedly increase with the biological complexity of eukaryotic organisms, and are higher in newly derived proteins than in those orthologous proteins commonly shared in all species. Ontology classifications of S. cerevisiae and human genes encoding the highest level of polyprolines reveal their strong association with several specific biological processes, including actin/cytoskeletal associated functions, RNA splicing/turnover, DNA binding/transcription and cell signaling. Previously reported phenotypic defects in actin polarity and mRNA decay of eIF5A mutant strains are consistent with the proposed role for eIF5A in the translation of the polyproline-containing proteins. Of all the amino acid tandem repeats (≥3 amino acids), only the proline repeat frequency correlates with functional complexity of the five organisms examined. Taken together, these findings suggest the importance of proline repeat-rich proteins and a potential role for eIF5A and its hypusine modification pathway in the course of eukaryotic evolution.
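
    The kind of motif search described above can be sketched as follows. This is a minimal illustration, not the procedure used in the study: it assumes a proteome is available as a mapping from protein name to one-letter amino acid sequence, and it reports the fraction of proteins that contain at least one PPP or PPG motif. The function names and the toy sequences are hypothetical.

```python
def has_polyproline_motif(protein_seq, motifs=("PPP", "PPG")):
    """True if the sequence contains at least one of the given motifs."""
    seq = protein_seq.upper()
    return any(motif in seq for motif in motifs)


def motif_frequency(proteome):
    """Fraction of proteins (dict: name -> sequence) with a PPP/PPG motif."""
    if not proteome:
        return 0.0
    hits = sum(1 for seq in proteome.values() if has_polyproline_motif(seq))
    return hits / len(proteome)


# Made-up sequences for illustration only.
toy_proteome = {
    "protA": "MKPPPGLS",  # contains PPP (and PPG)
    "protB": "MAGTLLK",   # no motif
    "protC": "MSPPGQ",    # contains PPG
    "protD": "MPAPAP",    # prolines present, but not consecutive
}

print(motif_frequency(toy_proteome))  # 0.5
```

    Comparing this fraction across proteomes would be one simple way to express the trend reported in the abstract; how the study itself normalized its counts is not stated here.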

  18. Clusters of orthologous genes for 41 archaeal genomes and implications for evolutionary genomics of archaea

    OpenAIRE

    Wolf Yuri I; Novichkov Pavel S; Sorokin Alexander V; Makarova Kira S; Koonin Eugene V

    2007-01-01

    Background: An evolutionary classification of genes from sequenced genomes that distinguishes between orthologs and paralogs is indispensable for genome annotation and evolutionary reconstruction. Shortly after multiple genome sequences of bacteria, archaea, and unicellular eukaryotes became available, an attempt at such a classification was implemented in Clusters of Orthologous Groups of proteins (COGs). Rapid accumulation of genome sequences creates opportunities for refining COGs ...

  19. How and why DNA barcodes underestimate the diversity of microbial eukaryotes.

    Directory of Open Access Journals (Sweden)

    Gwenael Piganeau

    BACKGROUND: Because many picoplanktonic eukaryotic species cannot currently be maintained in culture, direct sequencing of PCR-amplified 18S ribosomal gene DNA fragments from filtered sea-water has been successfully used to investigate the astounding diversity of these organisms. The recognition of many novel planktonic organisms is thus based solely on their 18S rDNA sequence. However, a species delimited by its 18S rDNA sequence might contain many cryptic species, which are highly differentiated in their protein coding sequences. PRINCIPAL FINDINGS: Here, we investigate the issue of species identification from one gene to the whole genome sequence. Using 52 whole genome DNA sequences, we estimated the global genetic divergence in protein coding genes between organisms from different lineages and compared this to their ribosomal gene sequence divergences. We show that this relationship between proteome divergence and 18S divergence is lineage dependent. Unicellular lineages have especially low 18S divergences relative to their protein sequence divergences, suggesting that 18S ribosomal genes are too conservative to assess planktonic eukaryotic diversity. We provide an explanation for this lineage dependency, which suggests that most species with large effective population sizes will show far less divergence in 18S than protein coding sequences. CONCLUSIONS: There is therefore a trade-off between using genes that are easy to amplify in all species, but which by their nature are highly conserved and underestimate the true number of species, and using genes that give a better description of the number of species, but which are more difficult to amplify. We have shown that this trade-off differs between unicellular and multicellular organisms as a likely consequence of differences in effective population sizes. We anticipate that biodiversity of microbial eukaryotic species is underestimated and that numerous "cryptic species" will become

  20. A Collection of Algal Genomes from the JGI

    Energy Technology Data Exchange (ETDEWEB)

    Kuo, Alan; Grigoriev, Igor

    2012-03-19

    Algae, defined as photosynthetic eukaryotes other than plants, constitute a major component of fundamental eukaryotic diversity. Acquisition of the ability to conduct oxygenic photosynthesis through endosymbiotic events has been a principal driver of eukaryotic evolution, and today algae continue to underpin aquatic food chains as primary producers. Algae play profound roles in the carbon cycle, can impose health and economic costs through toxic blooms, and are candidate sources for bio-fuels; all of these research areas are part of the mission of DOE's Joint Genome Institute (JGI). A collection of algal projects ongoing at JGI contributes to each of these areas and illustrates analyses employed in their genome exploration.

  1. Conservation and Variability of Meiosis Across the Eukaryotes.

    Science.gov (United States)

    Loidl, Josef

    2016-11-23

    Comparisons among a variety of eukaryotes have revealed considerable variability in the structures and processes involved in their meiosis. Nevertheless, conventional forms of meiosis occur in all major groups of eukaryotes, including early-branching protists. This finding confirms that meiosis originated in the common ancestor of all eukaryotes and suggests that primordial meiosis may have had many characteristics in common with conventional extant meiosis. However, it is possible that the synaptonemal complex and the delicate crossover control related to its presence were later acquisitions. Later still, modifications to meiotic processes occurred within different groups of eukaryotes. Better knowledge on the spectrum of derived and uncommon forms of meiosis will improve our understanding of many still mysterious aspects of the meiotic process and help to explain the evolutionary basis of functional adaptations to the meiotic program.

  2. A second pathway to degrade pyrimidine nucleic acid precursors in eukaryotes

    DEFF Research Database (Denmark)

    Andersen, Gorm; Bjornberg, Olof; Polakova, Silvia

    2008-01-01

    Pyrimidine bases are the central precursors for RNA and DNA, and their intracellular pools are determined by de novo, salvage and catabolic pathways. In eukaryotes, degradation of uracil has been believed to proceed only via the reduction to dihydrouracil. Using a yeast model, Saccharomyces kluyv... of the eukaryotic or prokaryotic genes involved in pyrimidine degradation described to date. ..., respectively. The gene products of URC1 and URC4 are highly conserved proteins with so far unknown functions and they are present in a variety of prokaryotes and fungi. In bacteria and in some fungi, URC1 and URC4 are linked on the genome together with the gene for uracil phosphoribosyltransferase (URC6). Urc1...

  3. Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites

    DEFF Research Database (Denmark)

    Nielsen, Henrik; Engelbrecht, Jacob; Brunak, Søren

    1997-01-01

    We have developed a new method for the identification of signal peptides and their cleavage based on neural networks trained on separate sets of prokaryotic and eukaryotic sequence. The method performs significantly better than previous prediction schemes and can easily be applied on genome...

  4. Elevated Rate of Genome Rearrangements in Radiation-Resistant Bacteria.

    Science.gov (United States)

    Repar, Jelena; Supek, Fran; Klanjscek, Tin; Warnecke, Tobias; Zahradka, Ksenija; Zahradka, Davor

    2017-04-01

    A number of bacterial, archaeal, and eukaryotic species are known for their resistance to ionizing radiation. One of the challenges these species face is a potent environmental source of DNA double-strand breaks, potential drivers of genome structure evolution. Efficient and accurate DNA double-strand break repair systems have been demonstrated in several unrelated radiation-resistant species and are putative adaptations to the DNA damaging environment. Such adaptations are expected to compensate for the genome-destabilizing effect of environmental DNA damage and may be expected to result in a more conserved gene order in radiation-resistant species. However, here we show that rates of genome rearrangements, measured as loss of gene order conservation with time, are higher in radiation-resistant species in multiple, phylogenetically independent groups of bacteria. Comparison of indicators of selection for genome organization between radiation-resistant and phylogenetically matched, nonresistant species argues against tolerance to disruption of genome structure as a strategy for radiation resistance. Interestingly, an important mechanism affecting genome rearrangements in prokaryotes, the symmetrical inversions around the origin of DNA replication, shapes genome structure of both radiation-resistant and nonresistant species. In conclusion, the opposing effects of environmental DNA damage and DNA repair result in elevated rates of genome rearrangements in radiation-resistant bacteria. Copyright © 2017 Repar et al.

  5. Informational laws of genome structures

    Science.gov (United States)

    Bonnici, Vincenzo; Manca, Vincenzo

    2016-06-01

    In recent years, the analysis of genomes by means of strings of length k occurring in the genomes, called k-mers, has provided important insights into the basic mechanisms and design principles of genome structures. In the present study, we focus on the proper choice of the value of k for applying information theoretic concepts that express intrinsic aspects of genomes. The value k = lg2(n), where n is the genome length, is determined to be the best choice in the definition of some genomic informational indexes that are studied and computed for seventy genomes. These indexes, which are based on information entropies and on suitable comparisons with random genomes, suggest five informational laws, which all of the considered genomes obey. Moreover, an informational genome complexity measure is proposed, which is a generalized logistic map that balances entropic and anti-entropic components of genomes and is related to their evolutionary dynamics. Finally, applications to computational synthetic biology are briefly outlined.
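
    The choice k = lg2(n) and an entropy computed over k-mers can be sketched as follows. This is a rough illustration under stated assumptions, not the indexes defined in the study: lg2 is read as the base-2 logarithm, k is rounded to the nearest integer, and the entropy is the plain Shannon entropy of the empirical distribution of overlapping k-mers. The function name and the toy sequence are hypothetical.

```python
import math
from collections import Counter


def kmer_entropy(genome, k=None):
    """Return (k, Shannon entropy in bits) of the k-mers of a sequence.

    If k is not given, it is set to round(log2(n)), n being the sequence
    length (the rounding rule is an assumption of this sketch).
    """
    n = len(genome)
    if n == 0:
        raise ValueError("empty sequence")
    if k is None:
        k = max(1, round(math.log2(n)))
    # Count every overlapping window of length k.
    counts = Counter(genome[i:i + k] for i in range(n - k + 1))
    total = sum(counts.values())
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    return k, entropy


toy_genome = "ACGT" * 4  # 16 bases, so k = log2(16) = 4
k, h = kmer_entropy(toy_genome)
print(k, round(h, 3))
```

    A highly repetitive sequence such as this toy example yields an entropy well below the maximum of log2(n - k + 1) bits; comparisons of this kind against random sequences of the same length are in the spirit of the approach described above.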

  6. Genome-wide analysis of putative peroxiredoxin in unicellular and filamentous cyanobacteria

    Directory of Open Access Journals (Sweden)

    Cui Hongli

    2012-11-01

    Background: Cyanobacteria are photoautotrophic prokaryotes with wide variations in genome sizes and ecological habitats. Peroxiredoxin (PRX) is an important protein that plays essential roles in protecting own cells against reactive oxygen species (ROS). PRXs have been identified from mammals, fungi and higher plants. However, knowledge on cyanobacterial PRXs still remains obscure. With the availability of 37 sequenced cyanobacterial genomes, we performed a comprehensive comparative analysis of PRXs and explored their diversity, distribution, domain structure and evolution. Results: Overall 244 putative prx genes were identified, which were abundant in filamentous diazotrophic cyanobacteria, Acaryochloris marina MBIC 11017, and unicellular cyanobacteria inhabiting freshwater and hot-springs, while poor in all Prochlorococcus and marine Synechococcus strains. Among these putative genes, 25 open reading frames (ORFs) encoding hypothetical proteins were identified as prx gene family members and the others were already annotated as prx genes. All 244 putative PRXs were classified into five major subfamilies (1-Cys, 2-Cys, BCP, PRX5_like, and PRX-like) according to their domain structures. The catalytic motifs of the cyanobacterial PRXs were similar to those of eukaryotic PRXs and highly conserved in all but the PRX-like subfamily. Classical motif (CXXC) of thioredoxin was detected in protein sequences from the PRX-like subfamily. Phylogenetic tree constructed of catalytic domains coincided well with the domain structures of PRXs and the phylogenies based on 16S rRNA. Conclusions: The distribution of genes encoding PRXs in different unicellular and filamentous cyanobacteria especially those sub-families like PRX-like or 1-Cys PRX correlate with the genome size, eco-physiology, and physiological properties of the organisms. Cyanobacterial and eukaryotic PRXs share similar conserved motifs, indicating that cyanobacteria adopt similar catalytic

  7. Genome-wide analysis of putative peroxiredoxin in unicellular and filamentous cyanobacteria.

    Science.gov (United States)

    Cui, Hongli; Wang, Yipeng; Wang, Yinchu; Qin, Song

    2012-11-16

    Cyanobacteria are photoautotrophic prokaryotes with wide variations in genome sizes and ecological habitats. Peroxiredoxin (PRX) is an important protein that plays essential roles in protecting own cells against reactive oxygen species (ROS). PRXs have been identified from mammals, fungi and higher plants. However, knowledge on cyanobacterial PRXs still remains obscure. With the availability of 37 sequenced cyanobacterial genomes, we performed a comprehensive comparative analysis of PRXs and explored their diversity, distribution, domain structure and evolution. Overall 244 putative prx genes were identified, which were abundant in filamentous diazotrophic cyanobacteria, Acaryochloris marina MBIC 11017, and unicellular cyanobacteria inhabiting freshwater and hot-springs, while poor in all Prochlorococcus and marine Synechococcus strains. Among these putative genes, 25 open reading frames (ORFs) encoding hypothetical proteins were identified as prx gene family members and the others were already annotated as prx genes. All 244 putative PRXs were classified into five major subfamilies (1-Cys, 2-Cys, BCP, PRX5_like, and PRX-like) according to their domain structures. The catalytic motifs of the cyanobacterial PRXs were similar to those of eukaryotic PRXs and highly conserved in all but the PRX-like subfamily. Classical motif (CXXC) of thioredoxin was detected in protein sequences from the PRX-like subfamily. Phylogenetic tree constructed of catalytic domains coincided well with the domain structures of PRXs and the phylogenies based on 16s rRNA. The distribution of genes encoding PRXs in different unicellular and filamentous cyanobacteria especially those sub-families like PRX-like or 1-Cys PRX correlate with the genome size, eco-physiology, and physiological properties of the organisms. Cyanobacterial and eukaryotic PRXs share similar conserved motifs, indicating that cyanobacteria adopt similar catalytic mechanisms as eukaryotes. All cyanobacterial PRX proteins

  8. Once in a lifetime: strategies for preventing re-replication in prokaryotic and eukaryotic cells

    DEFF Research Database (Denmark)

    Nielsen, Olaf; Løbner-Olesen, Anders

    2008-01-01

    DNA replication is an extremely accurate process and cells have evolved intricate control mechanisms to ensure that each region of their genome is replicated only once during S phase. Here, we compare what is known about the processes that prevent re-replication in prokaryotic and eukaryotic cells ... prokaryotes and eukaryotes are inactivated until the next cell cycle. Furthermore, in both systems the beta-clamp of the replicative polymerase associates with enzymatic activities that contribute to the inactivation of the helicase loaders. Finally, recent studies suggest that the control mechanism...

  9. Dynamic Architecture of Eukaryotic DNA Replication Forks In Vivo, Visualized by Electron Microscopy.

    Science.gov (United States)

    Zellweger, Ralph; Lopes, Massimo

    2018-01-01

    The DNA replication process can be heavily perturbed by several different conditions of genotoxic stress, particularly relevant for cancer onset and therapy. The combination of psoralen crosslinking and electron microscopy has proven instrumental to reveal the fine architecture of in vivo DNA replication intermediates and to uncover their remodeling upon specific conditions of genotoxic stress. The replication structures are stabilized in vivo (by psoralen crosslinking) prior to extraction and enrichment procedures, allowing their visualization at the transmission electron microscope. This chapter outlines the procedures required to visualize and interpret in vivo replication intermediates of eukaryotic genomic DNA, and includes an improved method for enrichment of replication intermediates, compared to previously used BND-cellulose columns.

  10. Hybrid and rogue kinases encoded in the genomes of model eukaryotes.

    Directory of Open Access Journals (Sweden)

    Ramaswamy Rakshambikai

    The highly modular nature of protein kinases generates diverse functional roles mediated by evolutionary events such as domain recombination, insertion and deletion of domains. Usually the domain architecture of a kinase is related to the subfamily to which the kinase catalytic domain belongs. However, outlier kinases with unusual domain architectures expand the functional space of the protein kinase family. For example, Src kinases are made up of SH2 and SH3 domains in addition to the kinase catalytic domain. A kinase that lacks these two domains but retains the sequence characteristics of the Src kinase catalytic domain is an outlier that is likely to have modes of regulation different from those of classical Src kinases. This study defines two types of outlier kinases, hybrids and rogues, depending on the nature of domain recombination. Hybrid kinases are those where the catalytic kinase domain belongs to one kinase subfamily but the domain architecture is typical of another kinase subfamily. Rogue kinases are those with a kinase catalytic domain characteristic of a kinase subfamily but a domain architecture typical of neither that subfamily nor any other kinase subfamily. This report provides a consolidated set of such hybrid and rogue kinases gleaned from six eukaryotic genomes (S. cerevisiae, D. melanogaster, C. elegans, M. musculus, T. rubripes and H. sapiens) and discusses their functions. The presence of such kinases necessitates revisiting the classification scheme of the protein kinase family using full-length sequences, in addition to the classical classification based solely on the sequences of kinase catalytic domains. The study of these kinases provides good insight for engineering signalling pathways for a desired output. Lastly, identification of hybrids and rogues in pathogenic protozoa such as P. falciparum sheds light on possible strategies in host-pathogen interactions.
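
    The hybrid and rogue definitions above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the reference table, the function name classify_kinase, and the use of exact set equality to decide whether an architecture is "typical" are all assumptions made for illustration only.

```python
from typing import Dict, FrozenSet, Iterable

# Hypothetical reference table (illustration only): accessory domains assumed
# to be typical of each kinase subfamily. A real table would be curated.
TYPICAL_ARCHITECTURE: Dict[str, FrozenSet[str]] = {
    "Src": frozenset({"SH2", "SH3"}),
    "PKC": frozenset({"C1", "C2"}),
}


def classify_kinase(catalytic_subfamily: str, accessory_domains: Iterable[str]) -> str:
    """Label a kinase as 'canonical', 'hybrid', or 'rogue'.

    Assumption: an architecture counts as "typical" of a subfamily only when
    its accessory-domain set equals the table entry exactly.
    """
    domains = frozenset(accessory_domains)

    # Architecture matches the subfamily of the catalytic domain.
    if domains == TYPICAL_ARCHITECTURE.get(catalytic_subfamily):
        return "canonical"

    # Hybrid: architecture is typical of a different subfamily.
    for subfamily, typical in TYPICAL_ARCHITECTURE.items():
        if subfamily != catalytic_subfamily and domains == typical:
            return "hybrid"

    # Rogue: architecture is typical of no subfamily in the table.
    return "rogue"
```

    Under these assumptions, a Src-type catalytic domain with no SH2 or SH3 domains is labelled "rogue", while the same catalytic domain paired with the architecture listed for another subfamily is labelled "hybrid".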

  11. 2004 Structural, Function and Evolutionary Genomics

    Energy Technology Data Exchange (ETDEWEB)

    Douglas L. Brutlag; Nancy Ryan Gray

    2005-03-23

    This Gordon conference will cover the areas of structural, functional and evolutionary genomics. It will take a systematic approach to genomics, examining the evolution of proteins, protein functional sites, protein-protein interactions, regulatory networks, and metabolic networks. Emphasis will be placed on what we can learn from comparative genomics and entire genomes and proteomes.

  12. A Three-Dimensional Model of the Yeast Genome

    Science.gov (United States)

    Noble, William; Duan, Zhi-Jun; Andronescu, Mirela; Schutz, Kevin; McIlwain, Sean; Kim, Yoo Jung; Lee, Choli; Shendure, Jay; Fields, Stanley; Blau, C. Anthony

    Layered on top of information conveyed by DNA sequence and chromatin are higher order structures that encompass portions of chromosomes, entire chromosomes, and even whole genomes. Interphase chromosomes are not positioned randomly within the nucleus, but instead adopt preferred conformations. Disparate DNA elements co-localize into functionally defined aggregates or factories for transcription and DNA replication. In budding yeast, Drosophila and many other eukaryotes, chromosomes adopt a Rabl configuration, with arms extending from centromeres adjacent to the spindle pole body to telomeres that abut the nuclear envelope. Nonetheless, the topologies and spatial relationships of chromosomes remain poorly understood. Here we developed a method to globally capture intra- and inter-chromosomal interactions, and applied it to generate a map at kilobase resolution of the haploid genome of Saccharomyces cerevisiae. The map recapitulates known features of genome organization, thereby validating the method, and identifies new features. Extensive regional and higher order folding of individual chromosomes is observed. Chromosome XII exhibits a striking conformation that implicates the nucleolus as a formidable barrier to interaction between DNA sequences at either end. Inter-chromosomal contacts are anchored by centromeres and include interactions among transfer RNA genes, among origins of early DNA replication and among sites where chromosomal breakpoints occur. Finally, we constructed a three-dimensional model of the yeast genome. Our findings provide a glimpse of the interface between the form and function of a eukaryotic genome.

  13. Distinguishing friends, foes, and freeloaders in giant genomes.

    Science.gov (United States)

    Bennetzen, Jeffrey L; Park, Minkyu

    2018-04-01

    Most annotations of large eukaryotic genomes initially find transposable elements (TEs) and other repeats, then mask them so that subsequent efforts can be concentrated on the annotation and study of non-TE genes. However, TEs often contribute to host biology, and their community biologies are of intrinsic interest. This review discusses the challenges, rationale and technologies for comprehensive TE annotation in the commonly giant genomes of animals and plants. Complete discovery of the TEs in a fully sequenced genome is laborious, but feasible, with current strategies in the hands of a careful researcher. These deep TE studies have begun to provide important perspectives on how genomes evolve and the degree to which genome changes do and do not affect eukaryotic biology. Copyright © 2018 The Authors. Published by Elsevier Ltd. All rights reserved.

  14. Gene Transfer in Eukaryotic Cells Using Activated Dendrimers

    Science.gov (United States)

    Dennig, Jörg

    Gene transfer into eukaryotic cells plays an important role in cell biology. Over the last 30 years a number of transfection methods have been developed to mediate gene transfer into eukaryotic cells. Classical methods include co-precipitation of DNA with calcium phosphate, charge-dependent precipitation of DNA with DEAE-dextran, electroporation of nucleic acids, and formation of transfection complexes between DNA and cationic liposomes. Gene transfer technologies based on activated PAMAM-dendrimers provide another class of transfection reagents. PAMAM-dendrimers are highly branched, spherical molecules. Activation of newly synthesized dendrimers involves hydrolytic removal of some of the branches, and results in a molecule with a higher degree of flexibility. Activated dendrimers assemble DNA into compact structures via charge interactions. Activated dendrimer - DNA complexes bind to the cell membrane of eukaryotic cells, and are transported into the cell by non-specific endocytosis. A structural model of the activated dendrimer - DNA complex and a potential mechanism for its uptake into cells will be discussed.

  15. Archaeal MCM Proteins as an Analog for the Eukaryotic Mcm2–7 Helicase to Reveal Essential Features of Structure and Function

    Science.gov (United States)

    Miller, Justin M.; Enemark, Eric J.

    2015-01-01

    In eukaryotes, the replicative helicase is the large multisubunit CMG complex consisting of the Mcm2–7 hexameric ring, Cdc45, and the tetrameric GINS complex. The Mcm2–7 ring assembles from six different, related proteins and forms the core of this complex. In archaea, a homologous MCM hexameric ring functions as the replicative helicase at the replication fork. Archaeal MCM proteins form thermostable homohexamers, facilitating their use as models of the eukaryotic Mcm2–7 helicase. Here we review archaeal MCM helicase structure and function and how the archaeal findings relate to the eukaryotic Mcm2–7 ring. PMID:26539061

  16. Crystal Structure of the Homing Endonuclease I-CvuI Provides a New Template for Genome Modification

    DEFF Research Database (Denmark)

    Molina, Rafael; Redondo, Pilar; López-Méndez, Blanca

    2015-01-01

    Homing endonucleases recognize and generate a DNA double-strand break, which has been used to promote gene targeting. These enzymes recognize long DNA stretches; they are highly sequence-specific enzymes and display a very low frequency of cleavage even in complete genomes. Although a large number...... of homing endonucleases have been identified, the landscape of possible target sequences is still very limited to cover the complexity of the whole eukaryotic genome. Therefore, the finding and molecular analysis of homing endonucleases identified but not yet characterized may widen the landscape...

  17. Genome of Phaeocystis globosa virus PgV-16T highlights the common ancestry of the largest known DNA viruses infecting eukaryotes

    Science.gov (United States)

    Santini, Sebastien; Jeudy, Sandra; Bartoli, Julia; Poirot, Olivier; Lescot, Magali; Abergel, Chantal; Barbe, Valérie; Wommack, K. Eric; Noordeloos, Anna A. M.; Brussaard, Corina P. D.; Claverie, Jean-Michel

    2013-01-01

    Large dsDNA viruses are involved in the population control of many globally distributed species of eukaryotic phytoplankton and have a prominent role in bloom termination. The genus Phaeocystis (Haptophyta, Prymnesiophyceae) includes several high-biomass-forming phytoplankton species, such as Phaeocystis globosa, the blooms of which occur mostly in the coastal zone of the North Atlantic and the North Sea. Here, we report the 459,984-bp-long genome sequence of P. globosa virus strain PgV-16T, encoding 434 proteins and eight tRNAs and, thus, the largest fully sequenced genome to date among viruses infecting algae. Surprisingly, PgV-16T exhibits no phylogenetic affinity with other viruses infecting microalgae (e.g., phycodnaviruses), including those infecting Emiliania huxleyi, another ubiquitous bloom-forming haptophyte. Rather, PgV-16T belongs to an emerging clade (the Megaviridae) clustering the viruses endowed with the largest known genomes, including Megavirus, Mimivirus (both infecting acanthamoeba), and a virus infecting the marine microflagellate grazer Cafeteria roenbergensis. Seventy-five percent of the best matches of PgV-16T–predicted proteins correspond to two viruses [Organic Lake phycodnavirus (OLPV)1 and OLPV2] from a hypersaline lake in Antarctica (Organic Lake), the hosts of which are unknown. As for OLPVs and other Megaviridae, the PgV-16T sequence data revealed the presence of a virophage-like genome. However, no virophage particle was detected in infected P. globosa cultures. The presence of many genes found only in Megaviridae in its genome and the presence of an associated virophage strongly suggest that PgV-16T shares a common ancestry with the largest known dsDNA viruses, the host range of which already encompasses the earliest diverging branches of domain Eukarya. PMID:23754393

  18. [Compartmentalization of the cell nucleus and spatial organization of the genome].

    Science.gov (United States)

    Gavrilov, A A; Razin, S V

    2015-01-01

    The eukaryotic cell nucleus is one of the most complex cell organelles. Despite the absence of membranes, the nuclear space is divided into numerous compartments where different processes involved in the genome activity take place. The most important nuclear compartments include nucleoli, nuclear speckles, PML bodies, Cajal bodies, histone locus bodies, Polycomb bodies, insulator bodies, transcription and replication factories. The structural basis for the nuclear compartmentalization is provided by genomic DNA that occupies most of the nuclear volume. Nuclear compartments, in turn, guide the chromosome folding by providing a platform for the spatial interaction of individual genomic loci. In this review, we discuss fundamental principles of higher order genome organization with a focus on chromosome territories and chromosome domains, as well as consider the structure and function of the key nuclear compartments. We show that the functional compartmentalization of the cell nucleus and genome spatial organization are tightly interconnected, and that this form of organization is highly dynamic and is based on stochastic processes.

  19. Functional 5' UTR mRNA structures in eukaryotic translation regulation and how to find them.

    Science.gov (United States)

    Leppek, Kathrin; Das, Rhiju; Barna, Maria

    2018-03-01

    RNA molecules can fold into intricate shapes that can provide an additional layer of control of gene expression beyond that of their sequence. In this Review, we discuss the current mechanistic understanding of structures in 5' untranslated regions (UTRs) of eukaryotic mRNAs and the emerging methodologies used to explore them. These structures may regulate cap-dependent translation initiation through helicase-mediated remodelling of RNA structures and higher-order RNA interactions, as well as cap-independent translation initiation through internal ribosome entry sites (IRESs), mRNA modifications and other specialized translation pathways. We discuss known 5' UTR RNA structures and how new structure probing technologies coupled with prospective validation, particularly compensatory mutagenesis, are likely to identify classes of structured RNA elements that shape post-transcriptional control of gene expression and the development of multicellular organisms.

  20. Genomic hypomethylation in the human germline associates with selective structural mutability in the human genome.

    Directory of Open Access Journals (Sweden)

    Jian Li

    The hotspots of structural polymorphisms and structural mutability in the human genome remain to be explained mechanistically. We examine associations of structural mutability with germline DNA methylation and with non-allelic homologous recombination (NAHR) mediated by low-copy repeats (LCRs). Combined evidence from four human sperm methylome maps, human genome evolution, structural polymorphisms in the human population, and previous genomic and disease studies consistently points to a strong association of germline hypomethylation and genomic instability. Specifically, methylation deserts, the ~1% fraction of the human genome with the lowest methylation in the germline, show a tenfold enrichment for structural rearrangements that occurred in the human genome since the branching of chimpanzee and are highly enriched for fast-evolving loci that regulate tissue-specific gene expression. Analysis of copy number variants (CNVs) from 400 human samples identified using a custom-designed array comparative genomic hybridization (aCGH) chip, combined with publicly available structural variation data, indicates that association of structural mutability with germline hypomethylation is comparable in magnitude to the association of structural mutability with LCR-mediated NAHR. Moreover, rare CNVs occurring in the genomes of individuals diagnosed with schizophrenia, bipolar disorder, and developmental delay and de novo CNVs occurring in those diagnosed with autism are significantly more concentrated within hypomethylated regions. These findings suggest a new connection between the epigenome, selective mutability, evolution, and human disease.
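
    To make the notion of "tenfold enrichment" in a ~1% fraction of the genome concrete, the following Python sketch computes a fold enrichment as the share of events falling in a region divided by the share of the genome that region covers. This is a generic definition assumed here for illustration, not necessarily the statistic used in the study, and the function name and all numbers in the usage note are hypothetical.

```python
def fold_enrichment(events_in_region: int, total_events: int,
                    region_bp: int, genome_bp: int) -> float:
    """Observed share of events in a region relative to the region's share
    of the genome. A value of 1.0 means no enrichment."""
    if total_events <= 0 or region_bp <= 0 or genome_bp <= 0:
        raise ValueError("counts and lengths must be positive")
    # (events_in_region / total_events) / (region_bp / genome_bp),
    # rearranged so that a single division is performed.
    return (events_in_region * genome_bp) / (total_events * region_bp)
```

    With made-up numbers, if 10 of 100 rearrangements fall in a region covering 1% of the genome, fold_enrichment(10, 100, 1, 100) returns 10.0, i.e. a tenfold enrichment.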

  1. Intermittency as a universal characteristic of the complete chromosome DNA sequences of eukaryotes: From protozoa to human genomes

    Science.gov (United States)

    Rybalko, S.; Larionov, S.; Poptsova, M.; Loskutov, A.

    2011-10-01

    Large-scale dynamical properties of complete chromosome DNA sequences of eukaryotes are considered. Using the proposed deterministic models with intermittency and symbolic dynamics we describe a wide spectrum of large-scale patterns inherent in these sequences, such as segmental duplications, tandem repeats, and other complex sequence structures. It is shown that the recently discovered gene number balance on the strands is not of a random nature, and certain subsystems of a complete chromosome DNA sequence exhibit the properties of deterministic chaos.

  2. Genomics technologies to study structural variations in the grapevine genome

    Directory of Open Access Journals (Sweden)

    Cardone Maria Francesca

    2016-01-01

    Grapevine is one of the most important crop plants in the world. Genomic resources for grapevine have recently expanded greatly, supporting growing efforts in molecular breeding. Current cultivars display a high level of inter-specific differentiation that needs to be investigated to reach a comprehensive understanding of the genetic basis of phenotypic differences, and to find the responsible genes selected by cross-breeding programs. While there have been significant advances in resolving the pattern and nature of single nucleotide polymorphisms (SNPs) in plant genomes, few data are available on copy number variation (CNV). Furthermore, association between structural variations and phenotypes has been described in only a few cases. We combined high-throughput biotechnologies and bioinformatics tools to reveal the first inter-varietal atlas of structural variation (SV) for the grapevine genome. We sequenced four table grape cultivars and compared them with the genome of the Pinot noir inbred line PN40024 as the reference. We found that roughly 8% of the grapevine genome is affected by genomic variations. Taking into account the phenotypic differences among the studied varieties, we compared SVs among them and the reference, and then performed an in-depth analysis of the gene content of polymorphic regions. This allowed us to identify genes showing differences in copy number as putative functional candidates for important traits in grapevine cultivation.

  3. Recognition of extremophilic archaeal viruses by eukaryotic cells

    DEFF Research Database (Denmark)

    Uldahl, Kristine Buch; Wu, Linping; Hall, Arnaldur

    2016-01-01

    Viruses from the third domain of life, Archaea, exhibit unusual features including extreme stability that allow their survival in harsh environments. In addition, these species have never been reported to integrate into human or any other eukaryotic genomes, and could thus serve for exploration...... followed viral uptake, intracellular trafficking and cell viability in human endothelial cells of brain (hCMEC/D3 cells) and umbilical vein (HUVEC) origin. Whereas SMV1 is efficiently internalized into both types of human cells, SSV2 differentiates between HUVECs and hCMEC/D3 cells, thus opening a path...

  4. Requirements and standards for organelle genome databases

    Energy Technology Data Exchange (ETDEWEB)

    Boore, Jeffrey L.

    2006-01-09

    Mitochondria and plastids (collectively called organelles) descended from prokaryotes that adopted an intracellular, endosymbiotic lifestyle within early eukaryotes. Comparisons of their remnant genomes address a wide variety of biological questions, especially when including the genomes of their prokaryotic relatives and the many genes transferred to the eukaryotic nucleus during the transitions from endosymbiont to organelle. The pace of producing complete organellar genome sequences now makes it unfeasible to do broad comparisons using the primary literature and, even if it were feasible, it is now becoming uncommon for journals to accept detailed descriptions of genome-level features. Unfortunately no database is currently useful for this task, since they have little standardization and are riddled with error. Here I outline what is currently wrong and what must be done to make this data useful to the scientific community.

  5. Regulated eukaryotic DNA replication origin firing with purified proteins.

    Science.gov (United States)

    Yeeles, Joseph T P; Deegan, Tom D; Janska, Agnieszka; Early, Anne; Diffley, John F X

    2015-03-26

    Eukaryotic cells initiate DNA replication from multiple origins, which must be tightly regulated to promote precise genome duplication in every cell cycle. To accomplish this, initiation is partitioned into two temporally discrete steps: a double hexameric minichromosome maintenance (MCM) complex is first loaded at replication origins during G1 phase, and then converted to the active CMG (Cdc45-MCM-GINS) helicase during S phase. Here we describe the reconstitution of budding yeast DNA replication initiation with 16 purified replication factors, made from 42 polypeptides. Origin-dependent initiation recapitulates regulation seen in vivo. Cyclin-dependent kinase (CDK) inhibits MCM loading by phosphorylating the origin recognition complex (ORC) and promotes CMG formation by phosphorylating Sld2 and Sld3. Dbf4-dependent kinase (DDK) promotes replication by phosphorylating MCM, and can act either before or after CDK. These experiments define the minimum complement of proteins, protein kinase substrates and co-factors required for regulated eukaryotic DNA replication.

  6. Comparative radiobiology of genetic loci of eukaryots as the basis of the general theory of mutations

    International Nuclear Information System (INIS)

    Aleksandrov, I.D.

    1983-01-01

    One of the fundamental problems of modern molecular and cellular radiobiology is to reveal the general and specific processes by which gene mutations and chromosome aberrations form, at each stage of their formation, in the irradiated genome of higher eukaryotes. Solving this problem depends on developing research within the framework of comparative radiobiology of the genetic loci of higher eukaryotes, which makes it possible to study quantitative regularities in the formation of gene (point) mutations and chromosome aberrations in one object and in the same experiment.

  7. The genome of the obligate intracellular parasite Trachipleistophora hominis: new insights into microsporidian genome dynamics and reductive evolution.

    Directory of Open Access Journals (Sweden)

    Eva Heinz

    The dynamics of reductive genome evolution for eukaryotes living inside other eukaryotic cells are poorly understood compared to well-studied model systems involving obligate intracellular bacteria. Here we present 8.5 Mb of sequence from the genome of the microsporidian Trachipleistophora hominis, isolated from an HIV/AIDS patient, which is an outgroup to the smaller compacted-genome species that primarily inform ideas of evolutionary mode for these enormously successful obligate intracellular parasites. Our data provide detailed information on the gene content, genome architecture and intergenic regions of a larger microsporidian genome, while comparative analyses allowed us to infer genomic features and metabolism of the common ancestor of the species investigated. Gene length reduction and massive loss of metabolic capacity in the common ancestor was accompanied by the evolution of novel microsporidian-specific protein families, whose conservation among microsporidians, against a background of reductive evolution, suggests they may have important functions in their parasitic lifestyle. The ancestor had already lost many metabolic pathways but retained glycolysis and the pentose phosphate pathway to provide cytosolic ATP and reduced coenzymes, and it had a minimal mitochondrion (mitosome) making Fe-S clusters but not ATP. It possessed bacterial-like nucleotide transport proteins as a key innovation for stealing host-generated ATP, the machinery for RNAi, key elements of the early secretory pathway, canonical eukaryotic as well as microsporidian-specific regulatory elements, a diversity of repetitive and transposable elements, and relatively low average gene density. Microsporidian genome evolution thus appears to have proceeded in at least two major steps: an ancestral remodelling of the proteome upon transition to intracellular parasitism that involved reduction but also selective expansion, followed by a secondary compaction of genome

  8. Searching for genomic constraints

    Energy Technology Data Exchange (ETDEWEB)

    Lio', P. [Cambridge Univ. (United Kingdom). Genetics Dept.]; Ruffo, S. [Florence Univ. (Italy). Fac. di Ingegneria. Dipt. di Energetica 'S. Stecco']

    1998-01-01

    The authors have analyzed general properties of very long DNA sequences belonging to simple and complex organisms, by using different correlation methods. They have distinguished those base compositional rules that concern the entire genome, which they call 'genomic constraints', from the rules that depend on the 'external natural selection' acting on single genes, i.e. protein-centered constraints. They show that G + C content, purine/pyrimidine distributions and biological complexity of the organism are the most important factors which determine base compositional rules and genome complexity. Three main facts are reported here: bacteria with high G + C content have more restrictions on base composition than those with low G + C content; at constant G + C content, more complex organisms, ranging from prokaryotes to higher eukaryotes (e.g. human), display an increase of repeats 10-20 nucleotides long, which are also partly responsible for long-range correlations; word selection of length 3 to 10 is stronger in human and in bacteria for two distinct reasons. With respect to previous studies, they have also compared the genomic sequence of the archaeon Methanococcus jannaschii with those of bacteria and eukaryotes: it sometimes shows an intermediate statistical behaviour.
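
    Two of the quantities mentioned above, G + C content and the frequency of short "words" of a given length, can be computed with a few lines of Python. This is only a minimal sketch of the basic statistics; it does not reproduce the correlation methods used by the authors, and the function names gc_content and word_counts are chosen here for illustration.

```python
from collections import Counter


def gc_content(seq: str) -> float:
    """Fraction of G and C among the unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return (seq.count("G") + seq.count("C")) / acgt


def word_counts(seq: str, k: int) -> Counter:
    """Count every overlapping word (k-mer) of length k in the sequence."""
    if k <= 0:
        raise ValueError("k must be positive")
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
```

    For example, gc_content("GGCCAATT") gives 0.5, and word_counts("ACGACG", 3) counts "ACG" twice and "CGA" and "GAC" once each. Comparing such counts with those expected from base composition alone is one simple way to look for over- or under-represented words.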

  9. Searching for genomic constraints

    International Nuclear Information System (INIS)

    Lio', P.; Ruffo, S.

    1998-01-01

    The authors have analyzed general properties of very long DNA sequences belonging to simple and complex organisms, by using different correlation methods. They have distinguished those base compositional rules that concern the entire genome, which they call 'genomic constraints', from the rules that depend on the 'external natural selection' acting on single genes, i.e. protein-centered constraints. They show that G + C content, purine/pyrimidine distributions and biological complexity of the organism are the most important factors which determine base compositional rules and genome complexity. Three main facts are reported here: bacteria with high G + C content have more restrictions on base composition than those with low G + C content; at constant G + C content, more complex organisms, ranging from prokaryotes to higher eukaryotes (e.g. human), display an increase of repeats 10-20 nucleotides long, which are also partly responsible for long-range correlations; word selection of length 3 to 10 is stronger in human and in bacteria for two distinct reasons. With respect to previous studies, they have also compared the genomic sequence of the archaeon Methanococcus jannaschii with those of bacteria and eukaryotes: it sometimes shows an intermediate statistical behaviour.

  10. New insights into the structural organization of eukaryotic and prokaryotic cytoskeletons using cryo-electron tomography

    International Nuclear Information System (INIS)

    Kuerner, Julia; Medalia, Ohad; Linaroudis, Alexandros A.; Baumeister, Wolfgang

    2004-01-01

    Cryo-electron tomography (cryo-ET) is an emerging imaging technology that combines the potential of three-dimensional (3-D) imaging at molecular resolution (<5 nm) with a close-to-life preservation of the specimen. In conjunction with pattern recognition techniques, it enables us to map the molecular landscape inside cells. The application of cryo-ET to intact cells provides novel insights into the structure and the spatial organization of the cytoskeleton in prokaryotic and eukaryotic cells.

  11. Elevated Rate of Genome Rearrangements in Radiation-Resistant Bacteria

    OpenAIRE

    Repar, Jelena; Supek, Fran; Klanjscek, Tin; Warnecke, Tobias; Zahradka, Ksenija; Zahradka, Davor

    2017-01-01

    A number of bacterial, archaeal, and eukaryotic species are known for their resistance to ionizing radiation. One of the challenges these species face is a potent environmental source of DNA double-strand breaks, potential drivers of genome structure evolution. Efficient and accurate DNA double-strand break repair systems have been demonstrated in several unrelated radiation-resistant species and are putative adaptations to the DNA damaging environment. Such adaptations are expected to compen...

  12. Universal Temporal Profile of Replication Origin Activation in Eukaryotes

    Science.gov (United States)

    Goldar, Arach

    2011-03-01

    The complete and faithful transmission of the eukaryotic genome to daughter cells involves the timely duplication of the mother cell's DNA. DNA replication starts at multiple chromosomal positions called replication origins. From each activated replication origin, two replication forks progress in opposite directions and duplicate the mother cell's DNA. While it is widely accepted that in eukaryotic organisms replication origins are activated in a stochastic manner, little is known about the sources of the observed stochasticity. It is often attributed to variability across the population in the time of entry into S phase. We extract from a growing Saccharomyces cerevisiae population the average rate of origin activation in a single cell by combining single-molecule measurements and a numerical deconvolution technique. We show that the temporal profile of the rate of origin activation in a single cell is similar to the one extracted from a replicating cell population. Taking this observation into account, we exclude population variability as the origin of the observed stochasticity in origin activation. We confirm that the rate of origin activation increases in the early stage of S phase and decreases at the later stage. The population-average activation rate extracted from single-molecule analysis is in perfect accordance with the activation rate extracted from published micro-array data, thereby confirming the homogeneity and genome-scale invariance of the dynamics of the replication process. All these observations point toward a possible role of the replication fork in controlling the rate of origin activation.

  13. Genomes of coral dinoflagellate symbionts highlight evolutionary adaptations conducive to a symbiotic lifestyle

    KAUST Repository

    Aranda, Manuel

    2016-12-22

    Despite half a century of research, the biology of dinoflagellates remains enigmatic: they defy many functional and genetic traits attributed to typical eukaryotic cells. Genomic approaches to study dinoflagellates are often stymied due to their large, multi-gigabase genomes. Members of the genus Symbiodinium are photosynthetic endosymbionts of stony corals that provide the foundation of coral reef ecosystems. Their smaller genome sizes provide an opportunity to interrogate evolution and functionality of dinoflagellate genomes and endosymbiosis. We sequenced the genome of the ancestral Symbiodinium microadriaticum and compared it to the genomes of the more derived Symbiodinium minutum and Symbiodinium kawagutii and eukaryote model systems as well as transcriptomes from other dinoflagellates. Comparative analyses of genome and transcriptome protein sets show that all dinoflagellates, not only Symbiodinium, possess significantly more transmembrane transporters involved in the exchange of amino acids, lipids, and glycerol than other eukaryotes. Importantly, we find that only Symbiodinium harbor an extensive transporter repertoire associated with the provisioning of carbon and nitrogen. Analyses of these transporters show species-specific expansions, which provides a genomic basis to explain differential compatibilities to an array of hosts and environments, and highlights the putative importance of gene duplications as an evolutionary mechanism in dinoflagellates and Symbiodinium.

  14. Genomes of coral dinoflagellate symbionts highlight evolutionary adaptations conducive to a symbiotic lifestyle

    KAUST Repository

    Aranda, Manuel; Li, Yangyang; Liew, Yi Jin; Baumgarten, Sebastian; Simakov, O.; Wilson, M. C.; Piel, J.; Ashoor, Haitham; Bougouffa, Salim; Bajic, Vladimir B.; Ryu, Tae Woo; Ravasi, Timothy; Bayer, Till; Micklem, G.; Kim, H.; Bhak, J.; LaJeunesse, T. C.; Voolstra, Christian R.

    2016-01-01

    Despite half a century of research, the biology of dinoflagellates remains enigmatic: they defy many functional and genetic traits attributed to typical eukaryotic cells. Genomic approaches to study dinoflagellates are often stymied due to their large, multi-gigabase genomes. Members of the genus Symbiodinium are photosynthetic endosymbionts of stony corals that provide the foundation of coral reef ecosystems. Their smaller genome sizes provide an opportunity to interrogate evolution and functionality of dinoflagellate genomes and endosymbiosis. We sequenced the genome of the ancestral Symbiodinium microadriaticum and compared it to the genomes of the more derived Symbiodinium minutum and Symbiodinium kawagutii and eukaryote model systems as well as transcriptomes from other dinoflagellates. Comparative analyses of genome and transcriptome protein sets show that all dinoflagellates, not only Symbiodinium, possess significantly more transmembrane transporters involved in the exchange of amino acids, lipids, and glycerol than other eukaryotes. Importantly, we find that only Symbiodinium harbor an extensive transporter repertoire associated with the provisioning of carbon and nitrogen. Analyses of these transporters show species-specific expansions, which provides a genomic basis to explain differential compatibilities to an array of hosts and environments, and highlights the putative importance of gene duplications as an evolutionary mechanism in dinoflagellates and Symbiodinium.

  15. Functional Insights from Structural Genomics

    Energy Technology Data Exchange (ETDEWEB)

    Forouhar, F.; Kuzin, A.; Seetharaman, J.; Lee, I.; Zhou, W.; Abashidze, M.; Chen, Y.; Montelione, G.; Tong, L.; et al

    2007-01-01

    Structural genomics efforts have produced structural information, either directly or by modeling, for thousands of proteins over the past few years. While many of these proteins have known functions, a large percentage of them have not been characterized at the functional level. The structural information has provided valuable functional insights on some of these proteins, through careful structural analyses, serendipity, and structure-guided functional screening. Some of the success stories based on structures solved at the Northeast Structural Genomics Consortium (NESG) are reported here. These include a novel methyl salicylate esterase with an important role in plant innate immunity, a novel RNA methyltransferase (H. influenzae yggJ (HI0303)), a novel spermidine/spermine N-acetyltransferase (B. subtilis PaiA), a novel methyltransferase or AdoMet binding protein (A. fulgidus AF_0241), an ATP:cob(I)alamin adenosyltransferase (B. subtilis YvqK), a novel carboxysome pore (E. coli EutN), a proline racemase homolog with a disrupted active site (B. melitensis BME11586), an FMN-dependent enzyme (S. pneumoniae SP_1951), and a 12-stranded β-barrel with a novel fold (V. parahaemolyticus VPA1032).

  16. Communities of microbial eukaryotes in the mammalian gut within the context of environmental eukaryotic diversity

    Energy Technology Data Exchange (ETDEWEB)

    Parfrey, Laura Wegener; Walters, William A.; Lauber, Christian L.; Clemente, Jose C.; Berg-Lyons, Donna; Teiling, Clotilde; Kodira, Chinnappa; Mohiuddin, Mohammed; Brunelle, Julie; Driscoll, Mark; Fierer, Noah; Gilbert, Jack A.; Knight, Rob

    2014-06-19

    Eukaryotic microbes (protists) residing in the vertebrate gut influence host health and disease, but their diversity and distribution in healthy hosts are poorly understood. Protists found in the gut are typically considered parasites, but many are commensal and some are beneficial. Further, the hygiene hypothesis predicts that association with our co-evolved microbial symbionts may be important to overall health. It is therefore imperative that we understand the normal diversity of our eukaryotic gut microbiota to test for such effects and avoid eliminating commensal organisms. We assembled a dataset of healthy individuals from two populations, one with traditional, agrarian lifestyles and a second with modern, westernized lifestyles, and characterized the human eukaryotic microbiota via high-throughput sequencing. To place the human gut microbiota within a broader context, our dataset also includes gut samples from diverse mammals and samples from other aquatic and terrestrial environments. We curated the SILVA ribosomal database to reflect current knowledge of eukaryotic taxonomy and employed it as a phylogenetic framework to compare eukaryotic diversity across environments. We show that adults from the non-western population harbor a diverse community of protists, and diversity in the human gut is comparable to that in other mammals. However, the eukaryotic microbiota of the western population appears depauperate. The distribution of symbionts found in mammals reflects both host phylogeny and diet. Eukaryotic microbiota in the gut are less diverse and more patchily distributed than bacteria. More broadly, we show that eukaryotic communities in the gut are less diverse than in aquatic and terrestrial habitats, that few taxa are shared across habitat types, and that diversity patterns of eukaryotes are correlated with those observed for bacteria. These results outline the distribution and diversity of microbial eukaryotic communities in the mammalian gut and across

  17. Functional and evolutionary analysis of alternatively spliced genes is consistent with an early eukaryotic origin of alternative splicing

    Directory of Open Access Journals (Sweden)

    Penny David

    2007-10-01

    Background: Alternative splicing has been reported in various eukaryotic groups including plants, apicomplexans, diatoms, amoebae, animals and fungi. However, whether widespread alternative splicing has evolved independently in the different eukaryotic groups or was inherited from their last common ancestor, and may therefore predate multicellularity, is still unknown. To better understand the origin and evolution of alternative splicing and its usage in diverse organisms, we studied alternative splicing in 12 eukaryotic species, comparing rates of alternative splicing across genes of different functional classes, cellular locations, intron/exon structures and evolutionary origins. Results: For each species, we find that genes from most functional categories are alternatively spliced. Ancient genes (shared between animals, fungi and plants) show high levels of alternative splicing. Genes with products expressed in the nucleus or plasma membrane are generally more alternatively spliced, while those expressed in extracellular locations show less alternative splicing. We find a clear correspondence between incidence of alternative splicing and intron number per gene both within and between genomes. In general, we find several similarities in patterns of alternative splicing across these diverse eukaryotes. Conclusion: Along with previous studies indicating intron-rich genes with weak intron boundary consensus and complex spliceosomes in ancestral organisms, our results suggest that at least a simple form of alternative splicing may already have been present in the unicellular ancestor of plants, fungi and animals. A role for alternative splicing in the evolution of multicellularity then would largely have arisen by co-opting the preexisting process.

  18. Functions and structures of eukaryotic recombination proteins

    International Nuclear Information System (INIS)

    Ogawa, Tomoko

    1994-01-01

    We have found that Rad51 and RecA proteins form strikingly similar structures together with dsDNA and ATP. Their right-handed helical nucleoprotein filaments extend the B-form DNA double helix to 1.5 times its length and wind the helix. The similarity and uniqueness of their structures must reflect functional homologies between these proteins. Therefore, it is highly probable that similar recombination proteins are present in various organisms at different evolutionary stages. We have succeeded in cloning RAD51 genes from human, mouse, chicken and fission yeast, and found that the homologues are widely distributed in eukaryotes. The HsRad51 and MmRad51 or ChRad51 proteins consist of 339 amino acids, differing only by 4 or 12 amino acids, respectively, and are highly homologous to both yeast proteins, but less so to Dmc1. All of these proteins are homologous to the region from residues 33 to 240 of RecA, which was named the 'homologous core'. The homologous core is likely to be responsible for functions common to all of them, such as the formation of the helical nucleoprotein filament that is considered to be involved in homologous pairing in the recombination reaction. The mouse gene is transcribed at a high level in thymus, spleen, testis, and ovary, at a lower level in brain and at a still lower level in some other tissues. It is transcribed efficiently in recombination-active tissues. A clear functional difference of Rad51 homologues from RecA was suggested by the failure of heterologous genes to complement the deficiency of Scrad51 mutants. This failure seems to reflect the absence of a compatible partner, such as ScRad52 protein in the case of ScRad51 protein, between different species. Thus, these discoveries provide a starting point for understanding the fundamentals of gene targeting in mammalian cells and in gene therapy. (J.P.N.)

  19. Functional 5′ UTR mRNA structures in eukaryotic translation regulation and how to find them

    Science.gov (United States)

    Leppek, Kathrin; Das, Rhiju; Barna, Maria

    2017-01-01

    RNA molecules can fold into intricate shapes that can provide an additional layer of control of gene expression beyond that of their sequence. In this Review, we discuss the current mechanistic understanding of structures in 5′ untranslated regions (UTRs) of eukaryotic mRNAs and the emerging methodologies used to explore them. These structures may regulate cap-dependent translation initiation through helicase-mediated remodelling of RNA structures and higher-order RNA interactions, as well as cap-independent translation initiation through internal ribosome entry sites (IRESs), mRNA modifications and other specialized translation pathways. We discuss known 5′ UTR RNA structures and how new structure probing technologies coupled with prospective validation, particularly compensatory mutagenesis, are likely to identify classes of structured RNA elements that shape post-transcriptional control of gene expression and the development of multicellular organisms. PMID:29165424

  20. Structural genomic variation in ischemic stroke

    Science.gov (United States)

    Matarin, Mar; Simon-Sanchez, Javier; Fung, Hon-Chung; Scholz, Sonja; Gibbs, J. Raphael; Hernandez, Dena G.; Crews, Cynthia; Britton, Angela; Wavrant De Vrieze, Fabienne; Brott, Thomas G.; Brown, Robert D.; Worrall, Bradford B.; Silliman, Scott; Case, L. Douglas; Hardy, John A.; Rich, Stephen S.; Meschia, James F.; Singleton, Andrew B.

    2008-01-01

    Technological advances in molecular genetics allow rapid and sensitive identification of genomic copy number variants (CNVs). This, in turn, has sparked interest in the role such variation may play in disease. While a role for copy number mutations as a cause of Mendelian disorders is well established, it is unclear whether CNVs may affect risk for common complex disorders. We sought to investigate whether CNVs may modulate risk for ischemic stroke (IS) and to provide a catalog of CNVs in patients with this disorder by analyzing copy number metrics produced as a part of our previous genome-wide single-nucleotide polymorphism (SNP)-based association study of ischemic stroke in a North American white population. We examined CNVs in 263 patients with ischemic stroke (IS). Each identified CNV was compared with changes identified in 275 neurologically normal controls. Our analysis identified 247 CNVs, corresponding to 187 insertions (76%; 135 heterozygous; 25 homozygous duplications or triplications; 2 heterosomic) and 60 deletions (24%; 40 heterozygous deletions; 3 homozygous deletions; 14 heterosomic deletions). Most alterations (81%) were the same as, or overlapped with, previously reported CNVs. We report here the first genome-wide analysis of CNVs in IS patients. In summary, our study did not detect any common genomic structural variation unequivocally linked to IS, although we cannot exclude that smaller CNVs or CNVs in genomic regions poorly covered by this methodology may confer risk for IS. The application of genome-wide SNP arrays now facilitates the evaluation of structural changes through the entire genome as part of a genome-wide genetic association study. PMID:18288507

  1. The Big Bang of picorna-like virus evolution antedates the radiation of eukaryotic supergroups.

    Science.gov (United States)

    Koonin, Eugene V; Wolf, Yuri I; Nagasaki, Keizo; Dolja, Valerian V

    2008-12-01

    The recent discovery of RNA viruses in diverse unicellular eukaryotes and developments in evolutionary genomics have provided the means for addressing the origin of eukaryotic RNA viruses. The phylogenetic analyses of RNA polymerases and helicases presented in this Analysis article reveal close evolutionary relationships between RNA viruses infecting hosts from the Chromalveolate and Excavate supergroups and distinct families of picorna-like viruses of plants and animals. Thus, diversification of picorna-like viruses probably occurred in a 'Big Bang' concomitant with key events of eukaryogenesis. The origins of the conserved genes of picorna-like viruses are traced to likely ancestors including bacterial group II retroelements, the family of HtrA proteases and DNA bacteriophages.

  2. Comparative genomic analysis of multi-subunit tethering complexes demonstrates an ancient pan-eukaryotic complement and sculpting in Apicomplexa.

    Directory of Open Access Journals (Sweden)

    Christen M Klinger

    Apicomplexa are obligate intracellular parasites that cause tremendous disease burden world-wide. They utilize a set of specialized secretory organelles in their invasive process that require delivery of components for their biogenesis and function, yet the precise mechanisms underpinning such processes remain unclear. One set of potentially important components is the multi-subunit tethering complexes (MTCs), factors increasingly implicated in all aspects of vesicle-target interactions. Prompted by the results of previous studies indicating a loss of membrane trafficking factors in Apicomplexa, we undertook a bioinformatic analysis of MTC conservation. Building on knowledge of the ancient presence of most MTC proteins, we demonstrate the near complete retention of MTCs in the newly available genomes for Guillardia theta and Bigelowiella natans. The latter is a key taxonomic sampling point as a basal sister taxon to the group including Apicomplexa. We also demonstrate an ancient origin of the CORVET complex subunits Vps8 and Vps3, as well as the TRAPPII subunit Tca17. Having established that the lineage leading to Apicomplexa did at one point possess the complete eukaryotic complement of MTC components, we undertook a deeper taxonomic investigation in twelve apicomplexan genomes. We observed excellent conservation of the VpsC core of the HOPS and CORVET complexes, as well as the core TRAPP subunits, but sparse conservation of TRAPPII, COG, Dsl1, and HOPS/CORVET-specific subunits. However, those subunits that we did identify appear to be expressed with similar patterns to the fully conserved MTC proteins, suggesting that they may function as minimal complexes or with analogous partners. Strikingly, we failed to identify any subunits of the exocyst complex in any of the twelve apicomplexan genomes, or in the dinoflagellate Perkinsus marinus. Overall, we demonstrate reduction of MTCs in Apicomplexa and their ancestors, consistent with modification during

  3. Systematics of Short-range Correlations in Eukaryotic Genomes

    Science.gov (United States)

    Hameister, Jörn; Helm, Werner E.; Hütt, Marc-Thorsten; Dehnert, Manuel

    Attempts to identify a species on the basis of its DNA sequence on purely statistical grounds have been formulated for more than a decade. Solving this problem could have a huge impact on understanding processes of genome evolution and on the design of classification schemes for DNA sequences.

  4. Structure of Prokaryotic Polyamine Deacetylase Reveals Evolutionary Functional Relationships with Eukaryotic Histone Deacetylases

    Energy Technology Data Exchange (ETDEWEB)

    P Lombardi; H Angell; D Whittington; E Flynn; K Rajashankar; D Christianson

    2011-12-31

    Polyamines are a ubiquitous class of polycationic small molecules that can influence gene expression by binding to nucleic acids. Reversible polyamine acetylation regulates nucleic acid binding and is required for normal cell cycle progression and proliferation. Here, we report the structures of Mycoplana ramosa acetylpolyamine amidohydrolase (APAH) complexed with a transition state analogue and a hydroxamate inhibitor and an inactive mutant complexed with two acetylpolyamine substrates. The structure of APAH is the first of a histone deacetylase-like oligomer and reveals that an 18-residue insert in the L2 loop promotes dimerization and the formation of an 18 Å long 'L'-shaped active site tunnel at the dimer interface, accessible only to narrow and flexible substrates. The importance of dimerization for polyamine deacetylase function leads to the suggestion that a comparable dimeric or double-domain histone deacetylase could catalyze polyamine deacetylation reactions in eukaryotes.

  5. Meeting Report: Towards the Visualization of Genome Activity at Nanoscale Dimensions

    International Nuclear Information System (INIS)

    Ritland Politz, Joan C.

    2006-01-01

    A report on the Fifth Annual Nanostructural Genomics meeting, Bar Harbor, USA, 7-10 September 2005. It is a rare meeting where one can hear the latest developments in comparative genome analysis, relate these findings to advances in understanding both the linear and three-dimensional organization of the eukaryotic genome, and see it all beginning to fit into the context of the structure and function of the nucleus, visualized using state-of-the art labeling and microscopic techniques. These cross-disciplinary areas of research have been presented by a diverse group of scientists for the past five years at the Nanostructural Genomics meeting at the Jackson Laboratory in Bar Harbor, and the 2005 meeting again gave attendees much food for thought. In summary, the meeting provided a delightfully unique perspective on the application of exciting experimental breakthroughs at the interface of genomics, cell biology and optical physics.

  6. Characterization of an eukaryotic peptide deformylase from Plasmodium falciparum.

    Science.gov (United States)

    Bracchi-Ricard, V; Nguyen, K T; Zhou, Y; Rajagopalan, P T; Chakrabarti, D; Pei, D

    2001-12-15

    Ribosomal protein synthesis in eubacteria and eukaryotic organelles initiates with an N-formylmethionyl-tRNA(i), resulting in N-terminal formylation of all nascent polypeptides. Peptide deformylase (PDF) catalyzes the subsequent removal of the N-terminal formyl group from the majority of bacterial proteins. Until recently, PDF was thought to be an enzyme unique to the bacterial kingdom. Searches of the genomic DNA databases identified several genes from eukaryotic organisms that encode proteins with high sequence homology to bacterial PDF. The cDNA encoding Plasmodium falciparum PDF (PfPDF) has been cloned and overexpressed in Escherichia coli. The recombinant protein is catalytically active in deformylating N-formylated peptides, shares many of the properties of bacterial PDF, and is inhibited by specific PDF inhibitors. Western blot analysis indicated expression of mature PfPDF in trophozoite, schizont, and segmenter stages of intraerythrocytic development. These results provide strong evidence that a functional PDF is present in P. falciparum. In addition, PDF inhibitors inhibited the growth of P. falciparum in the intraerythrocytic culture. © 2001 Elsevier Science.

  7. RNA Export through the NPC in Eukaryotes.

    Science.gov (United States)

    Okamura, Masumi; Inose, Haruko; Masuda, Seiji

    2015-03-20

    In eukaryotic cells, RNAs are transcribed in the nucleus and exported to the cytoplasm through the nuclear pore complex (NPC). The RNA molecules that are exported from the nucleus into the cytoplasm include messenger RNAs (mRNAs), ribosomal RNAs (rRNAs), transfer RNAs (tRNAs), small nuclear RNAs (snRNAs), micro RNAs (miRNAs), and viral mRNAs. Each RNA is transported by a specific nuclear export receptor. It is believed that most of the mRNAs are exported by Nxf1 (Mex67 in yeast), whereas rRNAs, snRNAs, and a certain subset of mRNAs are exported in a Crm1/Xpo1-dependent manner. tRNAs and miRNAs are exported by Xpot and Xpo5. However, multiple export receptors are involved in the export of some RNAs, such as the 60S ribosomal subunit. In addition to these export receptors, some adapter proteins are required to export RNAs. The RNA export system of eukaryotic cells is also used by several types of RNA virus that depend on the machinery of the host cell nucleus for replication of their genome; therefore, this review describes the RNA export system of two representative viruses. We also discuss the NPC anchoring-dependent mRNA export factors that directly recruit specific genes to the NPC.

  8. Insights into the Initiation of Eukaryotic DNA Replication.

    Science.gov (United States)

    Bruck, Irina; Perez-Arnaiz, Patricia; Colbert, Max K; Kaplan, Daniel L

    2015-01-01

    The initiation of DNA replication is a highly regulated event in eukaryotic cells to ensure that the entire genome is copied once and only once during S phase. The primary target of cellular regulation of eukaryotic DNA replication initiation is the assembly and activation of the replication fork helicase, the 11-subunit assembly that unwinds DNA at a replication fork. The replication fork helicase, called CMG for Cdc45-Mcm2-7, and GINS, assembles in S phase from the constituent Cdc45, Mcm2-7, and GINS proteins. The assembly and activation of the CMG replication fork helicase during S phase is governed by 2 S-phase specific kinases, CDK and DDK. CDK stimulates the interaction between Sld2, Sld3, and Dpb11, 3 initiation factors that are each required for the initiation of DNA replication. DDK, on the other hand, phosphorylates the Mcm2, Mcm4, and Mcm6 subunits of the Mcm2-7 complex. Sld3 recruits Cdc45 to Mcm2-7 in a manner that depends on DDK, and recent work suggests that Sld3 binds directly to Mcm2-7 and also to single-stranded DNA. Furthermore, recent work demonstrates that Sld3 and its human homolog Treslin substantially stimulate DDK phosphorylation of Mcm2. These data suggest that the initiation factor Sld3/Treslin coordinates the assembly and activation of the eukaryotic replication fork helicase by recruiting Cdc45 to Mcm2-7, stimulating DDK phosphorylation of Mcm2, and binding directly to single-stranded DNA as the origin is melted.

  9. Simultaneous Structural Variation Discovery in Multiple Paired-End Sequenced Genomes

    Science.gov (United States)

    Hormozdiari, Fereydoun; Hajirasouliha, Iman; McPherson, Andrew; Eichler, Evan E.; Sahinalp, S. Cenk

    Next generation sequencing technologies have been decreasing the costs and increasing the world-wide capacity for sequence production at an unprecedented rate, enabling the initiation of large-scale projects aiming to sequence almost 2000 genomes [1]. Structural variation detection promises to be one of the key diagnostic tools for cancer and other diseases with genomic origin. In this paper, we study the problem of detecting structural variation events in two or more sequenced genomes through high-throughput sequencing. We propose to move from the current model of (1) detecting genomic variations in single next generation sequenced (NGS) donor genomes independently, and (2) checking whether two or more donor genomes indeed agree or disagree on the variations (in this paper we name this framework Independent Structural Variation Discovery and Merging - ISV&M), to a new model in which we detect structural variation events among multiple genomes simultaneously.

  10. A novel rat genomic simple repeat DNA with RNA-homology shows triplex (H-DNA)-like structure and tissue-specific RNA expression

    International Nuclear Information System (INIS)

    Dey, Indranil; Rath, Pramod C.

    2005-01-01

    The mammalian genome contains a wide variety of repetitive DNA sequences of relatively unknown function. We report a novel 227 bp simple repeat DNA (3.3 DNA) with a d{(GA)7A(AG)7} dinucleotide mirror repeat from the rat (Rattus norvegicus) genome. 3.3 DNA showed 75-85% homology with several eukaryotic mRNAs due to (GA/CU)n dinucleotide repeats by nBlast search and a dispersed distribution in the rat genome by Southern blot hybridization with [32P]3.3 DNA. The d{(GA)7A(AG)7} mirror repeat formed a triplex (H-DNA)-like structure in vitro. Two large RNAs of 9.1 and 7.5 kb were detected by [32P]3.3 DNA in rat brain by Northern blot hybridization, indicating expression of such simple sequence repeats at the RNA level in vivo. Further, several cDNAs were isolated from a rat cDNA library by the [32P]3.3 DNA probe. Three such cDNAs showed tissue-specific RNA expression in rat. pRT 4.1 cDNA showed strong expression of a 2.39 kb RNA in brain and spleen, pRT 5.5 cDNA showed strong expression of a 2.8 kb RNA in brain and a 3.9 kb RNA in lungs, and pRT 11.4 cDNA showed weak expression of a 2.4 kb RNA in lungs. Thus, genomic simple sequence repeats containing d(GA/CT)n dinucleotides are transcriptionally expressed and regulated in rat tissues. Such d(GA/CT)n dinucleotide repeats may form structural elements (e.g., triplex) which may be sites for functional regulation of genomic coding sequences as well as RNAs. This may be a general function of such transcriptionally active simple sequence repeats widely dispersed in the mammalian genome.
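
    To make the mirror-repeat notation in this abstract concrete, the following minimal Python sketch builds the (GA)7 A (AG)7 motif and checks the two sequence properties usually associated with H-DNA-forming tracts: mirror symmetry on a single strand and an all-purine composition. The function names are hypothetical and chosen for illustration only; this is not the authors' analysis and it says nothing about the rest of the 227 bp clone.

```python
def mirror_repeat(unit: str, copies: int, centre: str) -> str:
    """Return unit*copies + centre + the reverse of unit*copies."""
    left_arm = unit * copies
    return left_arm + centre + left_arm[::-1]


def is_mirror_symmetric(seq: str) -> bool:
    """A mirror repeat reads the same 5'->3' and 3'->5' on ONE strand."""
    return seq == seq[::-1]


def is_homopurine(seq: str) -> bool:
    """True when the strand contains only A and G."""
    return set(seq) <= {"A", "G"}


# The motif named in the abstract: (GA)7, a central A, then (AG)7.
motif = mirror_repeat("GA", 7, "A")

print(motif)
print("length:", len(motif))
print("mirror symmetric:", is_mirror_symmetric(motif))
print("homopurine strand:", is_homopurine(motif))
```

    Note that mirror symmetry differs from the reverse-complement palindromes recognised by restriction enzymes: here the strand is compared with its own reverse, with no complementation step.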

  11. Using Genomics for Natural Product Structure Elucidation.

    Science.gov (United States)

    Tietz, Jonathan I; Mitchell, Douglas A

    2016-01-01

    Natural products (NPs) are the most historically bountiful source of chemical matter for drug development, especially for anti-infectives. With insights gleaned from genome mining, interest in natural product discovery has been reinvigorated. An essential stage in NP discovery is structural elucidation, which sheds light not only on the chemical composition of a molecule but also its novelty, properties, and derivatization potential. The history of structure elucidation is replete with technique-based revolutions: combustion analysis, crystallography, UV, IR, MS, and NMR have each provided game-changing advances; the latest such advance is genomics. All natural products have a genetic basis, and the ability to obtain and interpret genomic information for structure elucidation is increasingly available at low cost to non-specialists. In this review, we describe the value of genomics as a structural elucidation technique, especially from the perspective of the natural product chemist approaching an unknown metabolite. Herein we first introduce the databases and programs of interest to the natural products chemist, with an emphasis on those currently most suited for general usability. We describe strategies for linking observed natural product-linked phenotypes to their corresponding gene clusters. We then discuss techniques for extracting structural information from genes, illustrated with numerous case examples. We also provide an analysis of the biases and limitations of the field with recommendations for future development. Our overview is not only aimed at biologically-oriented researchers already at ease with bioinformatic techniques, but also, in particular, at natural product, organic, and/or medicinal chemists not previously familiar with genomic techniques.

  12. i-Genome: A database to summarize oligonucleotide data in genomes

    Directory of Open Access Journals (Sweden)

    Chang Yu-Chung

    2004-10-01

    Background: Information on the occurrence of sequence features in genomes is crucial to comparative genomics, evolutionary analysis, the analyses of regulatory sequences and the quantitative evaluation of sequences. Computing the frequencies and the occurrences of a pattern in complete genomes is time-consuming. Results: The proposed database provides information about sequence features generated by exhaustively computing the sequences of the complete genome. The repetitive elements in the eukaryotic genomes, such as LINEs, SINEs, Alu and LTR, are obtained from Repbase. The database supports various complete genomes including human, yeast, worm, and 128 microbial genomes. Conclusions: This investigation presents and implements an efficient computational approach to accumulate the occurrences of the oligonucleotides or patterns in complete genomes. A database is established to maintain the information of the sequence features, including the distributions of oligonucleotides, the gene distribution, the distribution of repetitive elements in genomes and the occurrences of the oligonucleotides. The database can provide a more effective and efficient way to access the repetitive features in genomes.
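
    The core operation this abstract describes, exhaustively tallying oligonucleotide occurrences along a sequence, can be sketched in a few lines of Python. This is an illustrative sliding-window counter only, not the i-Genome implementation (which the abstract does not describe); the function name and the choice to skip windows containing ambiguous bases are assumptions made for the example.

```python
from collections import Counter


def count_oligos(sequence: str, k: int) -> Counter:
    """Count every overlapping oligonucleotide of length k in one strand.

    Windows containing characters other than A, C, G or T (for example
    the ambiguity code N) are skipped rather than counted.
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")
    sequence = sequence.upper()
    valid_bases = set("ACGT")
    counts: Counter = Counter()
    for start in range(len(sequence) - k + 1):
        oligo = sequence[start:start + k]
        if set(oligo) <= valid_bases:
            counts[oligo] += 1
    return counts


# Toy input; a real genome would be streamed from a FASTA file instead.
toy = "ACGTACGTNNACG"
for oligo, n in sorted(count_oligos(toy, 3).items()):
    print(oligo, n)
```

    A single pass like this costs time proportional to sequence length for each k, which is why precomputing the tallies once and serving them from a database, as the abstract proposes, is attractive for whole genomes.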

  13. Specificity and evolvability in eukaryotic protein interaction networks.

    Directory of Open Access Journals (Sweden)

    Pedro Beltrao

    2007-02-01

    Progress in uncovering the protein interaction networks of several species has led to questions of what underlying principles might govern their organization. Few studies have tried to determine the impact of protein interaction network evolution on the observed physiological differences between species. Using comparative genomics and structural information, we show here that eukaryotic species have rewired their interactomes at a fast rate of approximately 10^-5 interactions changed per protein pair, per million years of divergence. For Homo sapiens this corresponds to 10^3 interactions changed per million years. Additionally we find that the specificity of binding strongly determines the interaction turnover and that different biological processes show significantly different link dynamics. In particular, human proteins involved in immune response, transport, and establishment of localization show signs of positive selection for change of interactions. Our analysis suggests that a small degree of molecular divergence can give rise to important changes at the network level. We propose that the power law distribution observed in protein interaction networks could be partly explained by the cell's requirement for different degrees of protein binding specificity.
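
    The step from a per-pair rate to a per-genome figure in this abstract is a short piece of arithmetic: multiply the rate by the number of distinct protein pairs, N(N-1)/2. The Python sketch below works through it. The proteome size of 20,000 is an assumption introduced for illustration, since the abstract does not state the value the authors used, so only the order of magnitude should be compared with the quoted figure.

```python
def protein_pairs(n_proteins: int) -> int:
    """Number of distinct unordered protein pairs, N * (N - 1) / 2."""
    return n_proteins * (n_proteins - 1) // 2


def interactions_changed_per_myr(n_proteins: int, rate_per_pair: float) -> float:
    """Expected interactions changed per million years for a whole proteome."""
    return rate_per_pair * protein_pairs(n_proteins)


RATE_PER_PAIR = 1e-5        # from the abstract: changes per protein pair per Myr
ASSUMED_PROTEOME = 20_000   # ASSUMPTION: illustrative human proteome size

pairs = protein_pairs(ASSUMED_PROTEOME)
changed = interactions_changed_per_myr(ASSUMED_PROTEOME, RATE_PER_PAIR)

print(f"protein pairs: {pairs:.2e}")
print(f"interactions changed per Myr: {changed:.1e}")
```

    Because the pair count grows with the square of the proteome size, the estimate is sensitive to the assumed N: halving N reduces the result roughly fourfold.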

  14. Variation in recombination frequency and distribution across eukaryotes: patterns and processes

    Science.gov (United States)

    Feulner, Philine G. D.; Johnston, Susan E.; Santure, Anna W.; Smadja, Carole M.

    2017-01-01

    Recombination, the exchange of DNA between maternal and paternal chromosomes during meiosis, is an essential feature of sexual reproduction in nearly all multicellular organisms. While the role of recombination in the evolution of sex has received theoretical and empirical attention, less is known about how recombination rate itself evolves and what influence this has on evolutionary processes within sexually reproducing organisms. Here, we explore the patterns of, and processes governing recombination in eukaryotes. We summarize patterns of variation, integrating current knowledge with an analysis of linkage map data in 353 organisms. We then discuss proximate and ultimate processes governing recombination rate variation and consider how these influence evolutionary processes. Genome-wide recombination rates (cM/Mb) can vary more than tenfold across eukaryotes, and there is large variation in the distribution of recombination events across closely related taxa, populations and individuals. We discuss how variation in rate and distribution relates to genome architecture, genetic and epigenetic mechanisms, sex, environmental perturbations and variable selective pressures. There has been great progress in determining the molecular mechanisms governing recombination, and with the continued development of new modelling and empirical approaches, there is now also great opportunity to further our understanding of how and why recombination rate varies. This article is part of the themed issue ‘Evolutionary causes and consequences of recombination rate variation in sexual organisms’. PMID:29109219

  15. Elucidating the composition and conservation of the autophagy pathway in photosynthetic eukaryotes

    Science.gov (United States)

    Shemi, Adva; Ben-Dor, Shifra; Vardi, Assaf

    2015-01-01

    Aquatic photosynthetic eukaryotes represent highly diverse groups (green, red, and chromalveolate algae) derived from multiple endosymbiosis events, covering a wide spectrum of the tree of life. They are responsible for about 50% of the global photosynthesis and serve as the foundation for oceanic and fresh water food webs. Although the ecophysiology and molecular ecology of some algal species are extensively studied, some basic aspects of algal cell biology are still underexplored. The recent wealth of genomic resources from algae has opened new frontiers to decipher the role of cell signaling pathways and their function in an ecological and biotechnological context. Here, we took a bioinformatic approach to explore the distribution and conservation of TOR and autophagy-related (ATG) proteins (Atg in yeast) in diverse algal groups. Our genomic analysis demonstrates conservation of TOR and ATG proteins in green algae. In contrast, in all 5 available red algal genomes, we could not detect the sequences that encode for any of the 17 core ATG proteins examined, albeit TOR and its interacting proteins are conserved. This intriguing data suggests that the autophagy pathway is not conserved in red algae as it is in the entire eukaryote domain. In contrast, chromalveolates, despite being derived from the red-plastid lineage, retain and express ATG genes, which raises a fundamental question regarding the acquisition of ATG genes during algal evolution. Among chromalveolates, Emiliania huxleyi (Haptophyta), a bloom-forming coccolithophore, possesses the most complete set of ATG genes, and may serve as a model organism to study autophagy in marine protists with great ecological significance. PMID:25915714

  16. Convergent use of RhoGAP toxins by eukaryotic parasites and bacterial pathogens.

    Directory of Open Access Journals (Sweden)

    Dominique Colinet

    2007-12-01

    Inactivation of host Rho GTPases is a widespread strategy employed by bacterial pathogens to manipulate mammalian cellular functions and avoid immune defenses. Some bacterial toxins mimic eukaryotic Rho GTPase-activating proteins (GAPs) to inactivate mammalian GTPases, probably as a result of evolutionary convergence. An intriguing question remains whether eukaryotic pathogens or parasites may use endogenous GAPs as immune-suppressive toxins to target the same key genes as bacterial pathogens. Interestingly, a RhoGAP domain-containing protein, LbGAP, was recently characterized from the parasitoid wasp Leptopilina boulardi, and shown to protect parasitoid eggs from the immune response of Drosophila host larvae. We demonstrate here that LbGAP has structural characteristics of eukaryotic RhoGAPs but that it acts similarly to bacterial RhoGAP toxins in mammals. First, we show by immunocytochemistry that LbGAP enters Drosophila immune cells, plasmatocytes and lamellocytes, and that morphological changes in lamellocytes are correlated with the quantity of LbGAP they contain. Demonstration that LbGAP displays a GAP activity and specifically interacts with the active, GTP-bound form of the two Drosophila Rho GTPases Rac1 and Rac2, both required for successful encapsulation of Leptopilina eggs, was then achieved using biochemical tests, yeast two-hybrid analysis, and GST pull-down assays. In addition, we show that the overall structure of LbGAP is similar to that of eukaryotic RhoGAP domains, and we identify distinct residues involved in its interaction with Rac GTPases. Altogether, these results show that eukaryotic parasites can use endogenous RhoGAPs as virulence factors and that despite their differences in sequence and structure, eukaryotic and bacterial RhoGAP toxins are similarly used to target the same immune pathways in insects and mammals.

  17. SINEs as driving forces in genome evolution.

    Science.gov (United States)

    Schmitz, J

    2012-01-01

    SINEs are short interspersed elements derived from cellular RNAs that repetitively retropose via RNA intermediates and integrate more or less randomly back into the genome. SINEs propagate almost entirely vertically within their host cells and, once established in the germline, are passed on from generation to generation. As non-autonomous elements, their reverse transcription (from RNA to cDNA) and genomic integration depends on the activity of the enzymatic machinery of autonomous retrotransposons, such as long interspersed elements (LINEs). SINEs are widely distributed in eukaryotes, but are especially effectively propagated in mammalian species. For example, more than a million Alu-SINE copies populate the human genome (approximately 13% of genomic space), and few master copies of them are still active. In the organisms where they occur, SINEs are a challenge to genomic integrity, but in the long term also can serve as beneficial building blocks for evolution, contributing to phenotypic heterogeneity and modifying gene regulatory networks. They substantially expand the genomic space and introduce structural variation to the genome. SINEs have the potential to mutate genes, to alter gene expression, and to generate new parts of genes. A balanced distribution and controlled activity of such properties is crucial to maintaining the organism's dynamic and thriving evolution. Copyright © 2012 S. Karger AG, Basel.

  18. Chromatin Structure in Cell Differentiation, Aging and Cancer

    NARCIS (Netherlands)

    S. Kheradmand Kia (Sima)

    2009-01-01

    Chromatin is the structure that the eukaryotic genome is packaged into, allowing over a metre of DNA to fit into the small volume of the nucleus. It is composed of DNA and proteins, most of which are histones. This DNA-protein complex is the template for a number of essential cell

  19. Mapping the pericentric heterochromatin by comparative genomic hybridization analysis and chromosome deletions in Drosophila melanogaster

    OpenAIRE

    He, Bing; Caudy, Amy; Parsons, Lance; Rosebrock, Adam; Pane, Attilio; Raj, Sandeep; Wieschaus, Eric

    2012-01-01

    Heterochromatin represents a significant portion of eukaryotic genomes and has essential structural and regulatory functions. Its molecular organization is largely unknown due to difficulties in sequencing through and assembling repetitive sequences enriched in the heterochromatin. Here we developed a novel strategy using chromosomal rearrangements and embryonic phenotypes to position unmapped Drosophila melanogaster heterochromatic sequence to specific chromosomal regions. By excluding seque...

  20. insights from the genome of Melitaea cinxia

    OpenAIRE

    Ahola, Virpi; Wahlberg, Niklas; Frilander, Mikko J.

    2017-01-01

    The first lepidopteran genome (Bombyx mori) was published in 2004. Ten years later the genome of Melitaea cinxia came out as the third butterfly genome published, and the first eukaryotic genome sequenced in Finland. Owing to Ilkka Hanski, the M. cinxia system in the Åland Islands has become a famous model for metapopulation biology. More than 20 years of research on this system provides a strong ecological basis upon which a genetic framework could be built. Genetic knowledge is an e...

  1. Structural genomic variations and Parkinson's disease.

    Science.gov (United States)

    Bandrés-Ciga, Sara; Ruz, Clara; Barrero, Francisco J; Escamilla-Sevilla, Francisco; Pelegrina, Javier; Vives, Francisco; Duran, Raquel

    2017-10-01

    Parkinson's disease (PD) is the second most common neurodegenerative disease, whose prevalence is projected to be between 8.7 and 9.3 million by 2030. Until about 20 years ago, PD was considered to be the textbook example of a "non-genetic" disorder. Nowadays, PD is generally considered a multifactorial disorder that arises from the combination and complex interaction of genes and environmental factors. To date, a total of 7 genes, including SNCA, LRRK2, PARK2, DJ-1, PINK1, VPS35 and ATP13A2, have been shown unequivocally to cause Mendelian PD. Also, variants with incomplete penetrance in the genes LRRK2 and GBA are considered to be strong risk factors for PD worldwide. Although genetic studies have provided valuable insights into the pathogenic mechanisms underlying PD, the role of structural variation in PD has been understudied in comparison with other genomic variations. Structural genomic variations might substantially account for such genetic substrates yet to be discovered. The present review aims to provide an overview of the structural genomic variants implicated in the pathogenesis of PD.

  2. Re-evaluating the green versus red signal in eukaryotes with secondary plastid of red algal origin

    KAUST Repository

    Burki, Fabien

    2012-05-16

    The transition from endosymbiont to organelle in eukaryotic cells involves the transfer of significant numbers of genes to the host genomes, a process known as endosymbiotic gene transfer (EGT). In the case of plastid organelles, EGTs have been shown to leave a footprint in the nuclear genome that can be indicative of ancient photosynthetic activity in present-day plastid-lacking organisms, or even hint at the existence of cryptic plastids. Here, we evaluated the impact of EGT on eukaryote genomes by reanalyzing the recently published EST dataset for Chromera velia, an interesting test case of a photosynthetic alga closely related to apicomplexan parasites. Previously, 513 genes were reported to originate from red and green algae in a 1:1 ratio. In contrast, by manually inspecting newly generated trees indicating putative algal ancestry, we recovered only 51 genes congruent with EGT, of which 23 and 9 were of red and green algal origin, respectively, whereas 19 were ambiguous regarding the algal provenance. Our approach also uncovered 109 genes that branched within a monocot angiosperm clade, most likely representing a contamination. We emphasize the lack of congruence and the subjectivity resulting from independent phylogenomic screens for EGT, which appear to call for extreme caution when drawing conclusions for major evolutionary events. 2012 The Author(s).

  3. Inorganic phosphate uptake in unicellular eukaryotes.

    Science.gov (United States)

    Dick, Claudia F; Dos-Santos, André L A; Meyer-Fernandes, José R

    2014-07-01

    Inorganic phosphate (Pi) is an essential nutrient for all organisms. The route of Pi utilization begins with Pi transport across the plasma membrane. Here, we analyzed the gene sequences and compared the biochemical profiles, including kinetic and modulator parameters, of Pi transporters in unicellular eukaryotes. The objective of this review is to evaluate the recent findings regarding Pi uptake mechanisms in microorganisms, such as the fungi Neurospora crassa and Saccharomyces cerevisiae and the parasite protozoans Trypanosoma cruzi, Trypanosoma rangeli, Leishmania infantum and Plasmodium falciparum. Pi uptake is the key step of Pi homeostasis and in the subsequent signaling event in eukaryotic microorganisms. Biochemical and structural studies are important for clarifying mechanisms of Pi homeostasis, as well as Pi sensor and downstream pathways, and raise possibilities for future studies in this field. Copyright © 2014 Elsevier B.V. All rights reserved.

  4. Structural dynamics of retroviral genome and the packaging.

    Science.gov (United States)

    Miyazaki, Yasuyuki; Miyake, Ariko; Nomaguchi, Masako; Adachi, Akio

    2011-01-01

    Retroviruses can cause diseases such as AIDS, leukemia, and tumors, but are also used as vectors for human gene therapy. All retroviruses, except foamy viruses, package two copies of unspliced genomic RNA into their progeny viruses. Understanding the molecular mechanisms of retroviral genome packaging will aid the design of new anti-retroviral drugs targeting the packaging process and improve the efficacy of retroviral vectors. Retroviral genomes have to be specifically recognized by the cognate nucleocapsid domain of the Gag polyprotein from among an excess of cellular and spliced viral mRNA. Extensive virological and structural studies have revealed how retroviral genomic RNA is selectively packaged into the viral particles. The genomic area responsible for the packaging is generally located in the 5' untranslated region (5' UTR), and contains dimerization site(s). Recent studies have shown that retroviral genome packaging is modulated by structural changes of RNA at the 5' UTR accompanied by the dimerization. In this review, we focus on three representative retroviruses, Moloney murine leukemia virus, human immunodeficiency virus type 1 and 2, and describe the molecular mechanism of retroviral genome packaging.

  5. Characterization and Evolution of the Cell Cycle-Associated Mob Domain-Containing Proteins in Eukaryotes

    Directory of Open Access Journals (Sweden)

    Nicola Vitulo

    2007-01-01

    The MOB family includes a group of cell cycle-associated proteins highly conserved throughout eukaryotes, whose founding members are implicated in mitotic exit and co-ordination of cell cycle progression with cell polarity and morphogenesis. Here we report the characterization and evolution of the MOB domain-containing proteins as inferred from the 43 eukaryotic genomes so far sequenced. We show that genes for Mob-like proteins are present in at least 41 of these genomes, confirming the universal distribution of this protein family and suggesting its prominent biological function. The phylogenetic analysis reveals five distinct MOB domain classes, showing a progressive expansion of this family from unicellular to multicellular organisms, reaching the highest number in mammals. Plant Mob genes appear to have evolved from a single ancestor, most likely after the loss of one or more genes during the early stage of Viridiplantae evolutionary history. Three of the Mob classes are widespread among most of the analyzed organisms. The possible biological and molecular function of Mob proteins and their role in conserved signaling pathways related to cell proliferation, cell death and cell polarity are also presented and critically discussed.

  6. A neural network method for identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites

    DEFF Research Database (Denmark)

    Nielsen, Henrik; Engelbrecht, Jacob; Brunak, Søren

    1997-01-01

    We have developed a new method for the identification of signal peptides and their cleavage sites based on neural networks trained on separate sets of prokaryotic and eukaryotic sequences. The method performs significantly better than previous prediction schemes, and can easily be applied to genome...

  7. Towards New Antifolates Targeting Eukaryotic Opportunistic Infections

    Energy Technology Data Exchange (ETDEWEB)

    Liu, J.; Bolstad, D; Bolstad, E; Wright, D; Anderson, A

    2009-01-01

    Trimethoprim, an antifolate commonly prescribed in combination with sulfamethoxazole, potently inhibits several prokaryotic species of dihydrofolate reductase (DHFR). However, several eukaryotic pathogenic organisms are resistant to trimethoprim, preventing its effective use as a therapeutic for those infections. We have been building a program to reengineer trimethoprim to more potently and selectively inhibit eukaryotic species of DHFR as a viable strategy for new drug discovery targeting several opportunistic pathogens. We have developed a series of compounds that exhibit potent and selective inhibition of DHFR from the parasitic protozoa Cryptosporidium and Toxoplasma as well as the fungus Candida glabrata. A comparison of the structures of DHFR from the fungal species Candida glabrata and Pneumocystis suggests that the compounds may also potently inhibit Pneumocystis DHFR.

  8. Defensins: antifungal lessons from eukaryotes

    Directory of Open Access Journals (Sweden)

    Patrícia M. Silva

    2014-03-01

    Over the last years, antimicrobial peptides (AMPs) have been the focus of intense research towards the finding of a viable alternative to current antifungal drugs. Defensins are one of the major families of AMPs and the most represented among all eukaryotic groups, providing an important first line of host defense against pathogenic microorganisms. Several of these cysteine-stabilized peptides present a relevant effect against fungi. Defensins are the AMPs with the broadest distribution across all eukaryotic kingdoms, namely, Fungi, Plantæ and Animalia, and were recently shown to have an ancestor in a bacterial organism. As a part of the host defense, defensins act as an important vehicle of information between the innate and adaptive immune systems and have a role in immunomodulation. This multidimensionality represents a powerful host shield, hard for microorganisms to overcome using single-approach resistance strategies. Resistance of pathogenic fungi to conventional antimycotic drugs is becoming a major problem. Defensins, like other AMPs, have been shown to be an effective alternative to the current antimycotic therapies, demonstrating potential as novel therapeutic agents or drug leads. In this review, we summarize the current knowledge on some eukaryotic defensins with antifungal action. An overview of the main targets in the fungal cell and the mechanism of action of these AMPs (namely, the selectivity for some fungal membrane components) are presented. Additionally, recent works on antifungal defensin structure, activity and cytotoxicity are also reviewed.

  9. Use of the Operon Structure of the C. elegans Genome as a Tool to Identify Functionally Related Proteins

    Directory of Open Access Journals (Sweden)

    Silvia Dossena

    2013-12-01

    One of the most pressing challenges in the post-genomic era is the identification and characterization of protein-protein interactions (PPIs), as these are essential in understanding the cellular physiology of health and disease. Experimental techniques suitable for characterizing PPIs (X-ray crystallography or nuclear magnetic resonance spectroscopy, among others) are usually laborious, time-consuming and often difficult to apply to membrane proteins, and therefore require accurate prediction of the candidate interacting partners. High-throughput experimental methods (yeast two-hybrid and affinity purification) succumb to the same shortcomings, and can also lead to high rates of false positive and negative results. Therefore, reliable tools for predicting PPIs are needed. The use of the operon structure in the eukaryote Caenorhabditis elegans genome is a valuable, though underserved, tool for identifying physically or functionally interacting proteins. Based on the concept that genes organized in the same operon may encode physically or functionally related proteins, this algorithm is easy to apply and, importantly, gives a limited number of candidate partners of a given protein, allowing for focused experimental verification. Moreover, this approach can be successfully used to predict PPIs in the human system, including those of membrane proteins.
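
    The core idea in this abstract, that genes sharing an operon are candidate partners of one another, can be sketched in a few lines. This is a minimal illustration under stated assumptions and not the authors' implementation: the function name `candidate_partners`, the gene and operon identifiers, and the toy annotation are all hypothetical.

```python
from collections import defaultdict


def candidate_partners(gene_to_operon: dict[str, str]) -> dict[str, set[str]]:
    """For each gene, list the other genes annotated to the same operon.

    These are candidates for experimental follow-up only; co-membership
    in an operon does not by itself establish an interaction.
    """
    operon_members: dict[str, set[str]] = defaultdict(set)
    for gene, operon in gene_to_operon.items():
        operon_members[operon].add(gene)
    return {
        gene: operon_members[operon] - {gene}
        for gene, operon in gene_to_operon.items()
    }


# HYPOTHETICAL toy annotation (identifiers invented for illustration).
toy_annotation = {
    "geneA": "operon1",
    "geneB": "operon1",
    "geneC": "operon1",
    "geneD": "operon2",
    "geneE": "operon2",
    "geneF": "operon3",
}

if __name__ == "__main__":
    for gene, partners in sorted(candidate_partners(toy_annotation).items()):
        print(gene, sorted(partners))
```

    Because operons contain only a handful of genes, each query returns a short candidate list, which matches the abstract's point that the approach allows focused experimental verification.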

  10. Long- and short-term selective forces on malaria parasite genomes

    DEFF Research Database (Denmark)

    Nygaard, Sanne; Braunstein, Alexander; Malsen, Gareth

    2010-01-01

    Plasmodium parasites, the causal agents of malaria, result in more than 1 million deaths annually. Plasmodium are unicellular eukaryotes with small ~23 Mb genomes encoding ~5200 protein-coding genes. The protein-coding genes comprise about half of these genomes. Although evolutionary processes ha...

  11. Structural determinants and mechanism of HIV-1 genome packaging.

    Science.gov (United States)

    Lu, Kun; Heng, Xiao; Summers, Michael F

    2011-07-22

    Like all retroviruses, the human immunodeficiency virus selectively packages two copies of its unspliced RNA genome, both of which are utilized for strand-transfer-mediated recombination during reverse transcription, a process that enables rapid evolution under environmental and chemotherapeutic pressures. The viral RNA appears to be selected for packaging as a dimer, and there is evidence that dimerization and packaging are mechanistically coupled. Both processes are mediated by interactions between the nucleocapsid domains of a small number of assembling viral Gag polyproteins and RNA elements within the 5'-untranslated region of the genome. A number of secondary structures have been predicted for regions of the genome that are responsible for packaging, and high-resolution structures have been determined for a few small RNA fragments and protein-RNA complexes. However, major questions regarding the RNA structures (and potentially the structural changes) that are responsible for dimeric genome selection remain unanswered. Here, we review efforts that have been made to identify the molecular determinants and mechanism of human immunodeficiency virus type 1 genome packaging. Copyright © 2011 Elsevier Ltd. All rights reserved.

  12. Genome-wide survey of repetitive DNA elements in the button mushroom Agaricus bisporus

    NARCIS (Netherlands)

    Foulongne-Oriol, M.; Murat, C.; Castanera, R.; Ramírez, L.; Sonnenberg, A.S.M.

    2013-01-01

    Repetitive DNA elements are ubiquitous constituents of eukaryotic genomes. The biological roles of these repetitive elements, supposed to impact genome organization and evolution, are not completely elucidated yet. The availability of whole genome sequence offers the opportunity to draw a picture of

  13. Multi-scale structural community organisation of the human genome.

    Science.gov (United States)

    Boulos, Rasha E; Tremblay, Nicolas; Arneodo, Alain; Borgnat, Pierre; Audit, Benjamin

    2017-04-11

    Structural interaction frequency matrices between all genome loci are now experimentally achievable thanks to high-throughput chromosome conformation capture technologies. This raises a new methodological challenge for computational biology, which consists in objectively extracting from these data the structural motifs characteristic of genome organisation. We deployed the fast multi-scale community mining algorithm based on spectral graph wavelets to characterise the networks of intra-chromosomal interactions in human cell lines. We observed that there exist structural domains of all sizes up to chromosome length and demonstrated that the set of structural communities forms a hierarchy of chromosome segments. Hence, at all scales, chromosome folding predominantly involves interactions between neighbouring sites rather than the formation of links between distant loci. Multi-scale structural decomposition of human chromosomes provides an original framework to question structural organisation and its relationship to functional regulation across the scales. By construction the proposed methodology is independent of the precise assembly of the reference genome and is thus directly applicable to genomes whose assembly is not fully determined.

  14. The DNA-encoded nucleosome organization of a eukaryotic genome.

    Science.gov (United States)

    Kaplan, Noam; Moore, Irene K; Fondufe-Mittendorf, Yvonne; Gossett, Andrea J; Tillo, Desiree; Field, Yair; LeProust, Emily M; Hughes, Timothy R; Lieb, Jason D; Widom, Jonathan; Segal, Eran

    2009-03-19

    Nucleosome organization is critical for gene regulation. In living cells this organization is determined by multiple factors, including the action of chromatin remodellers, competition with site-specific DNA-binding proteins, and the DNA sequence preferences of the nucleosomes themselves. However, it has been difficult to estimate the relative importance of each of these mechanisms in vivo, because in vivo nucleosome maps reflect the combined action of all influencing factors. Here we determine the importance of nucleosome DNA sequence preferences experimentally by measuring the genome-wide occupancy of nucleosomes assembled on purified yeast genomic DNA. The resulting map, in which nucleosome occupancy is governed only by the intrinsic sequence preferences of nucleosomes, is similar to in vivo nucleosome maps generated in three different growth conditions. In vitro, nucleosome depletion is evident at many transcription factor binding sites and around gene start and end sites, indicating that nucleosome depletion at these sites in vivo is partly encoded in the genome. We confirm these results with a micrococcal nuclease-independent experiment that measures the relative affinity of nucleosomes for approximately 40,000 double-stranded 150-base-pair oligonucleotides. Using our in vitro data, we devise a computational model of nucleosome sequence preferences that is significantly correlated with in vivo nucleosome occupancy in Caenorhabditis elegans. Our results indicate that the intrinsic DNA sequence preferences of nucleosomes have a central role in determining the organization of nucleosomes in vivo.

  15. Use of mariner transposases for one-step delivery and integration of DNA in prokaryotes and eukaryotes by transfection.

    Science.gov (United States)

    Trubitsyna, Maryia; Michlewski, Gracjan; Finnegan, David J; Elfick, Alistair; Rosser, Susan J; Richardson, Julia M; French, Christopher E

    2017-06-02

    Delivery of DNA to cells and its subsequent integration into the host genome is a fundamental task in molecular biology, biotechnology and gene therapy. Here we describe an IP-free one-step method that enables stable genome integration into either prokaryotic or eukaryotic cells. A synthetic mariner transposon is generated by flanking a DNA sequence with short inverted repeats. When purified recombinant Mos1 or Mboumar-9 transposase is co-transfected with transposon-containing plasmid DNA, it penetrates prokaryotic or eukaryotic cells and integrates the target DNA into the genome. In vivo integrations by purified transposase can be achieved by electroporation, chemical transfection or Lipofection of the transposase:DNA mixture, in contrast to other published transposon-based protocols which require electroporation or microinjection. As in other transposome systems, no helper plasmids are required since transposases are not expressed inside the host cells, thus leading to generation of stable cell lines. Since it does not require electroporation or microinjection, this tool has the potential to be applied for automated high-throughput creation of libraries of random integrants for purposes including gene knock-out libraries, screening for optimal integration positions or safe genome locations in different organisms, selection of the highest production of valuable compounds for biotechnology, and sequencing. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  16. Structural Genomics of Minimal Organisms: Pipeline and Results

    Energy Technology Data Exchange (ETDEWEB)

    Kim, Sung-Hou; Shin, Dong-Hae; Kim, Rosalind; Adams, Paul; Chandonia, John-Marc

    2007-09-14

    The initial objective of the Berkeley Structural Genomics Center was to obtain near-complete three-dimensional (3D) structural information for all soluble proteins of two minimal organisms, the closely related pathogens Mycoplasma genitalium and M. pneumoniae. The former has fewer than 500 genes and the latter has fewer than 700 genes. A semiautomated structural genomics pipeline was set up, running from target selection through cloning, expression and purification to structural determination. At the time of this writing, structural information for more than 93% of all soluble proteins of M. genitalium is available. This chapter summarizes the approaches taken by the authors' center.

  17. Structural dynamics of retroviral genome and the packaging

    Directory of Open Access Journals (Sweden)

    Yasuyuki Miyazaki

    2011-12-01

    Retroviruses can cause diseases such as AIDS, leukemia and tumors, but are also used as vectors for human gene therapy. All retroviruses, except foamy viruses, package two copies of unspliced genomic RNA into their progeny viruses. Understanding the molecular mechanisms of retroviral genome packaging will aid the design of new anti-retroviral drugs targeting the packaging process and improve the efficacy of retroviral vectors. Retroviral genomes have to be specifically recognized by the cognate nucleocapsid (NC) domain of the Gag polyprotein from among an excess of cellular and spliced viral mRNA. Extensive virological and structural studies have revealed how retroviral genomic RNA is selectively packaged into the viral particles. The genomic area responsible for the packaging is generally located in the 5’ untranslated region (5’ UTR), and contains dimerization site(s). Recent studies have shown that retroviral genome packaging is modulated by structural changes of RNA at the 5’ UTR accompanied by the dimerization. In this review, we focus on three representative retroviruses, Moloney murine leukemia virus (MoMLV), human immunodeficiency virus type 1 (HIV-1) and 2 (HIV-2), and describe the molecular mechanism of retroviral genome packaging.

  18. Dispersed repetitive sequences in eukaryotic genomes and their possible biological significance

    International Nuclear Information System (INIS)

    Georgiev, G.P.; Kramerov, D.A.; Ryskov, A.P.; Skryabin, K.G.; Lukanidin, E.M.

    1983-01-01

    This paper describes the properties of a novel mouse mdg-like element, the A2 sequence, which is the most abundant repetitive sequence. We also characterized a ubiquitous B2 sequence that represents, after B1, the dominant family among the short interspersed repeats of the mouse genome. The existence of some putative transposition intermediates was shown for repeats of both A and B types of the mouse genome. These are closed circular DNA of the A type and small polyadenylated B+ RNAs. The fundamental question that arises is whether these sequences are simply selfish DNA capable of transposition or whether they fulfill some useful biological functions within the genome. 66 references, 11 figures, 1 table

  19. Current Perspectives of Telomerase Structure and Function in Eukaryotes with Emerging Views on Telomerase in Human Parasites.

    Science.gov (United States)

    Dey, Abhishek; Chakrabarti, Kausik

    2018-01-24

    Replicative capacity of a cell is strongly correlated with telomere length regulation. Aberrant lengthening or reduction in the length of telomeres can lead to health anomalies, such as cancer or premature aging. Telomerase is a master regulator for maintaining replicative potential in most eukaryotic cells. It does so by controlling telomere length at chromosome ends. Akin to cancer cells, most single-cell eukaryotic pathogens are highly proliferative and require persistent telomerase activity to maintain constant length of telomere and propagation within their host. Although telomerase is key to unlimited cellular proliferation in both cases, not much was known about the role of telomerase in human parasites (malaria, Trypanosoma, etc.) until recently. Since telomerase regulation is mediated via its own structural components, interactions with catalytic reverse transcriptase and several factors that can recruit and assemble telomerase to telomeres in a cell cycle-dependent manner, we compare and discuss here recent findings in telomerase biology in cancer, aging and parasitic diseases to give a broader perspective of telomerase function in human diseases.

  20. Broad genomic and transcriptional analysis reveals a highly derived genome in dinoflagellate mitochondria

    Directory of Open Access Journals (Sweden)

    Keeling Patrick J

    2007-09-01

    Full Text Available Abstract Background: Dinoflagellates comprise an ecologically significant and diverse eukaryotic phylum that is sister to the phylum containing apicomplexan endoparasites. The mitochondrial genome of apicomplexans is uniquely reduced in gene content and size, encoding only three proteins and two ribosomal RNAs (rRNAs) within a highly compacted 6 kb DNA. Dinoflagellate mitochondrial genomes have been comparatively poorly studied: limited available data suggest some similarities with apicomplexan mitochondrial genomes but an even more radical type of genomic organization. Here, we investigate structure, content and expression of dinoflagellate mitochondrial genomes. Results: From two dinoflagellates, Crypthecodinium cohnii and Karlodinium micrum, we generated over 42 kb of mitochondrial genomic data that indicate a reduced gene content paralleling that of mitochondrial genomes in apicomplexans, i.e., only three protein-encoding genes and at least eight conserved components of the highly fragmented large and small subunit rRNAs. Unlike in apicomplexans, dinoflagellate mitochondrial genes occur in multiple copies, often as gene fragments, and in numerous genomic contexts. Analysis of cDNAs suggests several novel aspects of dinoflagellate mitochondrial gene expression. Polycistronic transcripts were found, standard start codons are absent, and oligoadenylation occurs upstream of stop codons, resulting in the absence of termination codons. Transcripts of at least one gene, cox3, are apparently trans-spliced to generate full-length mRNAs. RNA substitutional editing, a process previously identified for mRNAs in dinoflagellate mitochondria, is also implicated in rRNA expression. Conclusion: The dinoflagellate mitochondrial genome shares the same gene complement and fragmentation of rRNA genes with its apicomplexan counterpart. However, it also exhibits several unique characteristics. Most notable are the expansion of gene copy numbers and their arrangements

  1. Structurally Complex Organization of Repetitive DNAs in the Genome of Cobia (Rachycentron canadum).

    Science.gov (United States)

    Costa, Gideão W W F; Cioffi, Marcelo de B; Bertollo, Luiz A C; Molina, Wagner F

    2015-06-01

    Repetitive DNAs comprise the largest fraction of the eukaryotic genome. They include microsatellites or simple sequence repeats (SSRs), which play an important role in chromosome differentiation among fishes. Rachycentron canadum is the only representative of the family Rachycentridae. This species has been the focus of several multidisciplinary studies in view of its important potential for marine fish farming. In the present study, distinct classes of repetitive DNAs, with emphasis on SSRs, were mapped in the chromosomes of this species to improve the knowledge of its genome organization. Microsatellites exhibited a diversified distribution, both dispersed in euchromatin and clustered in the heterochromatin. The multilocus location of SSRs reinforced the heterochromatin heterogeneity in this species, as suggested by some previous studies. The colocalization of SSRs with retrotransposons and transposons pointed to a close evolutionary relationship between these repetitive sequences. A number of heterochromatic regions showed a more complex organization than previously supposed, harboring a diversity of repetitive elements. There was also evidence of colocalization of active genetic regions and different classes of repetitive DNAs in a common heterochromatic region, which offers an opportunity for further research on the interaction of these distinct fractions in fish genomes.

  2. Crystal structure of Homo sapiens protein LOC79017

    Energy Technology Data Exchange (ETDEWEB)

    Bae, Euiyoung; Bingman, Craig A.; Aceti, David J.; Phillips, Jr., George N. (UW)

    2010-02-08

    LOC79017 (MW 21.0 kDa, residues 1-188) was annotated as a hypothetical protein encoded by Homo sapiens chromosome 7 open reading frame 24. It was selected as a target by the Center for Eukaryotic Structural Genomics (CESG) because it did not share more than 30% sequence identity with any protein for which the three-dimensional structure is known. The biological function of the protein has not been established yet. Parts of LOC79017 were identified as members of uncharacterized Pfam families (residues 1-95 as PB006073 and residues 104-180 as PB031696). BLAST searches revealed homologues of LOC79017 in many eukaryotes, but none of them have been functionally characterized. Here, we report the crystal structure of H. sapiens protein LOC79017 (UniGene code Hs.530024, UniProt code O75223, CESG target number go.35223).

  3. Functional and Structural Overview of G-Protein-Coupled Receptors Comprehensively Obtained from Genome Sequences

    Directory of Open Access Journals (Sweden)

    Makiko Suwa

    2011-04-01

    Full Text Available An understanding of the functional mechanisms of G-protein-coupled receptors (GPCRs) is very important for GPCR-related drug design. We have developed an integrated GPCR database (SEVENS, http://sevens.cbrc.jp/) that includes 64,090 reliable GPCR genes comprehensively identified from 56 eukaryote genome sequences, and overviewed the sequence and structure spaces of the GPCRs. In vertebrates, the number of receptors for biological amines, peptides, etc. is conserved in most species, whereas the number of chemosensory receptors for odorants, pheromones, etc. differs significantly among species. The latter receptors tend to be of the single-exon or few-exon type and make up a high proportion of all GPCRs, whereas some families, such as Class B and Class C receptors, are long because they contain many exons. Statistical analyses of amino acid residues reveal that most of the conserved residues in Class A GPCRs are found in the cytoplasmic half regions of transmembrane (TM) helices, while residues characteristic of each subfamily are found in the extracellular half regions. Sixty-nine Protein Data Bank (PDB) entries of complete or fragmentary structures could be mapped onto the TM/loop regions of Class A GPCRs, covering 14 subfamilies.

  4. The genome of obligately intracellular Ehrlichia canis reveals themes of complex membrane structure and immune evasion strategies

    Energy Technology Data Exchange (ETDEWEB)

    Mavromatis, K.; Kuyler Doyle, C.; Lykidis, A.; Ivanova, N.; Francino, P.; Chain, P.; Shin, M.; Malfatti, S.; Larimer, F.; Copeland, A.; Detter, J.C.; Land, M.; Richardson, P.M.; Yu, X.J.; Walker, D.H.; McBride, J.W.; Kyrpides, N.C.

    2005-09-01

    Ehrlichia canis, a small, obligately intracellular, tick-transmitted, gram-negative α-proteobacterium, is the primary etiologic agent of globally distributed canine monocytic ehrlichiosis. Complete genome sequencing revealed that the E. canis genome consists of a single circular chromosome of 1,315,030 bp predicted to encode 925 proteins, 40 stable RNA species, and 17 putative pseudogenes, with a substantial proportion of non-coding sequence (27 percent). Interesting genome features include a large set of proteins with transmembrane helices and/or signal sequences, and a unique serine-threonine bias associated with the potential for O-glycosylation that was prominent in proteins associated with pathogen-host interactions. Furthermore, two paralogous protein families associated with immune evasion were identified, one of which contains poly G:C tracts, suggesting that they may play a role in phase variation and facilitation of persistent infections. Proteins associated with pathogen-host interactions were identified, including a small group of proteins (12) with tandem repeats and another (7) with eukaryotic-like ankyrin domains.

  5. Highly variable rates of genome rearrangements between hemiascomycetous yeast lineages.

    Directory of Open Access Journals (Sweden)

    2006-03-01

    Full Text Available Hemiascomycete yeasts cover an evolutionary span comparable to that of the entire phylum of chordates. Since this group currently contains the largest number of complete genome sequences it presents unique opportunities to understand the evolution of genome organization in eukaryotes. We inferred rates of genome instability on all branches of a phylogenetic tree for 11 species and calculated species-specific rates of genome rearrangements. We characterized all inversion events that occurred within synteny blocks between six representatives of the different lineages. We show that the rates of macro- and microrearrangements of gene order are correlated within individual lineages but are highly variable across different lineages. The most unstable genomes correspond to the pathogenic yeasts Candida albicans and Candida glabrata. Chromosomal maps have been intensively shuffled by numerous interchromosomal rearrangements, even between species that have retained a very high physical fraction of their genomes within small synteny blocks. Despite this intensive reshuffling of gene positions, essential genes, which cluster in low recombination regions in the genome of Saccharomyces cerevisiae, tend to remain syntenic during evolution. This work reveals that the high plasticity of eukaryotic genomes results from rearrangement rates that vary between lineages but also at different evolutionary times of a given lineage.

  6. Genome-wide analysis of tandem repeats in plants and green algae

    Science.gov (United States)

    Zhixin Zhao; Cheng Guo; Sreeskandarajan Sutharzan; Pei Li; Craig Echt; Jie Zhang; Chun Liang

    2014-01-01

    Tandem repeats (TRs) extensively exist in the genomes of prokaryotes and eukaryotes. Based on the sequenced genomes and gene annotations of 31 plant and algal species in Phytozome version 8.0 (http://www.phytozome.net/), we examined TRs in a genome-wide scale, characterized their distributions and motif features, and explored their putative biological functions. Among...

  7. Extensive expansion of A1 family aspartic proteinases in fungi revealed by evolutionary analyses of 107 complete eukaryotic proteomes

    NARCIS (Netherlands)

    Revuelta, M.V.; Kan, van J.A.L.; Kay, J.; Have, ten A.

    2014-01-01

    The A1 family of eukaryotic aspartic proteinases (APs) forms one of the 16 AP families. Although one of the best characterized families, the recent increase in genome sequence data has revealed many fungal AP homologs with novel sequence characteristics. This study was performed to explore the

  8. Comparative RNA genomics

    DEFF Research Database (Denmark)

    Backofen, Rolf; Gorodkin, Jan; Hofacker, Ivo L.

    2018-01-01

    Over the last two decades it has become clear that RNA is much more than just a boring intermediate in protein expression. Ancient RNAs still appear in the core information metabolism and comprise a surprisingly large component in bacterial gene regulation. A common theme with these types of mostly...... small RNAs is their reliance on conserved secondary structures. Large scale sequencing projects, on the other hand, have profoundly changed our understanding of eukaryotic genomes. Pervasively transcribed, they give rise to a plethora of large and evolutionarily extremely flexible noncoding RNAs...... that exert a vastly diverse array of molecular functions. In this chapter we provide a—necessarily incomplete—overview of the current state of comparative analysis of noncoding RNAs, emphasizing computational approaches as a means to gain a global picture of the modern RNA world....

  9. Fluorescence in situ hybridization and optical mapping to correct scaffold arrangement in the tomato genome

    Science.gov (United States)

    Modern biological analyses are often assisted by recent technologies making the sequencing of complex genomes both technically possible and feasible. We recently sequenced the tomato genome that, like many eukaryotic genomes, is large and complex. Current sequencing technologies allow the developmen...

  10. Genome profiling of sterol synthesis shows convergent evolution in parasites and guides chemotherapeutic attack.

    Science.gov (United States)

    Fügi, Matthias A; Gunasekera, Kapila; Ochsenreiter, Torsten; Guan, Xueli; Wenk, Markus R; Mäser, Pascal

    2014-05-01

    Sterols are an essential class of lipids in eukaryotes, where they serve as structural components of membranes and play important roles as signaling molecules. Sterols are also of high pharmacological significance: cholesterol-lowering drugs are blockbusters in human health, and inhibitors of ergosterol biosynthesis are widely used as antifungals. Inhibitors of ergosterol synthesis are also being developed for Chagas's disease, caused by Trypanosoma cruzi. Here we develop an in silico pipeline to globally evaluate sterol metabolism and perform comparative genomics. We generate a library of hidden Markov model-based profiles for 42 sterol biosynthetic enzymes, which allows expressing the genomic makeup of a given species as a numerical vector. Hierarchical clustering of these vectors functionally groups eukaryote proteomes and reveals convergent evolution, in particular metabolic reduction in obligate endoparasites. We experimentally explore sterol metabolism by testing a set of sterol biosynthesis inhibitors against trypanosomatids, Plasmodium falciparum, Giardia, and mammalian cells, and by quantifying the expression levels of sterol biosynthetic genes during the different life stages of T. cruzi and Trypanosoma brucei. The phenotypic data correlate with genomic makeup for simvastatin, which showed activity against trypanosomatids. Other findings, such as the activity of terbinafine against Giardia, are not in agreement with the genotypic profile.
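
    The idea of expressing each species as a numerical vector of profile hits and then clustering those vectors can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not state the distance measure or linkage method, so Hamming distance and single linkage are used as stand-ins, and the species names and presence/absence values in the usage note are invented toy data, not results from the paper.

```python
def hamming(u, v):
    """Number of positions (enzyme profiles) at which two vectors differ."""
    return sum(a != b for a, b in zip(u, v))


def single_linkage(vectors):
    """Agglomerative clustering of named vectors.

    `vectors` maps a species name to a tuple with one entry per enzyme
    profile (1 = hit found, 0 = no hit). Returns the merges in order as
    (cluster_a, cluster_b, distance). Distance and linkage are assumptions.
    """
    clusters = [frozenset([name]) for name in sorted(vectors)]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest members.
                d = min(
                    hamming(vectors[a], vectors[b])
                    for a in clusters[i]
                    for b in clusters[j]
                )
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((set(clusters[i]), set(clusters[j]), d))
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges
```

    With toy input such as `{"species_A": (1, 1, 1, 1), "species_B": (1, 1, 1, 0), "species_C": (0, 0, 0, 0)}`, the two species with nearly complete pathways merge first and the metabolically reduced one joins last, which is the kind of grouping the abstract describes for obligate endoparasites.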

  11. Interrogating the druggable genome with structural informatics.

    Science.gov (United States)

    Hambly, Kevin; Danzer, Joseph; Muskal, Steven; Debe, Derek A

    2006-08-01

    Structural genomics projects are producing protein structure data at an unprecedented rate. In this paper, we present the Target Informatics Platform (TIP), a novel structural informatics approach for amplifying the rapidly expanding body of experimental protein structure information to enhance the discovery and optimization of small molecule protein modulators on a genomic scale. In TIP, existing experimental structure information is augmented using a homology modeling approach, and binding sites across multiple target families are compared using a clique detection algorithm. We report here a detailed analysis of the structural coverage for the set of druggable human targets, highlighting drug target families where the level of structural knowledge is currently quite high, as well as those areas where structural knowledge is sparse. Furthermore, we demonstrate the utility of TIP's intra- and inter-family binding site similarity analysis using a series of retrospective case studies. Our analysis underscores the utility of a structural informatics infrastructure for extracting drug discovery-relevant information from structural data, aiding researchers in the identification of lead discovery and optimization opportunities as well as potential "off-target" liabilities.

  12. Mitigating Mitochondrial Genome Erosion Without Recombination.

    Science.gov (United States)

    Radzvilavicius, Arunas L; Kokko, Hanna; Christie, Joshua R

    2017-11-01

    Mitochondria are ATP-producing organelles of bacterial ancestry that played a key role in the origin and early evolution of complex eukaryotic cells. Most modern eukaryotes transmit mitochondrial genes uniparentally, often without recombination among genetically divergent organelles. While this asymmetric inheritance maintains the efficacy of purifying selection at the level of the cell, the absence of recombination could also make the genome susceptible to Muller's ratchet. How mitochondria escape this irreversible defect accumulation is a fundamental unsolved question. Occasional paternal leakage could in principle promote recombination, but it would also compromise the purifying selection benefits of uniparental inheritance. We assess this tradeoff using a stochastic population-genetic model. In the absence of recombination, uniparental inheritance of freely-segregating genomes mitigates mutational erosion, while paternal leakage exacerbates the ratchet effect. Mitochondrial fusion-fission cycles ensure independent genome segregation, improving purifying selection. Paternal leakage provides opportunity for recombination to slow down the mutation accumulation, but always at a cost of increased steady-state mutation load. Our findings indicate that random segregation of mitochondrial genomes under uniparental inheritance can effectively combat the mutational meltdown, and that homologous recombination under paternal leakage might not be needed. Copyright © 2017 by the Genetics Society of America.

  13. Visualization of RNA structure models within the Integrative Genomics Viewer.

    Science.gov (United States)

    Busan, Steven; Weeks, Kevin M

    2017-07-01

    Analyses of the interrelationships between RNA structure and function are increasingly important components of genomic studies. The SHAPE-MaP strategy enables accurate RNA structure probing and realistic structure modeling of kilobase-length noncoding RNAs and mRNAs. Existing tools for visualizing RNA structure models are not suitable for efficient analysis of long, structurally heterogeneous RNAs. In addition, structure models are often advantageously interpreted in the context of other experimental data and gene annotation information, for which few tools currently exist. We have developed a module within the widely used and well supported open-source Integrative Genomics Viewer (IGV) that allows visualization of SHAPE and other chemical probing data, including raw reactivities, data-driven structural entropies, and data-constrained base-pair secondary structure models, in context with linear genomic data tracks. We illustrate the usefulness of visualizing RNA structure in the IGV by exploring structure models for a large viral RNA genome, comparing bacterial mRNA structure in cells with its structure under cell- and protein-free conditions, and comparing a noncoding RNA structure modeled using SHAPE data with a base-pairing model inferred through sequence covariation analysis. © 2017 Busan and Weeks; Published by Cold Spring Harbor Laboratory Press for the RNA Society.

  14. Yeast genome sequencing:

    DEFF Research Database (Denmark)

    Piskur, Jure; Langkjær, Rikke Breinhold

    2004-01-01

    For decades, unicellular yeasts have been general models to help understand the eukaryotic cell and also our own biology. Recently, over a dozen yeast genomes have been sequenced, providing the basis to resolve several complex biological questions. Analysis of the novel sequence data has shown...... of closely related species helps in gene annotation and to answer how many genes there really are within the genomes. Analysis of non-coding regions among closely related species has provided an example of how to determine novel gene regulatory sequences, which were previously difficult to analyse because...... they are short and degenerate and occupy different positions. Comparative genomics helps to understand the origin of yeasts and points out crucial molecular events in yeast evolutionary history, such as whole-genome duplication and horizontal gene transfer(s). In addition, the accumulating sequence data provide...

  15. From structure to mechanism—understanding initiation of DNA replication

    Science.gov (United States)

    Riera, Alberto; Barbon, Marta; Noguchi, Yasunori; Reuter, L. Maximilian; Schneider, Sarah; Speck, Christian

    2017-01-01

    DNA replication results in the doubling of the genome prior to cell division. This process requires the assembly of 50 or more protein factors into a replication fork. Here, we review recent structural and biochemical insights that start to explain how specific proteins recognize DNA replication origins, load the replicative helicase on DNA, unwind DNA, synthesize new DNA strands, and reassemble chromatin. We focus on the minichromosome maintenance (MCM2–7) proteins, which form the core of the eukaryotic replication fork, as this complex undergoes major structural rearrangements in order to engage with DNA, regulate its DNA-unwinding activity, and maintain genome stability. PMID:28717046

  16. Expressed Peptide Tags: An additional layer of data for genome annotation

    Energy Technology Data Exchange (ETDEWEB)

    Savidor, Alon [ORNL; Donahoo, Ryan S [ORNL; Hurtado-Gonzales, Oscar [University of Tennessee, Knoxville (UTK); Verberkmoes, Nathan C [ORNL; Shah, Manesh B [ORNL; Lamour, Kurt H [ORNL; McDonald, W Hayes [ORNL

    2006-01-01

    While genome sequencing is becoming ever more routine, genome annotation remains a challenging process. Identification of the coding sequences within the genomic milieu presents a tremendous challenge, especially for eukaryotes with their complex gene architectures. Here we present a method to assist the annotation process through the use of proteomic data and bioinformatics. Mass spectra of digested protein preparations of the organism of interest were acquired and searched against a protein database created by a six-frame translation of the genome. The identified peptides were mapped back to the genome, compared to the current annotation, and then categorized as supporting or extending the current genome annotation. We named the classified peptides Expressed Peptide Tags (EPTs). The well-annotated bacterium Rhodopseudomonas palustris was used as a control for the method and showed a high degree of correlation between EPT mapping and the current annotation, with 86% of the EPTs confirming existing gene calls and less than 1% of the EPTs expanding on the current annotation. The eukaryotic plant pathogens Phytophthora ramorum and Phytophthora sojae, whose genomes have been recently sequenced and are much less well annotated, were also subjected to this method. A series of algorithmic steps were taken to increase the confidence of EPT identification for these organisms, including generation of smaller sub-databases to be searched against, and definition of EPT criteria that accommodate the more complex eukaryotic gene architecture. As expected, the analysis of the Phytophthora species showed less correlation between EPT mapping and their current annotation. While ~77% of Phytophthora EPTs supported the current annotation, a portion of them (7.2% and 12.6% for P. ramorum and P. sojae, respectively) suggested modification to current gene calls or identified novel genes that were missed by the current genome annotation of these organisms.
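
    The six-frame translation step mentioned above can be sketched as follows. This is an illustrative sketch only, not the authors' pipeline: it assumes the standard genetic code, and the function names (`reverse_complement`, `translate`, `six_frame_translation`) and the frame labels (`+1` to `-3`) are hypothetical choices made here.

```python
# Standard genetic code, laid out in the conventional TCAG codon order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons; unknown codons (e.g. with N) become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(seq):
    """Return a dict of the six reading-frame translations of `seq`."""
    seq = seq.upper()
    frames = {}
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(s[offset:])
    return frames
```

    In a search database built this way, each frame would typically be split at stop codons (`*`) into candidate peptides; that splitting step is left out of the sketch.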

  17. The genome of the diatom Thalassiosira pseudonana: Ecology, evolution, and metabolism

    Energy Technology Data Exchange (ETDEWEB)

    Ambrust, E.V.; Berges, J.; Bowler, C.; Green, B.; Martinez, D.; Putnam, N.; Zhou, S.; Allen, A.; Apt, K.; Bechner, M.; Brzezinski, M.; Chaal, B.; Chiovitti, A.; Davis, A.; Goodstein, D.; Hadi, M.; Hellsten,U.; Hildebrand, M.; Jenkins, B.; Jurka, J.; Kapitonov, V.; Kroger, N.; Lau, W.; Lane, T.; Larimer, F.; Lippmeier, J.; Lucas, S.; Medina, M.; Montsant, A.; Obornik, M.; Parker, M. Schnitzler; Palenik, B.; Pazour,G.; Richardson, P.; Rynearson, T.; Saito, M.; Schwartz, D.; Thamatrakoln,K.; Valentin, K.; Vardi, A.; Wilkerson, F.; Rokhsar, D.; Vardi, A.; Wilkerson, F.P.; Rokhsar, D.S.

    2004-09-01

    Diatoms are unicellular algae with plastids acquired by secondary endosymbiosis. They are responsible for ~20% of global carbon fixation. We report the 34 Mbp draft nuclear genome of the marine diatom, Thalassiosira pseudonana and its 129 Kbp plastid and 44 Kbp mitochondrial genomes. Sequence and optical restriction mapping revealed 24 diploid nuclear chromosomes. We identified novel genes for silicic acid transport and formation of silica-based cell walls, high-affinity iron uptake, biosynthetic enzymes for several types of polyunsaturated fatty acids, utilization of a range of nitrogenous compounds and a complete urea cycle, all attributes that allow diatoms to prosper in the marine environment. Diatoms are unicellular, photosynthetic, eukaryotic algae found throughout the world's oceans and freshwater systems. They form the base of short, energetically-efficient food webs that support large-scale coastal fisheries. Photosynthesis by marine diatoms generates as much as 40% of the 45-50 billion tonnes of organic carbon produced each year in the sea (1), and their role in global carbon cycling is predicted to be comparable to that of all terrestrial rainforests combined (2, 3). Over geological time, diatoms may have influenced global climate by changing the flux of atmospheric carbon dioxide into the oceans (4). A defining feature of diatoms is their ornately patterned silicified cell wall or frustule, which displays species-specific nano-structures of such fine detail that diatoms have long been used to test the resolution of optical microscopes. Recent attention has focused on biosynthesis of these nano-structures as a paradigm for future silica nanotechnology (5). The long history (over 180 million years) and dominance of diatoms in the oceans is reflected by their contributions to vast deposits of diatomite, most cherts and a significant fraction of current petroleum reserves (6). As photosynthetic heterokonts, diatoms reflect a fundamentally

  18. Producing genome structure populations with the dynamic and automated PGS software.

    Science.gov (United States)

    Hua, Nan; Tjong, Harianto; Shin, Hanjun; Gong, Ke; Zhou, Xianghong Jasmine; Alber, Frank

    2018-05-01

    Chromosome conformation capture technologies such as Hi-C are widely used to investigate the spatial organization of genomes. Because genome structures can vary considerably between individual cells of a population, interpreting ensemble-averaged Hi-C data can be challenging, in particular for long-range and interchromosomal interactions. We pioneered a probabilistic approach for the generation of a population of distinct diploid 3D genome structures consistent with all the chromatin-chromatin interaction probabilities from Hi-C experiments. Each structure in the population is a physical model of the genome in 3D. Analysis of these models yields new insights into the causes and the functional properties of the genome's organization in space and time. We provide a user-friendly software package, called PGS, which runs on local machines (for practice runs) and high-performance computing platforms. PGS takes a genome-wide Hi-C contact frequency matrix, along with information about genome segmentation, and produces an ensemble of 3D genome structures entirely consistent with the input. The software automatically generates an analysis report, and provides tools to extract and analyze the 3D coordinates of specific domains. Basic Linux command-line knowledge is sufficient for using this software. A typical running time of the pipeline is ∼3 d with 300 cores on a computer cluster to generate a population of 1,000 diploid genome structures at topological-associated domain (TAD)-level resolution.

  19. Alignment-free comparative genomic screen for structured RNAs using coarse-grained secondary structure dot plots

    DEFF Research Database (Denmark)

    Kato, Yuki; Gorodkin, Jan; Havgaard, Jakob Hull

    2017-01-01

    . Methods: Here we present a fast and efficient method, DotcodeR, for detecting structurally similar RNAs in genomic sequences by comparing their corresponding coarse-grained secondary structure dot plots at string level. This allows us to perform an all-against-all scan of all window pairs from two genomes...... without alignment. Results: Our computational experiments with simulated data and real chromosomes demonstrate that the presented method has good sensitivity. Conclusions: DotcodeR can be useful as a pre-filter in a genomic comparative scan for structured RNAs....

  20. Algal genomes reveal evolutionary mosaicism and the fate of nucleomorphs

    Energy Technology Data Exchange (ETDEWEB)

    Curtis, Bruce A.; Tanifuji, Goro; Burki, Fabien; Gruber, Ansgar; Irimia, Manuuel; Maruyama, Shinichiro; Arias, Maria C.; Ball, Steven G.; Gile, Gillian H.; Hirakawa, Yoshihisa; Hopkins, Julia F.; Kuo, Alan; Rensing, Stefan A.; Schmutz, Jeremy; Symeonidi, Aikaterini; Elias, Marek; Eveleigh, Robert J. M.; Herman, Emily K.; Klute, Mary J.; Nakayama, Takuro; Obornik, Miroslav; Reyes-Prieto, Adrian; Armbrust, E. Virginia; Aves, Stephen J.; Beiko, Robert G.; Coutinho, Pedro; Dacks, Joel B.; Durnford, Dion G.; Fast, Naomi M.; Green, Beverley R.; Grisdale, Cameron J.; Hempel, Franziska; Henrissat, Bernard; Hoppner, Marc P.; Ishida, Ken-Ichiro; Kim, Eunsoo; Koreny, Ludek; Kroth, Peter G.; Liu, Yuan; Malik, Shehre-Banoo; Maier, Uwe G.; McRose, Darcy; Mock, Thomas; Neilson, Jonathan A. D.; Onodera, Naoko T.; Poole, Anthony M.; Pritham, Ellen J.; Richards, Thomas A.; Rocap, Gabrielle; Roy, Scott W.; Sarai, Chihiro; Schaack, Sarah; Shirato, Shu; Slamovits, Claudio H.; Spencer, Davie F.; Suzuki, Shigekatsu; Worden, Alexandra Z.; Zauner, Stefan; Barry, Kerrie; Bell, Callum; Bharti, Arvind K.; Crow, John A.; Grimwood, Jane; Kramer, Robin; Lindquist, Erika; Lucas, Susan; Salamov, Asaf; McFadden, Geoffrey I.; Lane, Christopher E.; Keeling, Patrick J.; Gray, Michael W.; Grigoriev, Igor V.; Archibald, John M.

    2012-08-10

    Cryptophyte and chlorarachniophyte algae are transitional forms in the widespread secondary endosymbiotic acquisition of photosynthesis by engulfment of eukaryotic algae. Unlike most secondary plastid-bearing algae, miniaturized versions of the endosymbiont nuclei (nucleomorphs) persist in cryptophytes and chlorarachniophytes. To determine why, and to address other fundamental questions about eukaryote-eukaryote endosymbiosis, we sequenced the nuclear genomes of the cryptophyte Guillardia theta and the chlorarachniophyte Bigelowiella natans. Both genomes have 21,000 protein genes and are intron rich, and B. natans exhibits unprecedented alternative splicing for a single-celled organism. Phylogenomic analyses and subcellular targeting predictions reveal extensive genetic and biochemical mosaicism, with both host- and endosymbiont-derived genes servicing the mitochondrion, the host cell cytosol, the plastid and the remnant endosymbiont cytosol of both algae. Mitochondrion-to-nucleus gene transfer still occurs in both organisms but plastid-to-nucleus and nucleomorph-to-nucleus transfers do not, which explains why a small residue of essential genes remains locked in each nucleomorph.

  1. Exploration of the Germline Genome of the Ciliate Chilodonella uncinata through Single-Cell Omics (Transcriptomics and Genomics)

    Directory of Open Access Journals (Sweden)

    Xyrus X. Maurer-Alcalá

    2018-01-01

    Full Text Available Separate germline and somatic genomes are found in numerous lineages across the eukaryotic tree of life, often separated into distinct tissues (e.g., in plants, animals, and fungi) or distinct nuclei sharing a common cytoplasm (e.g., in ciliates and some foraminifera). In ciliates, germline-limited (i.e., micronuclear-specific) DNA is eliminated during the development of a new somatic (i.e., macronuclear) genome in a process that is tightly linked to large-scale genome rearrangements, such as deletions and reordering of protein-coding sequences. Most studies of germline genome architecture in ciliates have focused on the model ciliates Oxytricha trifallax, Paramecium tetraurelia, and Tetrahymena thermophila, for which the complete germline genome sequences are known. Outside of these model taxa, only a few dozen germline loci have been characterized from a limited number of cultivable species, which is likely due to difficulties in obtaining sufficient quantities of “purified” germline DNA in these taxa. Combining single-cell transcriptomics and genomics, we have overcome these limitations and provide the first insights into the structure of the germline genome of the ciliate Chilodonella uncinata, a member of the understudied class Phyllopharyngea. Our analyses reveal the following: (i) large gene families contain a disproportionate number of genes from scrambled germline loci; (ii) germline-soma boundaries in the germline genome are demarcated by substantial shifts in GC content; (iii) single-cell omics techniques provide large-scale quality germline genome data with limited effort, at least for ciliates with extensively fragmented somatic genomes. Our approach provides an efficient means to understand better the evolution of genome rearrangements between germline and soma in ciliates.
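
    Finding (ii), that boundaries coincide with shifts in GC content, suggests a simple way to look for such shifts: compute GC content in sliding windows and find where adjacent windows differ most. The sketch below is a generic illustration under stated assumptions, not the method used in the study; the window size, step, and function names are hypothetical.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in a sequence (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def gc_profile(seq, window=100, step=50):
    """List of (start, gc_fraction) for each full window along the sequence."""
    return [
        (start, gc_fraction(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


def largest_gc_shift(profile):
    """Return (position, delta) of the biggest change between adjacent windows.

    `position` is the start of the later window; returns None when the
    profile holds fewer than two windows.
    """
    best = None
    for (_, gc_a), (pos_b, gc_b) in zip(profile, profile[1:]):
        delta = gc_b - gc_a
        if best is None or abs(delta) > abs(best[1]):
            best = (pos_b, delta)
    return best
```

    On real data, overlapping windows (a step smaller than the window) smooth the profile, so a sharp boundary shows up as a run of moderate changes rather than one large jump; the window and step would need tuning to the locus sizes involved.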

  2. Pathgroups, a dynamic data structure for genome reconstruction problems.

    Science.gov (United States)

    Zheng, Chunfang

    2010-07-01

    Ancestral gene order reconstruction problems, including the median problem, quartet construction, small phylogeny, guided genome halving and genome aliquoting, are NP-hard. Available heuristics dedicated to each of these problems are computationally costly for even small instances. We present a data structure enabling rapid heuristic solution to all these ancestral genome reconstruction problems. A generic greedy algorithm with look-ahead based on an automatically generated priority system suffices for all the problems using this data structure. The efficiency of the algorithm is due to fast updating of the structure during run time and to the simplicity of the priority scheme. We illustrate with the first rapid algorithm for quartet construction and apply this to a set of yeast genomes to corroborate a recent gene sequence-based phylogeny. Availability: http://albuquerque.bioinformatics.uottawa.ca/pathgroup/Quartet.html. Contact: chunfang313@gmail.com. Supplementary data are available at Bioinformatics online.

  3. The complete chloroplast genome sequence of Podocarpus lambertii: genome structure, evolutionary aspects, gene content and SSR detection.

    Directory of Open Access Journals (Sweden)

    Leila do Nascimento Vieira

    BACKGROUND: Podocarpus lambertii (Podocarpaceae) is a native conifer from the Brazilian Atlantic Forest Biome, which is considered one of the 25 biodiversity hotspots in the world. The advancement of next-generation sequencing technologies has enabled the rapid acquisition of whole chloroplast (cp) genome sequences at low cost. Several studies have proven the potential of cp genomes as tools to understand enigmatic and basal phylogenetic relationships at different taxonomic levels, as well as further probe the structural and functional evolution of plants. In this work, we present the complete cp genome sequence of P. lambertii. METHODOLOGY/PRINCIPAL FINDINGS: The P. lambertii cp genome is 133,734 bp in length, and similar to other sequenced cupressophytes, it lacks one of the large inverted repeat regions (IR). It contains 118 unique genes and one duplicated tRNA (trnN-GUU), which occurs as an inverted repeat sequence. The rps16 gene was not found, which was previously reported for the plastid genome of another Podocarpaceae (Nageia nagi) and Araucariaceae (Agathis dammara). Structurally, P. lambertii shows 4 inversions of a large DNA fragment ∼20,000 bp compared to the Podocarpus totara cp genome. These unexpected characteristics may be attributed to geographical distance and different adaptive needs. The P. lambertii cp genome presents a total of 28 tandem repeats and 156 SSRs, with homo- and dipolymers being the most common and tri-, tetra-, penta-, and hexapolymers occurring with less frequency. CONCLUSION: The complete cp genome sequence of P. lambertii revealed significant structural changes, even in species from the same genus. These results reinforce the apparent loss of the rps16 gene in the Podocarpaceae cp genome. In addition, several SSRs in the P. lambertii cp genome are likely intraspecific polymorphism sites, which may allow highly sensitive phylogeographic and population structure studies, as well as phylogenetic studies of species of
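
    The abstract counts SSRs by motif length (mono- through hexanucleotide repeats) without stating the detection tool or thresholds. The following is a minimal regex-based sketch of SSR detection, not the method used in the study; the minimum repeat counts in `MIN_REPEATS`, the function names and the toy sequence are assumptions chosen for illustration.

```python
import re

# Hypothetical thresholds: motif length -> minimum number of tandem copies.
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def is_primitive(motif: str) -> bool:
    """True if the motif is not itself a repeat of a shorter unit
    (e.g. 'AT' is primitive, 'ATAT' and 'AA' are not)."""
    return motif not in (motif + motif)[1:-1]


def find_ssrs(seq: str, min_repeats: dict = MIN_REPEATS) -> list:
    """Return sorted (start, motif, copy number) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for k, n in min_repeats.items():
        # A k-mer followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):  # skip e.g. 'AA' runs already found as 'A'
                hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


# Toy sequence (hypothetical): a poly-A run and an (AT)6 repeat.
toy = "GGC" + "A" * 12 + "CGC" + "AT" * 6 + "GCG"
print(find_ssrs(toy))
```

    Only perfect repeats are reported here; dedicated SSR tools typically also handle compound and interrupted repeats, which this sketch does not attempt.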

  4. Chromatin structure and evolution in the human genome

    Directory of Open Access Journals (Sweden)

    Dunlop Malcolm G

    2007-05-01

    Background: Evolutionary rates are not constant across the human genome but genes in close proximity have been shown to experience similar levels of divergence and selection. The higher-order organisation of chromosomes has often been invoked to explain such phenomena but previously there has been insufficient data on chromosome structure to investigate this rigorously. Using the results of a recent genome-wide analysis of open and closed human chromatin structures we have investigated the global association between divergence, selection and chromatin structure for the first time. Results: In this study we have shown that, paradoxically, synonymous site divergence (dS) at non-CpG sites is highest in regions of open chromatin, primarily as a result of an increased number of transitions, while the rates of other traditional measures of mutation (intergenic, intronic and ancient repeat divergence) as well as SNP density are highest in closed regions of the genome. Analysis of human-chimpanzee divergence across intron-exon boundaries indicates that although genes in relatively open chromatin generally display little selection at their synonymous sites, those in closed regions show markedly lower divergence at their fourfold degenerate sites than in neighbouring introns and intergenic regions. Exclusion of known Exonic Splice Enhancer hexamers has little effect on the divergence observed at fourfold degenerate sites across chromatin categories; however, we show that closed chromatin is enriched with certain classes of ncRNA genes whose RNA secondary structure may be particularly important. Conclusion: We conclude that, overall, non-CpG mutation rates are lowest in open regions of the genome and that regions of the genome with a closed chromatin structure have the highest background mutation rate. This might reflect lower rates of DNA damage or enhanced DNA repair processes in regions of open chromatin. Our results also indicate that dS is a poor

  5. Evolutionary genomics and population structure of Entamoeba histolytica

    Directory of Open Access Journals (Sweden)

    Koushik Das

    2014-11-01

    Amoebiasis, caused by the gastrointestinal parasite Entamoeba histolytica, has diverse disease outcomes. Studying the genome and evolution of this parasite will help us understand the basis of its virulence and explain why, when and how it causes disease. In this review, we summarize current knowledge of the evolutionary genomics of E. histolytica and discuss its association with parasite phenotypes and differential pathogenic behavior. We also discuss how genetic diversity reveals parasite population structure and highlight open questions about its evolution and population structure that remain to be addressed. This large amount of genomic data will improve our knowledge of this pathogenic species of Entamoeba.

  6. Child Development and Structural Variation in the Human Genome

    Science.gov (United States)

    Zhang, Ying; Haraksingh, Rajini; Grubert, Fabian; Abyzov, Alexej; Gerstein, Mark; Weissman, Sherman; Urban, Alexander E.

    2013-01-01

    Structural variation of the human genome sequence is the insertion, deletion, or rearrangement of stretches of DNA sequence sized from around 1,000 to millions of base pairs. Over the past few years, structural variation has been shown to be far more common in human genomes than previously thought. Very little is currently known about the effects…

  7. Higher order structure in the 3'-minor domain of small subunit ribosomal RNAs from a gram negative bacterium, a gram positive bacterium and a eukaryote

    DEFF Research Database (Denmark)

    Douthwaite, S; Christensen, A; Garrett, R A

    1983-01-01

    of additional higher order structure in the renatured free RNA. It can be concluded that a high level of conservation of higher order structure has occurred during the evolution of the gram negative and gram positive eubacteria and the eukaryote in both the double helical regions and the "unstructured" regions...

  8. The SGC beyond structural genomics: redefining the role of 3D structures by coupling genomic stratification with fragment-based discovery.

    Science.gov (United States)

    Bradley, Anthony R; Echalier, Aude; Fairhead, Michael; Strain-Damerell, Claire; Brennan, Paul; Bullock, Alex N; Burgess-Brown, Nicola A; Carpenter, Elisabeth P; Gileadi, Opher; Marsden, Brian D; Lee, Wen Hwa; Yue, Wyatt; Bountra, Chas; von Delft, Frank

    2017-11-08

    The ongoing explosion in genomics data has long since outpaced the capacity of conventional biochemical methodology to verify the large number of hypotheses that emerge from the analysis of such data. In contrast, it is still a gold-standard for early phenotypic validation towards small-molecule drug discovery to use probe molecules (or tool compounds), notwithstanding the difficulty and cost of generating them. Rational structure-based approaches to ligand discovery have long promised the efficiencies needed to close this divergence; in practice, however, this promise remains largely unfulfilled, for a host of well-rehearsed reasons and despite the huge technical advances spearheaded by the structural genomics initiatives of the noughties. Therefore the current, fourth funding phase of the Structural Genomics Consortium (SGC), building on its extensive experience in structural biology of novel targets and design of protein inhibitors, seeks to redefine what it means to do structural biology for drug discovery. We developed the concept of a Target Enabling Package (TEP) that provides, through reagents, assays and data, the missing link between genetic disease linkage and the development of usefully potent compounds. There are multiple prongs to the ambition: rigorously assessing targets' genetic disease linkages through crowdsourcing to a network of collaborating experts; establishing a systematic approach to generate the protocols and data that comprise each target's TEP; developing new, X-ray-based fragment technologies for generating high quality chemical matter quickly and cheaply; and exploiting a stringently open access model to build multidisciplinary partnerships throughout academia and industry. By learning how to scale these approaches, the SGC aims to make structures finally serve genomics, as originally intended, and demonstrate how 3D structures systematically allow new modes of druggability to be discovered for whole classes of targets. © 2017 The

  9. Population Genomics of Paramecium Species.

    Science.gov (United States)

    Johri, Parul; Krenek, Sascha; Marinov, Georgi K; Doak, Thomas G; Berendonk, Thomas U; Lynch, Michael

    2017-05-01

    Population-genomic analyses are essential to understanding factors shaping genomic variation and lineage-specific sequence constraints. The dearth of such analyses for unicellular eukaryotes prompted us to assess genomic variation in Paramecium, one of the most well-studied ciliate genera. The Paramecium aurelia complex consists of ∼15 morphologically indistinguishable species that diverged subsequent to two rounds of whole-genome duplications (WGDs, as long as 320 MYA) and possess extremely streamlined genomes. We examine patterns of both nuclear and mitochondrial polymorphism, by sequencing whole genomes of 10-13 worldwide isolates of each of three species belonging to the P. aurelia complex: P. tetraurelia, P. biaurelia, P. sexaurelia, as well as two outgroup species that do not share the WGDs: P. caudatum and P. multimicronucleatum. An apparent absence of global geographic population structure suggests continuous or recent dispersal of Paramecium over long distances. Intergenic regions are highly constrained relative to coding sequences, especially in P. caudatum and P. multimicronucleatum that have shorter intergenic distances. Sequence diversity and divergence are reduced up to ∼100-150 bp both upstream and downstream of genes, suggesting strong constraints imposed by the presence of densely packed regulatory modules. In addition, comparison of sequence variation at non-synonymous and synonymous sites suggests similar recent selective pressures on paralogs within and orthologs across the deeply diverging species. This study presents the first genome-wide population-genomic analysis in ciliates and provides a valuable resource for future studies in evolutionary and functional genetics in Paramecium. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  10. Structural Genomics and Drug Discovery for Infectious Diseases

    International Nuclear Information System (INIS)

    Anderson, W.F.

    2009-01-01

    The application of structural genomics methods and approaches to proteins from organisms causing infectious diseases is making available the three dimensional structures of many proteins that are potential drug targets and laying the groundwork for structure aided drug discovery efforts. There are a number of structural genomics projects with a focus on pathogens that have been initiated worldwide. The Center for Structural Genomics of Infectious Diseases (CSGID) was recently established to apply state-of-the-art high throughput structural biology technologies to the characterization of proteins from the National Institute for Allergy and Infectious Diseases (NIAID) category A-C pathogens and organisms causing emerging, or re-emerging infectious diseases. The target selection process emphasizes potential biomedical benefits. Selected proteins include known drug targets and their homologs, essential enzymes, virulence factors and vaccine candidates. The Center also provides a structure determination service for the infectious disease scientific community. The ultimate goal is to generate a library of structures that are available to the scientific community and can serve as a starting point for further research and structure aided drug discovery for infectious diseases. To achieve this goal, the CSGID will determine protein crystal structures of 400 proteins and protein-ligand complexes using proven, rapid, highly integrated, and cost-effective methods for such determination, primarily by X-ray crystallography. High throughput crystallographic structure determination is greatly aided by frequent, convenient access to high-performance beamlines at third-generation synchrotron X-ray sources.

  11. Eukaryotic tRNAs fingerprint invertebrates vis-à-vis vertebrates.

    Science.gov (United States)

    Mitra, Sanga; Das, Pijush; Samadder, Arpa; Das, Smarajit; Betai, Rupal; Chakrabarti, Jayprokas

    2015-01-01

    During translation, aminoacyl-tRNA synthetases recognize the identities of the tRNAs to charge them with their respective amino acids. The conserved identities of 58,244 eukaryotic tRNAs of 24 invertebrates and 45 vertebrates in genomic tRNA database were analyzed and their novel features extracted. The internal promoter sequences, namely, A-Box and B-Box, were investigated and evidence gathered that the intervention of optional nucleotides at 17a and 17b correlated with the optimal length of the A-Box. The presence of canonical transcription terminator sequences at the immediate vicinity of tRNA genes was ventured. Even though non-canonical introns had been reported in red alga, green alga, and nucleomorph so far, fairly motivating evidence of their existence emerged in tRNA genes of other eukaryotes. Non-canonical introns were seen to interfere with the internal promoters in two cases, questioning their transcription fidelity. In a first of its kind, phylogenetic constructs based on tRNA molecules delineated and built the trees of the vast and diverse invertebrates and vertebrates. Finally, two tRNA models representing the invertebrates and the vertebrates were drawn, by isolating the dominant consensus in the positional fluctuations of nucleotide compositions.

  12. Structural genomics of infectious disease drug targets: the SSGCID

    International Nuclear Information System (INIS)

    Stacy, Robin; Begley, Darren W.; Phan, Isabelle; Staker, Bart L.; Van Voorhis, Wesley C.; Varani, Gabriele; Buchko, Garry W.; Stewart, Lance J.; Myler, Peter J.

    2011-01-01

    An introduction and overview of the focus, goals and overall mission of the Seattle Structural Genomics Center for Infectious Disease (SSGCID) is given. The Seattle Structural Genomics Center for Infectious Disease (SSGCID) is a consortium of researchers at Seattle BioMed, Emerald BioStructures, the University of Washington and Pacific Northwest National Laboratory that was established to apply structural genomics approaches to drug targets from infectious disease organisms. The SSGCID is currently funded over a five-year period by the National Institute of Allergy and Infectious Diseases (NIAID) to determine the three-dimensional structures of 400 proteins from a variety of Category A, B and C pathogens. Target selection engages the infectious disease research and drug-therapy communities to identify drug targets, essential enzymes, virulence factors and vaccine candidates of biomedical relevance to combat infectious diseases. The protein-expression systems, purified proteins, ligand screens and three-dimensional structures produced by SSGCID constitute a valuable resource for drug-discovery research, all of which is made freely available to the greater scientific community. This issue of Acta Crystallographica Section F, entirely devoted to the work of the SSGCID, covers the details of the high-throughput pipeline and presents a series of structures from a broad array of pathogenic organisms. Here, a background is provided on the structural genomics of infectious disease, the essential components of the SSGCID pipeline are discussed and a survey of progress to date is presented

  13. Gene Composer in a structural genomics environment

    International Nuclear Information System (INIS)

    Lorimer, Don; Raymond, Amy; Mixon, Mark; Burgin, Alex; Staker, Bart; Stewart, Lance

    2011-01-01

    For structural biology applications, protein-construct engineering is guided by comparative sequence analysis and structural information, which allow the researcher to better define domain boundaries for terminal deletions and nonconserved regions for surface mutants. A database software application called Gene Composer has been developed to facilitate construct design. The structural genomics effort at the Seattle Structural Genomics Center for Infectious Disease (SSGCID) requires the manipulation of large numbers of amino-acid sequences and the underlying DNA sequences which are to be cloned into expression vectors. To improve efficiency in high-throughput protein structure determination, a database software package, Gene Composer, has been developed which facilitates the information-rich design of protein constructs and their underlying gene sequences. With its modular workflow design and numerous graphical user interfaces, Gene Composer enables researchers to perform all common bioinformatics steps used in modern structure-guided protein engineering and synthetic gene engineering. An example of the structure determination of H1N1 RNA-dependent RNA polymerase PB2 subunit is given

  14. Evolution of DNA replication protein complexes in eukaryotes and Archaea.

    Directory of Open Access Journals (Sweden)

    Nicholas Chia

    BACKGROUND: The replication of DNA in Archaea and eukaryotes requires several ancillary complexes, including proliferating cell nuclear antigen (PCNA), replication factor C (RFC), and the minichromosome maintenance (MCM) complex. Bacterial DNA replication utilizes comparable proteins, but these are distantly related phylogenetically to their archaeal and eukaryotic counterparts at best. METHODOLOGY/PRINCIPAL FINDINGS: While the structures of each of the complexes do not differ significantly between the archaeal and eukaryotic versions thereof, the evolutionary dynamic in the two cases does. The number of subunits in each complex is constant across all taxa. However, they vary subtly with regard to composition. In some taxa the subunits are all identical in sequence, while in others some are homologous rather than identical. In the case of eukaryotes, there is no phylogenetic variation in the makeup of each complex; all appear to derive from a common eukaryotic ancestor. This is not the case in Archaea, where the relationship between the subunits within each complex varies taxon-to-taxon. We have performed a detailed phylogenetic analysis of these relationships in order to better understand the gene duplications and divergences that gave rise to the homologous subunits in Archaea. CONCLUSION/SIGNIFICANCE: This domain level difference in evolution suggests that different forces have driven the evolution of DNA replication proteins in each of these two domains. In addition, the phylogenies of all three gene families support the distinctiveness of the proposed archaeal phylum Thaumarchaeota.

  15. Structural biology at York Structural Biology Laboratory; laboratory information management systems for structural genomics

    Czech Academy of Sciences Publication Activity Database

    Dohnálek, Jan

    2005-01-01

    Vol. 12, No. 1 (2005), p. 3. ISSN 1211-5894. [Meeting of Structural Biologists /4./. 10.03.2005-12.03.2005, Nové Hrady] R&D Projects: GA MŠk(CZ) 1K05008. Keywords: structural biology * LIMS * structural genomics. Subject RIV: CD - Macromolecular Chemistry

  16. A post-assembly genome-improvement toolkit (PAGIT) to obtain annotated genomes from contigs.

    Science.gov (United States)

    Swain, Martin T; Tsai, Isheng J; Assefa, Samual A; Newbold, Chris; Berriman, Matthew; Otto, Thomas D

    2012-06-07

    Genome projects now produce draft assemblies within weeks owing to advanced high-throughput sequencing technologies. For milestone projects such as Escherichia coli or Homo sapiens, teams of scientists were employed to manually curate and finish these genomes to a high standard. Nowadays, this is not feasible for most projects, and the quality of genomes is generally of a much lower standard. This protocol describes software (PAGIT) that is used to improve the quality of draft genomes. It offers flexible functionality to close gaps in scaffolds, correct base errors in the consensus sequence and exploit reference genomes (if available) in order to improve scaffolding and generate annotations. The protocol is most accessible for bacterial and small eukaryotic genomes (up to 300 Mb), such as pathogenic bacteria, malaria and parasitic worms. Applying PAGIT to an E. coli assembly takes ∼24 h: it doubles the average contig size and annotates over 4,300 gene models.

  17. Structure-based inference of molecular functions of proteins of unknown function from Berkeley Structural Genomics Center

    Energy Technology Data Exchange (ETDEWEB)

    Kim, Sung-Hou; Shin, Dong Hae; Hou, Jingtong; Chandonia, John-Marc; Das, Debanu; Choi, In-Geol; Kim, Rosalind; Kim, Sung-Hou

    2007-09-02

    Advances in sequence genomics have resulted in an accumulation of a huge number of protein sequences derived from genome sequences. However, the functions of a large portion of them cannot be inferred based on the current methods of sequence homology detection to proteins of known functions. Three-dimensional structure can have an important impact in providing inference of molecular function (physical and chemical function) of a protein of unknown function. Structural genomics centers worldwide have been determining many 3-D structures of the proteins of unknown functions, and possible molecular functions of them have been inferred based on their structures. Combined with bioinformatics and enzymatic assay tools, the successful acceleration of the process of protein structure determination through high throughput pipelines enables the rapid functional annotation of a large fraction of hypothetical proteins. We present a brief summary of the process we used at the Berkeley Structural Genomics Center to infer molecular functions of proteins of unknown function.

  18. From structure prediction to genomic screens for novel non-coding RNAs

    DEFF Research Database (Denmark)

    Gorodkin, Jan; Hofacker, Ivo L.

    2011-01-01

    Abstract: Non-coding RNAs (ncRNAs) are receiving more and more attention not only as an abundant class of genes, but also as regulatory structural elements (some located in mRNAs). A key feature of RNA function is its structure. Computational methods were developed early for folding and prediction....... This and the increased amount of available genomes have made it possible to employ structure-based methods for genomic screens. The field has moved from folding prediction of single sequences to computational screens for ncRNAs in genomic sequence using the RNA structure as the main characteristic feature. Whereas early...... upon some of the concepts in current methods that have been applied in genomic screens for de novo RNA structures in searches for novel ncRNA genes and regulatory RNA structure on mRNAs. We discuss the strengths and weaknesses of the different strategies and how they can complement each other....

  19. Waggawagga-CLI: A command-line tool for predicting stable single α-helices (SAH-domains), and the SAH-domain distribution across eukaryotes.

    Directory of Open Access Journals (Sweden)

    Dominic Simm

    Stable single-alpha helices (SAH-domains) function as rigid connectors and constant force springs between structural domains, and can provide contact surfaces for protein-protein and protein-RNA interactions. SAH-domains mainly consist of charged amino acids and are monomeric and stable in polar solutions, characteristics which distinguish them from coiled-coil domains and intrinsically disordered regions. Although the number of reported SAH-domains is steadily increasing, genome-wide analyses of SAH-domains in eukaryotic genomes are still missing. Here, we present Waggawagga-CLI, a command-line tool for predicting and analysing SAH-domains in protein sequence datasets. Using Waggawagga-CLI we predicted SAH-domains in 24 datasets from eukaryotes across the tree of life. SAH-domains were predicted in 0.5 to 3.5% of the protein-coding content per species. SAH-domains are particularly present in longer proteins, supporting their function as structural building blocks in multi-domain proteins. In human, SAH-domains are mainly used as alternative building blocks not being present in all transcripts of a gene. Gene ontology analysis showed that yeast proteins with SAH-domains are particularly enriched in macromolecular complex subunit organization, cellular component biogenesis and RNA metabolic processes, and that they have a strong nuclear and ribonucleoprotein complex localization and function in ribosome and nucleic acid binding. Human proteins with SAH-domains have roles in all types of RNA processing and cytoskeleton organization, and are predicted to function in RNA binding, protein binding involved in cell and cell-cell adhesion, and cytoskeletal protein binding. Waggawagga-CLI allows the user to adjust the stabilizing and destabilizing contribution of amino acid interactions in i,i+3 and i,i+4 spacings, and provides extensive flexibility for user-designed analyses.
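
    The abstract mentions stabilizing and destabilizing contributions of amino acid interactions at i,i+3 and i,i+4 spacings. The sketch below is not the Waggawagga-CLI scoring scheme; it is a deliberately simplified illustration that rewards oppositely charged residue pairs and penalizes like-charged pairs at those two spacings. The weights `w3` and `w4`, the function names and the example peptides are assumptions.

```python
POSITIVE = set("KR")   # lysine, arginine
NEGATIVE = set("DE")   # aspartate, glutamate


def charge(residue: str) -> int:
    """Crude side-chain charge: +1, -1, or 0 for everything else."""
    if residue in POSITIVE:
        return 1
    if residue in NEGATIVE:
        return -1
    return 0


def spacing_score(seq: str, w3: float = 0.5, w4: float = 1.0) -> float:
    """Sum pair contributions at i,i+3 and i,i+4.

    Oppositely charged pairs add the spacing weight (treated as stabilizing),
    like-charged pairs subtract it (treated as destabilizing).
    The weights are hypothetical, not fitted values.
    """
    seq = seq.upper()
    score = 0.0
    for spacing, weight in ((3, w3), (4, w4)):
        for i in range(len(seq) - spacing):
            product = charge(seq[i]) * charge(seq[i + spacing])
            score -= weight * product  # product is -1 for opposite charges
    return score


print(spacing_score("EAAAK"))  # one opposite-charge pair at i,i+4
print(spacing_score("EAAAE"))  # one like-charge pair at i,i+4
```

    A realistic predictor would score windows along a protein and distinguish residue identities and pair orientation; this example only shows how adjustable per-spacing weights enter such a score.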

  20. Microbial eukaryote plankton communities of high-mountain lakes from three continents exhibit strong biogeographic patterns.

    Science.gov (United States)

    Filker, Sabine; Sommaruga, Ruben; Vila, Irma; Stoeck, Thorsten

    2016-05-01

    Microbial eukaryotes hold a key role in aquatic ecosystem functioning. Yet, their diversity in freshwater lakes, particularly in high-mountain lakes, is relatively unknown compared with the marine environment. Low nutrient availability, low water temperature and high ultraviolet radiation make most high-mountain lakes extremely challenging habitats for life and require specific molecular and physiological adaptations. We therefore expected that these ecosystems support a plankton diversity that differs notably from other freshwater lakes. In addition, we hypothesized that the communities under study exhibit geographic structuring. Our rationale was that geographic dispersal of small-sized eukaryotes in high-mountain lakes over continental distances seems difficult. We analysed hypervariable V4 fragments of the SSU rRNA gene to compare the genetic microbial eukaryote diversity in high-mountain lakes located in the European Alps, the Chilean Altiplano and the Ethiopian Bale Mountains. Microbial eukaryotes were not globally distributed corroborating patterns found for bacteria, multicellular animals and plants. Instead, the plankton community composition emerged as a highly specific fingerprint of a geographic region even on higher taxonomic levels. The intraregional heterogeneity of the investigated lakes was mirrored in shifts in microbial eukaryote community structure, which, however, was much less pronounced compared with interregional beta-diversity. Statistical analyses revealed that on a regional scale, environmental factors are strong predictors for plankton community structures in high-mountain lakes. While on long-distance scales (>10 000 km), isolation by distance is the most plausible scenario, on intermediate scales (up to 6000 km), both contemporary environmental factors and historical contingencies interact to shift plankton community structures. © 2016 John Wiley & Sons Ltd.
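
    The abstract compares intraregional heterogeneity with interregional beta-diversity but does not name the dissimilarity measure used. As one common choice, and purely as an assumption for illustration, the Bray-Curtis dissimilarity between two community abundance tables can be sketched as follows; the lake names and counts are invented.

```python
def bray_curtis(a: dict, b: dict) -> float:
    """Bray-Curtis dissimilarity between two communities.

    a and b map taxon name -> non-negative abundance (e.g. read counts).
    Returns 0.0 for identical communities and 1.0 when no taxon is shared.
    """
    taxa = set(a) | set(b)
    shared = sum(min(a.get(t, 0), b.get(t, 0)) for t in taxa)
    total = sum(a.values()) + sum(b.values())
    if total == 0:
        raise ValueError("both communities are empty")
    return 1 - 2 * shared / total


# Hypothetical abundance tables for three lakes in two regions.
alps_1 = {"taxonA": 60, "taxonB": 40}
alps_2 = {"taxonA": 50, "taxonB": 50}
bale_1 = {"taxonC": 70, "taxonB": 30}

print(bray_curtis(alps_1, alps_2))  # within-region comparison
print(bray_curtis(alps_1, bale_1))  # between-region comparison
```

    With real amplicon data the counts would normally be rarefied or otherwise normalized before comparison, a step omitted here.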

  1. Structure of a eukaryotic CLC transporter defines an intermediate state in the transport cycle

    Science.gov (United States)

    Feng, Liang; Campbell, Ernest B.; Hsiung, Yichun; MacKinnon, Roderick

    2011-01-01

    CLC proteins transport Cl− ions across cell membranes to control the electrical potential of muscle cells, transfer electrolytes across epithelia, and control the pH and electrolyte composition of intracellular organelles. Some members of this protein family are Cl− ion channels, while others are secondary active transporters that exchange Cl− ions and H+ with a 2:1 stoichiometry. We have determined the structure of a eukaryotic CLC transporter at 3.5 Å resolution. Cytoplasmic CBS domains are strategically positioned to regulate the ion transport pathway, and many disease-causing mutations in human CLCs reside on the CBS-transmembrane interface. Comparison with prokaryotic CLC shows that a gating glutamate changes conformation and suggests a basis for 2:1 Cl−/H+ exchange and a simple mechanistic connection between CLC channels and transporters. PMID:20929736

  2. Mutant power: using mutant allele collections for yeast functional genomics.

    Science.gov (United States)

    Norman, Kaitlyn L; Kumar, Anuj

    2016-03-01

    The budding yeast has long served as a model eukaryote for the functional genomic analysis of highly conserved signaling pathways, cellular processes and mechanisms underlying human disease. The collection of reagents available for genomics in yeast is extensive, encompassing a growing diversity of mutant collections beyond gene deletion sets in the standard wild-type S288C genetic background. We review here three main types of mutant allele collections: transposon mutagen collections, essential gene collections and overexpression libraries. Each collection provides unique and identifiable alleles that can be utilized in genome-wide, high-throughput studies. These genomic reagents are particularly informative in identifying synthetic phenotypes and functions associated with essential genes, including those modeled most effectively in complex genetic backgrounds. Several examples of genomic studies in filamentous/pseudohyphal backgrounds are provided here to illustrate this point. Additionally, the limitations of each approach are examined. Collectively, these mutant allele collections in Saccharomyces cerevisiae and the related pathogenic yeast Candida albicans promise insights toward an advanced understanding of eukaryotic molecular and cellular biology. © The Author 2015. Published by Oxford University Press. All rights reserved. For permissions, please email: journals.permissions@oup.com.

  3. Inverse PCR-based method for isolating novel SINEs from genome.

    Science.gov (United States)

    Han, Yawei; Chen, Liping; Guan, Lihong; He, Shunping

    2014-04-01

    Short interspersed elements (SINEs) are moderately repetitive DNA sequences in eukaryotic genomes. Although eukaryotic genomes contain numerous SINE copies, isolating and identifying them by previously reported methods is difficult and laborious. In this study, inverse PCR was successfully applied to isolate SINEs from the genome of Opsariichthys bidens, an East Asian cyprinid. A group of SINEs derived from tRNA(Ala) molecules was identified and named Opsar, after Opsariichthys. Opsar exhibits the characteristic SINE structure: a tRNA(Ala)-derived region at the 5' end, a tRNA-unrelated region, and an AT-rich region at the 3' end. The tRNA-derived region of Opsar shares 76% sequence similarity with the tRNA(Ala) gene, indicating that Opsar could have derived from an inactive tRNA(Ala) gene or pseudogene. The reliability of the method was tested by obtaining C-SINE, Ct-SINE, and M-SINEs from the Ctenopharyngodon idellus, Megalobrama amblycephala, and Cyprinus carpio genomes. The method is simpler than those previously reported because it omits many steps, such as preparation of probes, construction of genomic libraries, and hybridization.

  4. Structured RNAs and synteny regions in the pig genome

    DEFF Research Database (Denmark)

    Anthon, Christian; Tafer, Hakim; Havgaard, Jakob H

    2014-01-01

    BACKGROUND: Annotating mammalian genomes for noncoding RNAs (ncRNAs) is nontrivial since far from all ncRNAs are known and the computational models are resource demanding. Currently, the human genome holds the best mammalian ncRNA annotation, a result of numerous efforts by several groups. However......, a more direct strategy is desired for the increasing number of sequenced mammalian genomes of which some, such as the pig, are relevant as disease models and production animals. RESULTS: We present a comprehensive annotation of structured RNAs in the pig genome. Combining sequence and structure...... lncRNA loci, 11 conflicts of annotation, and 3,183 ncRNA genes. The ncRNA genes comprise 359 miRNAs, 8 ribozymes, 185 rRNAs, 638 snoRNAs, 1,030 snRNAs, 810 tRNAs and 153 ncRNA genes not belonging to the aforementioned classes. When running the pipeline on a local shuffled version of the genome...

  5. The Impact of Chromatin Dynamics on Cas9-Mediated Genome Editing in Human Cells.

    Science.gov (United States)

    Daer, René M; Cutts, Josh P; Brafman, David A; Haynes, Karmella A

    2017-03-17

    In order to efficiently edit eukaryotic genomes, it is critical to test the impact of chromatin dynamics on CRISPR/Cas9 function and develop strategies to adapt the system to eukaryotic contexts. So far, research has extensively characterized the relationship between the CRISPR endonuclease Cas9 and the composition of the RNA-DNA duplex that mediates the system's precision. Evidence suggests that chromatin modifications and DNA packaging can block eukaryotic genome editing by custom-built DNA endonucleases like Cas9; however, the underlying mechanism of Cas9 inhibition is unclear. Here, we demonstrate that closed, gene-silencing-associated chromatin is a mechanism for the interference of Cas9-mediated DNA editing. Our assays use a transgenic cell line with a drug-inducible switch to control chromatin states (open and closed) at a single genomic locus. We show that closed chromatin inhibits binding and editing at specific target sites and that artificial reversal of the silenced state restores editing efficiency. These results provide new insights to improve Cas9-mediated editing in human and other mammalian cells.

  6. Structural variation in two human genomes mapped at single-nucleotide resolution by whole genome de novo assembly

    DEFF Research Database (Denmark)

    Li, Yingrui; Zheng, Hancheng; Luo, Ruibang

    2011-01-01

    Here we use whole-genome de novo assembly of second-generation sequencing reads to map structural variation (SV) in an Asian genome and an African genome. Our approach identifies small- and intermediate-size homozygous variants (1-50 kb) including insertions, deletions, inversions and their precise...

  7. From structure to mechanism-understanding initiation of DNA replication.

    Science.gov (United States)

    Riera, Alberto; Barbon, Marta; Noguchi, Yasunori; Reuter, L Maximilian; Schneider, Sarah; Speck, Christian

    2017-06-01

    DNA replication results in the doubling of the genome prior to cell division. This process requires the assembly of 50 or more protein factors into a replication fork. Here, we review recent structural and biochemical insights that start to explain how specific proteins recognize DNA replication origins, load the replicative helicase on DNA, unwind DNA, synthesize new DNA strands, and reassemble chromatin. We focus on the minichromosome maintenance (MCM2-7) proteins, which form the core of the eukaryotic replication fork, as this complex undergoes major structural rearrangements in order to engage with DNA, regulate its DNA-unwinding activity, and maintain genome stability. © 2017 Riera et al.; Published by Cold Spring Harbor Laboratory Press.

  8. RPAN: rice pan-genome browser for ∼3000 rice genomes.

    Science.gov (United States)

    Sun, Chen; Hu, Zhiqiang; Zheng, Tianqing; Lu, Kuangchen; Zhao, Yue; Wang, Wensheng; Shi, Jianxin; Wang, Chunchao; Lu, Jinyuan; Zhang, Dabing; Li, Zhikang; Wei, Chaochun

    2017-01-25

    A pan-genome is the union of the gene sets of all the individuals of a clade or a species and it provides a new dimension of genome complexity with the presence/absence variations (PAVs) of genes among these genomes. With the progress of sequencing technologies, pan-genome study is becoming affordable for eukaryotes with large-sized genomes. The Asian cultivated rice, Oryza sativa L., is one of the major food sources for the world and a model organism in plant biology. Recently, the 3000 Rice Genome Project (3K RGP) sequenced more than 3000 rice genomes with a mean sequencing depth of 14.3×, which provided a tremendous resource for rice research. In this paper, we present a genome browser, Rice Pan-genome Browser (RPAN), as a tool to search and visualize the rice pan-genome derived from 3K RGP. RPAN contains a database of the basic information of 3010 rice accessions, including genomic sequences, gene annotations, PAV information and gene expression data of the rice pan-genome. At least 12 000 novel genes absent in the reference genome were included. RPAN also provides multiple search and visualization functions. RPAN can be a rich resource for rice biology and rice breeding. It is available at http://cgm.sjtu.edu.cn/3kricedb/ or http://www.rmbreeding.cn/pan3k. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
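
    The definition in the abstract above (a pan-genome as the union of per-individual gene sets, with presence/absence variations recorded per gene) can be illustrated with a minimal Python sketch. The accession names, gene identifiers and function names below are hypothetical and are not taken from RPAN or the 3K RGP data; the core-genome helper is an added illustration, not something the abstract describes.

```python
from typing import Dict, List, Set, Tuple


def pan_genome(gene_sets: Dict[str, Set[str]]) -> Set[str]:
    """Return the union of the gene sets of all accessions."""
    pan: Set[str] = set()
    for genes in gene_sets.values():
        pan |= genes
    return pan


def core_genome(gene_sets: Dict[str, Set[str]]) -> Set[str]:
    """Return the genes present in every accession (illustrative helper)."""
    sets = list(gene_sets.values())
    return set.intersection(*sets) if sets else set()


def pav_matrix(gene_sets: Dict[str, Set[str]]) -> Tuple[List[str], Dict[str, List[int]]]:
    """Build a presence/absence table: 1 if the accession carries the gene, else 0.

    Columns follow the sorted pan-genome gene order, which is returned as well.
    """
    genes = sorted(pan_genome(gene_sets))
    rows = {
        accession: [1 if gene in present else 0 for gene in genes]
        for accession, present in gene_sets.items()
    }
    return genes, rows


# Hypothetical toy input: three accessions with made-up gene identifiers.
toy_accessions = {
    "acc1": {"geneA", "geneB"},
    "acc2": {"geneB", "geneC"},
    "acc3": {"geneB"},
}
toy_genes, toy_rows = pav_matrix(toy_accessions)
```

    In this toy input, a gene such as geneC that is absent from the first accession would be counted in the pan-genome but not in the core set.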

  9. The Plasmodiophora brassicae genome reveals insights in its life cycle and ancestry of chitin synthases.

    Science.gov (United States)

    Schwelm, Arne; Fogelqvist, Johan; Knaust, Andrea; Jülke, Sabine; Lilja, Tua; Bonilla-Rosso, German; Karlsson, Magnus; Shevchenko, Andrej; Dhandapani, Vignesh; Choi, Su Ryun; Kim, Hong Gi; Park, Ju Young; Lim, Yong Pyo; Ludwig-Müller, Jutta; Dixelius, Christina

    2015-06-18

    Plasmodiophora brassicae causes clubroot, a major disease of Brassica oil and vegetable crops worldwide. P. brassicae is a Plasmodiophorid, obligate biotrophic protist in the eukaryotic kingdom of Rhizaria. Here we present the 25.5 Mb genome draft of P. brassicae, developmental stage-specific transcriptomes and a transcriptome of Spongospora subterranea, the Plasmodiophorid causing powdery scab on potato. Like other biotrophic pathogens both Plasmodiophorids are reduced in metabolic pathways. Phytohormones contribute to the gall phenotypes of infected roots. We report a protein (PbGH3) that can modify auxin and jasmonic acid. Plasmodiophorids contain chitin in cell walls of the resilient resting spores. If recognized, chitin can trigger defense responses in plants. Interestingly, chitin-related enzymes of Plasmodiophorids built specific families and the carbohydrate/chitin binding (CBM18) domain is enriched in the Plasmodiophorid secretome. Plasmodiophorids chitin synthases belong to two families, which were present before the split of the eukaryotic Stramenopiles/Alveolates/Rhizaria/Plantae and Metazoa/Fungi/Amoebozoa megagroups, suggesting chitin synthesis to be an ancient feature of eukaryotes. This exemplifies the importance of genomic data from unexplored eukaryotic groups, such as the Plasmodiophorids, to decipher evolutionary relationships and gene diversification of early eukaryotes.

  10. From structure prediction to genomic screens for novel non-coding RNAs.

    Science.gov (United States)

    Gorodkin, Jan; Hofacker, Ivo L

    2011-08-01

    Non-coding RNAs (ncRNAs) are receiving more and more attention not only as an abundant class of genes, but also as regulatory structural elements (some located in mRNAs). A key feature of RNA function is its structure. Computational methods were developed early for folding and prediction of RNA structure with the aim of assisting in functional analysis. With the discovery of more and more ncRNAs, it has become clear that a large fraction of these are highly structured. Interestingly, a large part of the structure is comprised of regular Watson-Crick and GU wobble base pairs. This and the increased amount of available genomes have made it possible to employ structure-based methods for genomic screens. The field has moved from folding prediction of single sequences to computational screens for ncRNAs in genomic sequence using the RNA structure as the main characteristic feature. Whereas early methods focused on energy-directed folding of single sequences, comparative analysis based on structure preserving changes of base pairs has been efficient in improving accuracy, and today this constitutes a key component in genomic screens. Here, we cover the basic principles of RNA folding and touch upon some of the concepts in current methods that have been applied in genomic screens for de novo RNA structures in searches for novel ncRNA genes and regulatory RNA structure on mRNAs. We discuss the strengths and weaknesses of the different strategies and how they can complement each other.
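
    As a rough illustration of single-sequence folding over Watson-Crick and GU wobble pairs, the sketch below uses Nussinov-style base-pair maximization. This is a deliberately simplified stand-in: it counts pairs rather than minimizing free energy, so it is not the energy-directed folding the review refers to, and the function name and the minimum hairpin loop length of 3 are assumptions made for this example.

```python
# Allowed pairs: Watson-Crick (A-U, G-C) plus the G-U wobble pair.
PAIRS = {
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
    ("G", "U"), ("U", "G"),
}


def max_base_pairs(seq: str, min_loop: int = 3) -> int:
    """Maximum number of nested base pairs in seq (Nussinov-style DP).

    min_loop is the minimum number of unpaired bases enclosed by any pair.
    """
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] holds the best pair count for the subsequence seq[i..j].
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: base j stays unpaired
            # case 2: base j pairs with some k, leaving at least min_loop bases between
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in PAIRS:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]
```

    For the toy sequence GGGAAAUCC, the function finds three stacked pairs (two G-C pairs and one G-U wobble) closing an AAA loop. Genomic screens of the kind described above add comparative evidence on top of such single-sequence predictions.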

  11. The impact of genomics on research in diversity and evolution of archaea.

    Science.gov (United States)

    Mardanov, A V; Ravin, N V

    2012-08-01

    Since the definition of archaea as a separate domain of life along with bacteria and eukaryotes, they have become one of the most interesting objects of modern microbiology, molecular biology, and biochemistry. Sequencing and analysis of archaeal genomes were especially important for studies on archaea because of a limited availability of genetic tools for the majority of these microorganisms and problems associated with their cultivation. Fifteen years since the publication of the first genome of an archaeon, more than one hundred complete genome sequences of representatives of different phylogenetic groups have been determined. Analysis of these genomes has expanded our knowledge of biology of archaea, their diversity and evolution, and allowed identification and characterization of new deep phylogenetic lineages of archaea. The development of genome technologies has allowed sequencing the genomes of uncultivated archaea directly from enrichment cultures, metagenomic samples, and even from single cells. Insights have been gained into the evolution of key biochemical processes in archaea, such as cell division and DNA replication, the role of horizontal gene transfer in the evolution of archaea, and new relationships between archaea and eukaryotes have been revealed.

  12. Functional RNA structures throughout the Hepatitis C Virus genome.

    Science.gov (United States)

    Adams, Rebecca L; Pirakitikulr, Nathan; Pyle, Anna Marie

    2017-06-01

    The single-stranded Hepatitis C Virus (HCV) genome adopts a set of elaborate RNA structures that are involved in every stage of the viral lifecycle. Recent advances in chemical probing, sequencing, and structural biology have facilitated analysis of RNA folding on a genome-wide scale, revealing novel structures and networks of interactions. These studies have underscored the active role played by RNA in every function of HCV and they open the door to new types of RNA-targeted therapeutics. Copyright © 2017 Elsevier B.V. All rights reserved.

  13. Genome3D: a UK collaborative project to annotate genomic sequences with predicted 3D structures based on SCOP and CATH domains.

    Science.gov (United States)

    Lewis, Tony E; Sillitoe, Ian; Andreeva, Antonina; Blundell, Tom L; Buchan, Daniel W A; Chothia, Cyrus; Cuff, Alison; Dana, Jose M; Filippis, Ioannis; Gough, Julian; Hunter, Sarah; Jones, David T; Kelley, Lawrence A; Kleywegt, Gerard J; Minneci, Federico; Mitchell, Alex; Murzin, Alexey G; Ochoa-Montaño, Bernardo; Rackham, Owen J L; Smith, James; Sternberg, Michael J E; Velankar, Sameer; Yeats, Corin; Orengo, Christine

    2013-01-01

    Genome3D, available at http://www.genome3d.eu, is a new collaborative project that integrates UK-based structural resources to provide a unique perspective on sequence-structure-function relationships. Leading structure prediction resources (DomSerf, FUGUE, Gene3D, pDomTHREADER, Phyre and SUPERFAMILY) provide annotations for UniProt sequences to indicate the locations of structural domains (structural annotations) and their 3D structures (structural models). Structural annotations and 3D model predictions are currently available for three model genomes (Homo sapiens, E. coli and baker's yeast), and the project will extend to other genomes in the near future. As these resources exploit different strategies for predicting structures, the main aim of Genome3D is to enable comparisons between all the resources so that biologists can see where predictions agree and are therefore more trusted. Furthermore, as these methods differ in whether they build their predictions using CATH or SCOP, Genome3D also contains the first official mapping between these two databases. This has identified pairs of similar superfamilies from the two resources at various degrees of consensus (532 bronze pairs, 527 silver pairs and 370 gold pairs).

  14. Footprinting analysis of interactions between the largest eukaryotic RNase P/MRP protein Pop1 and RNase P/MRP RNA components.

    Science.gov (United States)

    Fagerlund, Robert D; Perederina, Anna; Berezin, Igor; Krasilnikov, Andrey S

    2015-09-01

    Ribonuclease (RNase) P and RNase MRP are closely related catalytic ribonucleoproteins involved in the metabolism of a wide range of RNA molecules, including tRNA, rRNA, and some mRNAs. The catalytic RNA component of eukaryotic RNase P retains the core elements of the bacterial RNase P ribozyme; however, the peripheral RNA elements responsible for the stabilization of the global architecture are largely absent in the eukaryotic enzyme. At the same time, the protein makeup of eukaryotic RNase P is considerably more complex than that of the bacterial RNase P. RNase MRP, an essential and ubiquitous eukaryotic enzyme, has a structural organization resembling that of eukaryotic RNase P, and the two enzymes share most of their protein components. Here, we present the results of the analysis of interactions between the largest protein component of yeast RNases P/MRP, Pop1, and the RNA moieties of the enzymes, discuss structural implications of the results, and suggest that Pop1 plays the role of a scaffold for the stabilization of the global architecture of eukaryotic RNase P RNA, substituting for the network of RNA-RNA tertiary interactions that maintain the global RNA structure in bacterial RNase P. © 2015 Fagerlund et al.; Published by Cold Spring Harbor Laboratory Press for the RNA Society.

  15. How MCM loading and spreading specify eukaryotic DNA replication initiation sites.

    Science.gov (United States)

    Hyrien, Olivier

    2016-01-01

    DNA replication origins strikingly differ between eukaryotic species and cell types. Origins are localized and can be highly efficient in budding yeast, are randomly located in early fly and frog embryos, which do not transcribe their genomes, and are clustered in broad (10-100 kb) non-transcribed zones, frequently abutting transcribed genes, in mammalian cells. Nonetheless, in all cases, origins are established during the G1-phase of the cell cycle by the loading of double hexamers of the Mcm 2-7 proteins (MCM DHs), the core of the replicative helicase. MCM DH activation in S-phase leads to origin unwinding, polymerase recruitment, and initiation of bidirectional DNA synthesis. Although MCM DHs are initially loaded at sites defined by the binding of the origin recognition complex (ORC), they ultimately bind chromatin in much greater numbers than ORC and only a fraction are activated in any one S-phase. Data suggest that the multiplicity and functional redundancy of MCM DHs provide robustness to the replication process and affect replication time and that MCM DHs can slide along the DNA and spread over large distances around the ORC. Recent studies further show that MCM DHs are displaced along the DNA by collision with transcription complexes but remain functional for initiation after displacement. Therefore, eukaryotic DNA replication relies on intrinsically mobile and flexible origins, a strategy fundamentally different from bacteria but conserved from yeast to human. These properties of MCM DHs likely contribute to the establishment of broad, intergenic replication initiation zones in higher eukaryotes.

  16. Snapshot of the eukaryotic gene expression in muskoxen rumen--a metatranscriptomic approach.

    Directory of Open Access Journals (Sweden)

    Meng Qi

    BACKGROUND: Herbivores rely on digestive tract lignocellulolytic microorganisms, including bacteria, fungi and protozoa, to derive energy and carbon from plant cell wall polysaccharides. Culture independent metagenomic studies have been used to reveal the genetic content of the bacterial species within gut microbiomes. However, the nature of the genes encoded by eukaryotic protozoa and fungi within these environments has not been explored using metagenomic or metatranscriptomic approaches. METHODOLOGY/PRINCIPAL FINDINGS: In this study, a metatranscriptomic approach was used to investigate the functional diversity of the eukaryotic microorganisms within the rumen of muskoxen (Ovibos moschatus), with a focus on plant cell wall degrading enzymes. Polyadenylated RNA (mRNA) was sequenced on the Illumina Genome Analyzer II system and 2.8 gigabases of sequences were obtained and 59129 contigs assembled. Plant cell wall degrading enzyme modules including glycoside hydrolases, carbohydrate esterases and polysaccharide lyases were identified from over 2500 contigs. These included a number of glycoside hydrolase family 6 (GH6), GH48 and swollenin modules, which have rarely been described in previous gut metagenomic studies. CONCLUSIONS/SIGNIFICANCE: The muskoxen rumen metatranscriptome demonstrates a much higher percentage of cellulase enzyme discovery and an 8.7x higher rate of total carbohydrate active enzyme discovery per gigabase of sequence than previous rumen metagenomes. This study provides a snapshot of eukaryotic gene expression in the muskoxen rumen, and identifies a number of candidate genes coding for potentially valuable lignocellulolytic enzymes.

  17. Snapshot of the Eukaryotic Gene Expression in Muskoxen Rumen—A Metatranscriptomic Approach

    Science.gov (United States)

    O'Toole, Nicholas; Barboza, Perry S.; Ungerfeld, Emilio; Leigh, Mary Beth; Selinger, L. Brent; Butler, Greg; Tsang, Adrian; McAllister, Tim A.; Forster, Robert J.

    2011-01-01

    Background: Herbivores rely on digestive tract lignocellulolytic microorganisms, including bacteria, fungi and protozoa, to derive energy and carbon from plant cell wall polysaccharides. Culture independent metagenomic studies have been used to reveal the genetic content of the bacterial species within gut microbiomes. However, the nature of the genes encoded by eukaryotic protozoa and fungi within these environments has not been explored using metagenomic or metatranscriptomic approaches. Methodology/Principal Findings: In this study, a metatranscriptomic approach was used to investigate the functional diversity of the eukaryotic microorganisms within the rumen of muskoxen (Ovibos moschatus), with a focus on plant cell wall degrading enzymes. Polyadenylated RNA (mRNA) was sequenced on the Illumina Genome Analyzer II system and 2.8 gigabases of sequences were obtained and 59129 contigs assembled. Plant cell wall degrading enzyme modules including glycoside hydrolases, carbohydrate esterases and polysaccharide lyases were identified from over 2500 contigs. These included a number of glycoside hydrolase family 6 (GH6), GH48 and swollenin modules, which have rarely been described in previous gut metagenomic studies. Conclusions/Significance: The muskoxen rumen metatranscriptome demonstrates a much higher percentage of cellulase enzyme discovery and an 8.7x higher rate of total carbohydrate active enzyme discovery per gigabase of sequence than previous rumen metagenomes. This study provides a snapshot of eukaryotic gene expression in the muskoxen rumen, and identifies a number of candidate genes coding for potentially valuable lignocellulolytic enzymes. PMID:21655220

  18. Statistical properties of thermodynamically predicted RNA secondary structures in viral genomes

    Science.gov (United States)

    Spanò, M.; Lillo, F.; Miccichè, S.; Mantegna, R. N.

    2008-10-01

    By performing a comprehensive study on 1832 segments of 1212 complete genomes of viruses, we show that in viral genomes the hairpin structures of thermodynamically predicted RNA secondary structures are more abundant than expected under a simple random null hypothesis. The detected hairpin structures of RNA secondary structures are present both in coding and in noncoding regions for the four groups of viruses categorized as dsDNA, dsRNA, ssDNA and ssRNA. For all groups, hairpin structures of RNA secondary structures are detected more frequently than expected for a random null hypothesis in noncoding rather than in coding regions. However, potential RNA secondary structures are also present in coding regions of dsDNA group. In fact, we detect evolutionary conserved RNA secondary structures in conserved coding and noncoding regions of a large set of complete genomes of dsDNA herpesviruses.

  19. Structure of a Eukaryotic CLC Transporter Defines an Intermediate State in the Transport Cycle

    International Nuclear Information System (INIS)

    Feng, Liang; Campbell, Ernest B.; Hsiung, Yichun; MacKinnon, Roderick

    2010-01-01

    CLC proteins transport chloride (Cl−) ions across cell membranes to control the electrical potential of muscle cells, transfer electrolytes across epithelia, and control the pH and electrolyte composition of intracellular organelles. Some members of this protein family are Cl− ion channels, whereas others are secondary active transporters that exchange Cl− ions and protons (H+) with a 2:1 stoichiometry. We have determined the structure of a eukaryotic CLC transporter at 3.5 angstrom resolution. Cytoplasmic cystathionine beta-synthase (CBS) domains are strategically positioned to regulate the ion-transport pathway, and many disease-causing mutations in human CLCs reside on the CBS-transmembrane interface. Comparison with prokaryotic CLC shows that a gating glutamate residue changes conformation and suggests a basis for 2:1 Cl−/H+ exchange and a simple mechanistic connection between CLC channels and transporters.

  20. From structure prediction to genomic screens for novel non-coding RNAs.

    Directory of Open Access Journals (Sweden)

    Jan Gorodkin

    2011-08-01

    Non-coding RNAs (ncRNAs) are receiving more and more attention not only as an abundant class of genes, but also as regulatory structural elements (some located in mRNAs). A key feature of RNA function is its structure. Computational methods were developed early for folding and prediction of RNA structure with the aim of assisting in functional analysis. With the discovery of more and more ncRNAs, it has become clear that a large fraction of these are highly structured. Interestingly, a large part of the structure is comprised of regular Watson-Crick and GU wobble base pairs. This and the increased amount of available genomes have made it possible to employ structure-based methods for genomic screens. The field has moved from folding prediction of single sequences to computational screens for ncRNAs in genomic sequence using the RNA structure as the main characteristic feature. Whereas early methods focused on energy-directed folding of single sequences, comparative analysis based on structure preserving changes of base pairs has been efficient in improving accuracy, and today this constitutes a key component in genomic screens. Here, we cover the basic principles of RNA folding and touch upon some of the concepts in current methods that have been applied in genomic screens for de novo RNA structures in searches for novel ncRNA genes and regulatory RNA structure on mRNAs. We discuss the strengths and weaknesses of the different strategies and how they can complement each other.

  1. MCM Paradox: Abundance of Eukaryotic Replicative Helicases and Genomic Integrity.

    Science.gov (United States)

    Das, Mitali; Singh, Sunita; Pradhan, Satyajit; Narayan, Gopeshwar

    2014-01-01

    As a crucial component of the DNA replication licensing system, the minichromosome maintenance (MCM) 2-7 complex acts as the eukaryotic replicative DNA helicase. The six related MCM proteins form a heterohexamer and bind with ORC, CDC6, and Cdt1 to form the prereplication complex. Although the MCMs are well known as replicative helicases, their overabundance and distribution patterns on chromatin present a paradox called the "MCM paradox." Several approaches have been taken to solve the MCM paradox and to explain the purpose of excess MCMs distributed beyond the replication origins. Functions for these MCMs other than as helicases have also been proposed. This review focuses on several models and concepts developed to reconcile the MCM paradox with helicase function and provides insight into the concept that excess MCMs license dormant origins as a backup during replication stress. Finally, we consider the effect of altered MCM levels. Although excess MCM is needed for normal cells to withstand stress, the threshold level in normal and malignant cells must be delineated. The review also outlines future prospects for better understanding MCM biology.

  2. Phylogenetic tree based on complete genomes using fractal and correlation analyses without sequence alignment

    Directory of Open Access Journals (Sweden)

    Zu-Guo Yu

    2006-06-01

    The complete genomes of living organisms have provided much information on their phylogenetic relationships. Similarly, the complete genomes of chloroplasts have helped resolve the evolution of this organelle in photosynthetic eukaryotes. In this review, we describe two algorithms to construct phylogenetic trees based on the theories of fractals and dynamic language using complete genomes. These algorithms were developed by our research group in the past few years. Our distance-based phylogenetic tree of 109 prokaryotes and eukaryotes agrees with the biologists' "tree of life" based on the 16S-like rRNA genes in a majority of basic branchings and most lower taxa. Our phylogenetic analysis also shows that the chloroplast genomes are separated into two major clades corresponding to chlorophytes s.l. and rhodophytes s.l. The interrelationships among the chloroplasts are largely in agreement with the current understanding on chloroplast evolution.

  3. Hypothesis: Gene-rich plastid genomes in red algae may be an outcome of nuclear genome reduction.

    Science.gov (United States)

    Qiu, Huan; Lee, Jun Mo; Yoon, Hwan Su; Bhattacharya, Debashish

    2017-06-01

    Red algae (Rhodophyta) putatively diverged from the eukaryote tree of life >1.2 billion years ago and are the source of plastids in the ecologically important diatoms, haptophytes, and dinoflagellates. In general, red algae contain the largest plastid gene inventory among all such organelles derived from primary, secondary, or additional rounds of endosymbiosis. In contrast, their nuclear gene inventory is reduced when compared to their putative sister lineage, the Viridiplantae, and other photosynthetic lineages. The latter is thought to have resulted from a phase of genome reduction that occurred in the stem lineage of Rhodophyta. A recent comparative analysis of a taxonomically broad collection of red algal and Viridiplantae plastid genomes demonstrates that the red algal ancestor encoded ~1.5× more plastid genes than Viridiplantae. This difference is primarily explained by more extensive endosymbiotic gene transfer (EGT) in the stem lineage of Viridiplantae, when compared to red algae. We postulate that limited EGT in Rhodophytes resulted from the countervailing force of ancient, and likely recurrent, nuclear genome reduction. In other words, the propensity for nuclear gene loss led to the retention of red algal plastid genes that would otherwise have undergone intracellular gene transfer to the nucleus. This hypothesis recognizes the primacy of nuclear genome evolution over that of plastids, which have no inherent control of their gene inventory and can change dramatically (e.g., secondarily non-photosynthetic eukaryotes, dinoflagellates) in response to selection acting on the host lineage. © 2017 Phycological Society of America.

  4. Genomic Organization of Zebrafish microRNAs

    Directory of Open Access Journals (Sweden)

    Paydar Ima

    2008-05-01

    Background: microRNAs (miRNAs) are small (~22 nt) non-coding RNAs that regulate cell movement, specification, and development. Expression of miRNAs is highly regulated, both spatially and temporally. Based on direct cloning, sequence conservation, and predicted secondary structures, a large number of miRNAs have been identified in higher eukaryotic genomes but whether these RNAs are simply a subset of a much larger number of noncoding RNA families is unknown. This is especially true in zebrafish where genome sequencing and annotation is not yet complete. Results: We analyzed the zebrafish genome to identify the number and location of proven and predicted miRNAs resulting in the identification of 35 new miRNAs. We then grouped all 415 zebrafish miRNAs into families based on seed sequence identity as a means to identify possible functional redundancy. Based on genomic location and expression analysis, we also identified those miRNAs that are likely to be encoded as part of polycistronic transcripts. Lastly, as a resource, we compiled existing zebrafish miRNA expression data and, where possible, listed all experimentally proven mRNA targets. Conclusion: Current analysis indicates the zebrafish genome encodes 415 miRNAs which can be grouped into 44 families. The largest of these families (the miR-430 family) contains 72 members largely clustered in two main locations along chromosome 4. Thus far, most zebrafish miRNAs exhibit tissue specific patterns of expression.
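
    Grouping miRNAs into families by seed sequence identity, as the abstract above describes, might look like the following Python sketch. The seed window (nucleotides 2 to 8 of the mature sequence) is a common convention assumed here, not a detail stated in the abstract, and the miRNA names and sequences are invented placeholders rather than real zebrafish miRNAs.

```python
from collections import defaultdict
from typing import Dict, List


def seed(mature_seq: str, start: int = 2, end: int = 8) -> str:
    """Return the seed of a mature miRNA using 1-based, inclusive positions.

    The 2-8 window is an assumed convention; DNA-style input (T) is mapped to RNA (U).
    """
    rna = mature_seq.upper().replace("T", "U")
    return rna[start - 1:end]


def group_by_seed(mirnas: Dict[str, str]) -> Dict[str, List[str]]:
    """Map each seed sequence to the sorted names of the miRNAs that share it."""
    families: Dict[str, List[str]] = defaultdict(list)
    for name, sequence in mirnas.items():
        families[seed(sequence)].append(name)
    return {s: sorted(names) for s, names in families.items()}


# Hypothetical placeholder sequences, not real miRNAs.
toy_mirnas = {
    "toy-1": "AACCGGUUAAGGCCUUAACCGG",
    "toy-2": "GACCGGUUCCCCCCCCCCCCCC",
    "toy-3": "AUUUUUUUGGGGGGGGGGGGGG",
}
toy_families = group_by_seed(toy_mirnas)
```

    Here toy-1 and toy-2 differ at position 1 and outside the seed but share the same seed, so they fall into one family, while toy-3 forms its own.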

  5. Freedom and Responsibility in Synthetic Genomics: The Synthetic Yeast Project

    OpenAIRE

    Sliva, Anna; Yang, Huanming; Boeke, Jef D.; Mathews, Debra J. H.

    2015-01-01

    First introduced in 2011, the Synthetic Yeast Genome (Sc2.0) Project is a large international synthetic genomics project that will culminate in the first eukaryotic cell (Saccharomyces cerevisiae) with a fully synthetic genome. With collaborators from across the globe and from a range of institutions spanning from do-it-yourself biology (DIYbio) to commercial enterprises, it is important that all scientists working on this project are cognizant of the ethical and policy issues associated with...

  6. Efficient fdCas9 Synthetic Endonuclease with Improved Specificity for Precise Genome Engineering

    KAUST Repository

    Aouida, Mustapha

    2015-07-30

    The Cas9 endonuclease is used for genome editing applications in diverse eukaryotic species. A high frequency of off-target activity has been reported in many cell types, limiting its applications to genome engineering, especially in genomic medicine. Here, we generated a synthetic chimeric protein between the catalytic domain of the FokI endonuclease and the catalytically inactive Cas9 protein (fdCas9). A pair of guide RNAs (gRNAs) that bind to sense and antisense strands with a defined spacer sequence range can be used to form a catalytically active dimeric fdCas9 protein and generate double-strand breaks (DSBs) within the spacer sequence. Our data demonstrate an improved catalytic activity of the fdCas9 endonuclease, with a spacer range of 15–39 nucleotides, on surrogate reporters and genomic targets. Furthermore, we observed no detectable fdCas9 activity at known Cas9 off-target sites. Taken together, our data suggest that the fdCas9 endonuclease variant is a superior platform for genome editing applications in eukaryotic systems including mammalian cells.
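
    As a rough illustration of the paired-guide constraint mentioned above, the sketch below checks whether two gRNA binding sites leave a spacer within the reported 15–39 nucleotide range. The coordinate convention (0-based, half-open intervals on one reference strand) and the function names are assumptions; PAM placement and strand orientation are deliberately ignored.

```python
def spacer_length(left_site, right_site):
    """Number of nucleotides between two binding sites.

    Each site is a (start, end) pair, 0-based and half-open, on the same
    reference coordinates (an assumed convention for this sketch).
    """
    _, left_end = left_site
    right_start, _ = right_site
    if right_start < left_end:
        raise ValueError("sites overlap or are given out of order")
    return right_start - left_end


def is_valid_pair(left_site, right_site, min_spacer=15, max_spacer=39):
    """True if the spacer falls inside the range reported in the abstract."""
    return min_spacer <= spacer_length(left_site, right_site) <= max_spacer


ok = is_valid_pair((100, 120), (140, 160))         # spacer of 20 nt
too_short = is_valid_pair((100, 120), (125, 145))  # spacer of 5 nt
too_long = is_valid_pair((100, 120), (160, 180))   # spacer of 40 nt
```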

  7. Efficient fdCas9 Synthetic Endonuclease with Improved Specificity for Precise Genome Engineering

    KAUST Repository

    Aouida, Mustapha; Eid, Ayman; Ali, Zahir; Cradick, Thomas; Lee, Ciaran; Deshmukh, Harshavardhan; Atef, Ahmed; Abu Samra, Dina Bashir Kamil; Gadhoum, Samah Zeineb; Merzaban, Jasmeen; Bao, Gang; Mahfouz, Magdy M.

    2015-01-01

    The Cas9 endonuclease is used for genome editing applications in diverse eukaryotic species. A high frequency of off-target activity has been reported in many cell types, limiting its applications to genome engineering, especially in genomic medicine. Here, we generated a synthetic chimeric protein between the catalytic domain of the FokI endonuclease and the catalytically inactive Cas9 protein (fdCas9). A pair of guide RNAs (gRNAs) that bind to sense and antisense strands with a defined spacer sequence range can be used to form a catalytically active dimeric fdCas9 protein and generate double-strand breaks (DSBs) within the spacer sequence. Our data demonstrate an improved catalytic activity of the fdCas9 endonuclease, with a spacer range of 15–39 nucleotides, on surrogate reporters and genomic targets. Furthermore, we observed no detectable fdCas9 activity at known Cas9 off-target sites. Taken together, our data suggest that the fdCas9 endonuclease variant is a superior platform for genome editing applications in eukaryotic systems including mammalian cells.

  8. From NGS assembly challenges to instability of fungal mitochondrial genomes: A case study in genome complexity.

    Science.gov (United States)

    Misas, Elizabeth; Muñoz, José Fernando; Gallo, Juan Esteban; McEwen, Juan Guillermo; Clay, Oliver Keatinge

    2016-04-01

    The presence of repetitive or non-unique DNA persisting over sizable regions of a eukaryotic genome can hinder the genome's successful de novo assembly from short reads: ambiguities in assigning genome locations to the non-unique subsequences can result in premature termination of contigs and thus overfragmented assemblies. Fungal mitochondrial (mtDNA) genomes are compact (typically less than 100 kb), yet often contain short non-unique sequences that can be shown to impede their successful de novo assembly in silico. Such repeats can also confuse processes in the cell in vivo. A well-studied example is ectopic (out-of-register, illegitimate) recombination associated with repeat pairs, which can lead to deletion of functionally important genes that are located between the repeats. Repeats that remain conserved over micro- or macroevolutionary timescales despite such risks may indicate functionally or structurally (e.g., for replication) important regions. This principle could form the basis of a mining strategy for accelerating discovery of function in genome sequences. We present here our screening of a sample of 11 fully sequenced fungal mitochondrial genomes by observing where exact k-mer repeats occurred several times; initial analyses motivated us to focus on 17-mers occurring more than three times. Based on the diverse repeats we observe, we propose that such screening may serve as an efficient expedient for gaining a rapid but representative first insight into the repeat landscapes of sparsely characterized mitochondrial chromosomes. Our matching of the flagged repeats to previously reported regions of interest supports the idea that systems of persisting, non-trivial repeats in genomes can often highlight features meriting further attention. Copyright © 2016 Elsevier Ltd. All rights reserved.
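
    The screening idea described above (flagging exact 17-mers that occur more than three times) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it treats the sequence as linear, counts overlapping occurrences on one strand only, and ignores reverse complements; the function name and parameters are assumptions.

```python
from collections import Counter


def repeated_kmers(seq, k=17, min_count=4):
    """Return exact k-mers occurring at least min_count times in seq.

    min_count=4 corresponds to "more than three times". Overlapping
    occurrences are counted; only the given strand is scanned.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return {kmer: n for kmer, n in counts.items() if n >= min_count}


# Toy example with a short k so the result is easy to follow.
toy_hits = repeated_kmers("ACGT" * 5 + "TTTT", k=4, min_count=4)
```

    For genomes of the size discussed here (typically under 100 kb), an in-memory counter of this kind is sufficient; larger inputs would call for a dedicated k-mer counting tool.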

  9. Complete Mitochondrial Genome of the Medicinal Mushroom Ganoderma lucidum

    Science.gov (United States)

    Chen, Haimei; Chen, Xiangdong; Lan, Jin; Liu, Chang

    2013-01-01

    Ganoderma lucidum is one of the well-known medicinal basidiomycetes worldwide. The mitochondrion, referred to as the second genome, is an organelle found in most eukaryotic cells and participates in critical cellular functions. Elucidating the structure and function of this genome is important to understand completely the genetic contents of G. lucidum. In this study, we assembled the mitochondrial genome of G. lucidum and analyzed the differential expressions of its encoded genes across three developmental stages. The mitochondrial genome is a typical circular DNA molecule of 60,630 bp with a GC content of 26.67%. Genome annotation identified genes that encode 15 conserved proteins, 27 tRNAs, small and large rRNAs, four homing endonucleases, and two hypothetical proteins. Except for genes encoding trnW and two hypothetical proteins, all genes were located on the positive strand. For the repeat structure analysis, eight forward, two inverted, and three tandem repeats were detected. A pair of fragments with a total length around 5.5 kb was found in both the nuclear and mitochondrial genomes, which suggests the possible transfer of DNA sequences between two genomes. RNA-Seq data for samples derived from three stages, namely, mycelia, primordia, and fruiting bodies, were mapped to the mitochondrial genome and qualified. The protein-coding genes were expressed higher in mycelia or primordial stages compared with those in the fruiting bodies. The rRNA abundances were significantly higher in all three stages. Two regions were transcribed but did not contain any identified protein or tRNA genes. Furthermore, three RNA-editing sites were detected. Genome synteny analysis showed that significant genome rearrangements occurred in the mitochondrial genomes. This study provides valuable information on the gene contents of the mitochondrial genome and their differential expressions at various developmental stages of G. lucidum. 
    The results contribute to the understanding of the...
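
    The GC content reported in this abstract is a simple composition statistic. A minimal sketch of how such a percentage is typically computed is given below; skipping ambiguous bases such as N is an assumption of this sketch, not something stated in the abstract.

```python
def gc_content(seq):
    """Percentage of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = sum(1 for base in seq if base in "GC")
    acgt = sum(1 for base in seq if base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * gc / acgt


example = gc_content("GGCCAATT")
```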

  10. Evolution of endogenous non-retroviral genes integrated into plant genomes

    Directory of Open Access Journals (Sweden)

    Hyosub Chu

    2014-08-01

    Numerous comparative genome analyses have revealed the wide extent of horizontal gene transfer (HGT) in living organisms, which contributes to their evolution and genetic diversity. Viruses play important roles in HGT. Endogenous viral elements (EVEs) are defined as viral DNA sequences present within the genomes of non-viral organisms. In eukaryotic cells, the majority of EVEs are derived from RNA viruses using reverse transcription. In contrast, endogenous non-retroviral elements (ENREs) are poorly studied. However, the increasing availability of genomic data and the rapid development of bioinformatics tools have enabled the identification of several ENREs in various eukaryotic organisms. To date, a small number of ENREs integrated into plant genomes have been identified. Of the known non-retroviruses, most identified ENREs are derived from double-strand (ds) RNA viruses, followed by single-strand (ss) DNA and ssRNA viruses. At least eight virus families have been identified. Of these, viruses in the family Partitiviridae are dominant, followed by viruses of the families Chrysoviridae and Geminiviridae. The identified ENREs have been primarily identified in eudicots, followed by monocots. In this review, we briefly discuss the current view on non-retroviral sequences integrated into plant genomes that are associated with plant-virus evolution and their possible roles in antiviral resistance.

  11. Genomics of Volvocine Algae

    Science.gov (United States)

    Umen, James G.; Olson, Bradley J.S.C.

    2015-01-01

    Volvocine algae are a group of chlorophytes that together comprise a unique model for evolutionary and developmental biology. The species Chlamydomonas reinhardtii and Volvox carteri represent extremes in morphological diversity within the Volvocine clade. Chlamydomonas is unicellular and reflects the ancestral state of the group, while Volvox is multicellular and has evolved numerous innovations including germ-soma differentiation, sexual dimorphism, and complex morphogenetic patterning. The Chlamydomonas genome sequence has shed light on several areas of eukaryotic cell biology, metabolism and evolution, while the Volvox genome sequence has enabled a comparison with Chlamydomonas that reveals some of the underlying changes that enabled its transition to multicellularity, but also underscores the subtlety of this transition. Many of the tools and resources are in place to further develop Volvocine algae as a model for evolutionary genomics. PMID:25883411

  12. Heterologous Expression of Toxins from Bacterial Toxin-Antitoxin Systems in Eukaryotic Cells: Strategies and Applications

    Science.gov (United States)

    Yeo, Chew Chieng; Abu Bakar, Fauziah; Chan, Wai Ting; Espinosa, Manuel; Harikrishna, Jennifer Ann

    2016-01-01

    Toxin-antitoxin (TA) systems are found in nearly all prokaryotic genomes and usually consist of a pair of co-transcribed genes, one of which encodes a stable toxin and the other, its cognate labile antitoxin. Certain environmental and physiological cues trigger the degradation of the antitoxin, causing activation of the toxin, leading either to the death or stasis of the host cell. TA systems have a variety of functions in the bacterial cell, including acting as mediators of programmed cell death, the induction of a dormant state known as persistence and the stable maintenance of plasmids and other mobile genetic elements. Some bacterial TA systems are functional when expressed in eukaryotic cells and this has led to several innovative applications, which are the subject of this review. Here, we look at how bacterial TA systems have been utilized for the genetic manipulation of yeasts and other eukaryotes, for the containment of genetically modified organisms, and for the engineering of high expression eukaryotic cell lines. We also examine how TA systems have been adopted as an important tool in developmental biology research for the ablation of specific cells and the potential for utility of TA systems in antiviral and anticancer gene therapies. PMID:26907343

  13. Heterologous Expression of Toxins from Bacterial Toxin-Antitoxin Systems in Eukaryotic Cells: Strategies and Applications

    Directory of Open Access Journals (Sweden)

    Chew Chieng Yeo

    2016-02-01

    Toxin-antitoxin (TA) systems are found in nearly all prokaryotic genomes and usually consist of a pair of co-transcribed genes, one of which encodes a stable toxin and the other, its cognate labile antitoxin. Certain environmental and physiological cues trigger the degradation of the antitoxin, causing activation of the toxin, leading either to the death or stasis of the host cell. TA systems have a variety of functions in the bacterial cell, including acting as mediators of programmed cell death, the induction of a dormant state known as persistence and the stable maintenance of plasmids and other mobile genetic elements. Some bacterial TA systems are functional when expressed in eukaryotic cells and this has led to several innovative applications, which are the subject of this review. Here, we look at how bacterial TA systems have been utilized for the genetic manipulation of yeasts and other eukaryotes, for the containment of genetically modified organisms, and for the engineering of high expression eukaryotic cell lines. We also examine how TA systems have been adopted as an important tool in developmental biology research for the ablation of specific cells and the potential for utility of TA systems in antiviral and anticancer gene therapies.

  14. Using physicochemical and compositional characteristics of DNA sequence for prediction of genomic signals

    KAUST Repository

    Mulamba, Pierre Abraham

    2014-12-01

    Finding genes in eukaryotic organisms using computational methods is an ongoing problem in biology. Based on various genomic signals found in eukaryotic genomes, this problem can be divided into many different sub-problems such as identification of transcription start sites, translation initiation sites, splice sites, poly(A) signals, etc. Each sub-problem deals with a particular type of genomic signal and various computational methods are used to solve each sub-problem. Aggregating information from all these individual sub-problems can lead to a complete annotation of a gene and its component signals. The fundamental principle of most of these computational methods is the mapping principle: building an input-output model for the prediction of a particular genomic signal based on a set of known input signals and their corresponding output signal. The type of input signals used to build the model is an essential element in most of these computational methods. The common factor of most of these methods is that they are mainly based on the statistical analysis of the basic nucleotide sequence string composition. Our study is based on a novel approach to predict genomic signals in which uniquely generated structural profiles that combine compressed physicochemical properties with topological and compositional properties of DNA sequences are used to develop machine learning predictive models. The compression of the physicochemical properties is made using principal component analysis transformation. Our ideas are evaluated through prediction models of canonical splice sites using support vector machine models. We demonstrate across several species that the proposed methodology has resulted in the most accurate splice site predictors that are publicly available or described. We believe that the approach in this study is quite general and has various applications in other biological modeling problems.

  15. Are maternal mitochondria the selfish entities that are masters of the cells of eukaryotic multicellular organisms?

    Science.gov (United States)

    Barlow, Peter W; Baldelli, E; Baluška, Frantisek

    2009-01-01

    The Energide concept, as well as the endosymbiotic theory of eukaryotic cell organization and evolution, proposes that present-day cells of eukaryotic organisms are mosaics of specialized and cooperating units, or organelles. Some of these units were originally free-living prokaryotes, which were engulfed during evolutionary time. Mitochondria represent one of these types of previously independent organisms; the Energide is another type. This new perspective on the organization of the cell has been further expanded to reveal the concept of a public milieu, the cytosol, in which Energides and mitochondria live, each with their own private internal milieu. The present paper discusses how the endosymbiotic theory implicates a new hypothesis about the hierarchical and communicational organization of the integrated prokaryotic components of the eukaryotic cell and provides a new angle from which to consider the theory of evolution and its bearing upon cellular complexity. Thus, it is proposed that the “selfish gene” hypothesis of Dawkins is not the only possible perspective for comprehending genomic and cellular evolution. Our proposal is that maternal mitochondria are the selfish “master” entities of the eukaryotic cell with respect not only to their propagation from cell-to-cell and from generation-to-generation but also to their regulation of all other cellular functions. However, it should be recognized that the concept of “master” and “servant” cell components is a metaphor; in present-day living organisms their organellar components are considered to be interdependent and inseparable. PMID:19513277

  16. Structural insights and ab initio sequencing within the DING proteins family

    International Nuclear Information System (INIS)

    Elias, Mikael; Liebschner, Dorothee; Gotthard, Guillaume; Chabriere, Eric

    2011-01-01

    DING proteins constitute a recently discovered protein family that is ubiquitous in eukaryotes. The structural insights and the physiological involvements of these intriguing proteins are hereby deciphered. DING proteins constitute an intriguing family of phosphate-binding proteins that was identified in a wide range of organisms, from prokaryotes and archaea to eukaryotes. Despite their seemingly ubiquitous occurrence in eukaryotes, their encoding genes are missing from sequenced genomes. Such a lack has considerably hampered functional studies. In humans, these proteins have been related to several diseases, like atherosclerosis, kidney stones, inflammation processes and HIV inhibition. The human phosphate binding protein is a human representative of the DING family that was serendipitously discovered from human plasma. An original approach was developed to determine ab initio the complete and exact sequence of this 38 kDa protein by utilizing mass spectrometry and X-ray data in tandem. Taking advantage of this first complete eukaryotic DING sequence, an immunohistochemistry study was undertaken to check the presence of DING proteins in various mouse tissues, revealing that these proteins are widely expressed. Finally, the structure of a bacterial representative from Pseudomonas fluorescens was solved at sub-angstrom resolution, allowing the molecular mechanism of the phosphate binding in these high-affinity proteins to be elucidated.

  17. Structural insights and ab initio sequencing within the DING proteins family

    Energy Technology Data Exchange (ETDEWEB)

    Elias, Mikael, E-mail: mikael.elias@weizmann.ac.il [Weizmann Institute of Science, Rehovot (Israel); Liebschner, Dorothee [CRM2, Nancy Université (France); Gotthard, Guillaume; Chabriere, Eric [AFMB, Université Aix-Marseille II (France)

    2011-01-01

    DING proteins constitute a recently discovered protein family that is ubiquitous in eukaryotes. The structural insights and the physiological involvements of these intriguing proteins are hereby deciphered. DING proteins constitute an intriguing family of phosphate-binding proteins that was identified in a wide range of organisms, from prokaryotes and archaea to eukaryotes. Despite their seemingly ubiquitous occurrence in eukaryotes, their encoding genes are missing from sequenced genomes. Such a lack has considerably hampered functional studies. In humans, these proteins have been related to several diseases, like atherosclerosis, kidney stones, inflammation processes and HIV inhibition. The human phosphate binding protein is a human representative of the DING family that was serendipitously discovered from human plasma. An original approach was developed to determine ab initio the complete and exact sequence of this 38 kDa protein by utilizing mass spectrometry and X-ray data in tandem. Taking advantage of this first complete eukaryotic DING sequence, an immunohistochemistry study was undertaken to check the presence of DING proteins in various mouse tissues, revealing that these proteins are widely expressed. Finally, the structure of a bacterial representative from Pseudomonas fluorescens was solved at sub-angstrom resolution, allowing the molecular mechanism of the phosphate binding in these high-affinity proteins to be elucidated.

  18. Multiple recent horizontal transfers of a large genomic region in cheese making fungi.

    Science.gov (United States)

    Cheeseman, Kevin; Ropars, Jeanne; Renault, Pierre; Dupont, Joëlle; Gouzy, Jérôme; Branca, Antoine; Abraham, Anne-Laure; Ceppi, Maurizio; Conseiller, Emmanuel; Debuchy, Robert; Malagnac, Fabienne; Goarin, Anne; Silar, Philippe; Lacoste, Sandrine; Sallet, Erika; Bensimon, Aaron; Giraud, Tatiana; Brygoo, Yves

    2014-01-01

    While the extent and impact of horizontal transfers in prokaryotes are widely acknowledged, their importance to the eukaryotic kingdom is unclear and thought by many to be anecdotal. Here we report multiple recent transfers of a huge genomic island between Penicillium spp. found in the food environment. Sequencing of the two leading filamentous fungi used in cheese making, P. roqueforti and P. camemberti, and comparison with the penicillin producer P. rubens reveals a 575 kb long genomic island in P. roqueforti, called Wallaby, present as identical fragments at non-homologous loci in P. camemberti and P. rubens. Wallaby is detected in Penicillium collections exclusively in strains from food environments. Wallaby encompasses about 250 predicted genes, some of which are probably involved in competition with microorganisms. The occurrence of multiple recent eukaryotic transfers in the food environment provides strong evidence for the importance of this understudied and probably underestimated phenomenon in eukaryotes.

  19. Diversity patterns of microbial eukaryotes mirror those of bacteria in Antarctic cryoconite holes.

    Science.gov (United States)

    Sommers, Pacifica; Darcy, John L; Gendron, Eli M S; Stanish, Lee F; Bagshaw, Elizabeth A; Porazinska, Dorota L; Schmidt, Steven K

    2018-01-01

    Ice-lidded cryoconite holes on glaciers in the Taylor Valley, Antarctica, provide a unique system of natural mesocosms for studying community structure and assembly. We used high-throughput DNA sequencing to characterize both microbial eukaryotic communities and bacterial communities within cryoconite holes across three glaciers to study similarities in their spatial patterns. We expected that the alpha (phylogenetic diversity) and beta (pairwise community dissimilarity) diversity patterns of eukaryotes in cryoconite holes would be related to those of bacteria, and that they would be related to the biogeochemical gradient within the Taylor Valley. We found that eukaryotic alpha and beta diversity were strongly related to those of bacteria across scales ranging from 140 m to 41 km apart. Alpha diversity of both was significantly related to position in the valley and surface area of the cryoconite hole, with pH also significantly correlated with the eukaryotic diversity. Beta diversity for both bacteria and eukaryotes was significantly related to position in the valley, with bacterial beta diversity also related to nitrate. These results are consistent with transport of sediments onto glaciers occurring primarily at local scales relative to the size of the valley, thus creating feedbacks in local chemistry and diversity. © FEMS 2017. All rights reserved.
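
    Beta diversity is described above as pairwise community dissimilarity. One widely used measure of this kind is Bray-Curtis dissimilarity, sketched below; the abstract does not say which metric the authors used, so this choice, the function name, and the toy counts are assumptions for illustration.

```python
def bray_curtis(sample_a, sample_b):
    """Bray-Curtis dissimilarity between two samples.

    Each sample is a dict mapping taxon name to abundance (count).
    Returns 0.0 for identical communities and 1.0 for communities
    that share no taxa.
    """
    taxa = set(sample_a) | set(sample_b)
    diff = sum(abs(sample_a.get(t, 0) - sample_b.get(t, 0)) for t in taxa)
    total = sum(sample_a.get(t, 0) + sample_b.get(t, 0) for t in taxa)
    if total == 0:
        raise ValueError("both samples are empty")
    return diff / total


# Hypothetical abundance tables for two cryoconite holes.
hole_1 = {"taxon_1": 6, "taxon_2": 4}
hole_2 = {"taxon_1": 4, "taxon_2": 6}
d = bray_curtis(hole_1, hole_2)
```

    Computing this value for every pair of holes gives a dissimilarity matrix, which can then be compared with geographic distance or chemistry.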

  20. AGORA : Organellar genome annotation from the amino acid and nucleotide references.

    Science.gov (United States)

    Jung, Jaehee; Kim, Jong Im; Jeong, Young-Sik; Yi, Gangman

    2018-03-29

    Next-generation sequencing (NGS) technologies have led to the accumulation of high-throughput sequence data from various organisms in biology. To apply gene annotation of organellar genomes for various organisms, more optimized tools for functional gene annotation are required. Almost all gene annotation tools are mainly focused on the chloroplast genome of land plants or the mitochondrial genome of animals. We have developed a web application, AGORA, for the fast, user-friendly, and improved annotation of organellar genomes. AGORA annotates genes based on a BLAST-based homology search and clustering with selected reference sequences from the NCBI database or user-defined uploaded data. AGORA can annotate the functional genes in almost all mitochondrion and plastid genomes of eukaryotes. The gene annotation of a genome with an exon-intron structure within a gene or inverted repeat region is also available. It provides the start and end positions of each gene, BLAST results compared with the reference sequence, and visualization of the gene map by OGDRAW. Users can freely use the software, and the accessible URL is https://bigdata.dongguk.edu/gene_project/AGORA/. The main module of the tool is implemented in Python and PHP, and the web page is built with HTML and CSS to support all browsers.

  1. Experimental-confirmation and functional-annotation of predicted proteins in the chicken genome

    Directory of Open Access Journals (Sweden)

    McCarthy Fiona M

    2007-11-01

    Background: The chicken genome was sequenced because of its phylogenetic position as a non-mammalian vertebrate, its use as a biomedical model especially to study embryology and development, its role as a source of human disease organisms and its importance as the major source of animal derived food protein. However, genomic sequence data is, in itself, of limited value; generally it is not equivalent to understanding biological function. The benefit of having a genome sequence is that it provides a basis for functional genomics. However, the sequence data currently available is poorly structurally and functionally annotated and many genes do not have standard nomenclature assigned. Results: We analysed eight chicken tissues and improved the chicken genome structural annotation by providing experimental support for the in vivo expression of 7,809 computationally predicted proteins, including 30 chicken proteins that were only electronically predicted or hypothetical translations in human. To improve functional annotation (based on Gene Ontology), we mapped these identified proteins to their human and mouse orthologs and used this orthology to transfer Gene Ontology (GO) functional annotations to the chicken proteins. The 8,213 orthology-based GO annotations that we produced represent an 8% increase in currently available chicken GO annotations. Orthologous chicken products were also assigned standardized nomenclature based on current chicken nomenclature guidelines. Conclusion: We demonstrate the utility of high-throughput expression proteomics for rapid experimental structural annotation of a newly sequenced eukaryote genome. These experimentally-supported predicted proteins were further annotated by assigning the proteins with standardized nomenclature and functional annotation. This method is widely applicable to a diverse range of species. Moreover, information from one genome can be used to improve the annotation of other genomes and

  2. Six subgroups and extensive recent duplications characterize the evolution of the eukaryotic tubulin protein family.

    Science.gov (United States)

    Findeisen, Peggy; Mühlhausen, Stefanie; Dempewolf, Silke; Hertzog, Jonny; Zietlow, Alexander; Carlomagno, Teresa; Kollmar, Martin

    2014-08-27

    Tubulins belong to the most abundant proteins in eukaryotes providing the backbone for many cellular substructures like the mitotic and meiotic spindles, the intracellular cytoskeletal network, and the axonemes of cilia and flagella. Homologs have even been reported for archaea and bacteria. However, a taxonomically broad and whole-genome-based analysis of the tubulin protein family has never been performed, and thus, the number of subfamilies, their taxonomic distribution, and the exact grouping of the supposed archaeal and bacterial homologs are unknown. Here, we present the analysis of 3,524 tubulins from 504 species. The tubulins formed six major subfamilies, α to ζ. Species of all major kingdoms of the eukaryotes encode members of these subfamilies implying that they must have already been present in the last common eukaryotic ancestor. The proposed archaeal homologs grouped together with the bacterial TubZ proteins as sister clade to the FtsZ proteins indicating that tubulins are unique to eukaryotes. Most species contained α- and/or β-tubulin gene duplicates resulting from recent branch- and species-specific duplication events. This shows that tubulins cannot be used for constructing species phylogenies without resolving their ortholog-paralog relationships. The many gene duplicates and also the independent loss of the δ-, ε-, or ζ-tubulins, which have been shown to be part of the triplet microtubules in basal bodies, suggest that tubulins can functionally substitute each other. © The Author(s) 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  3. Genomes, Phylogeny, and Evolutionary Systems Biology

    Energy Technology Data Exchange (ETDEWEB)

    Medina, Monica

    2005-03-25

    With the completion of the human genome and the growing number of diverse genomes being sequenced, a new age of evolutionary research is currently taking shape. The myriad of technological breakthroughs in biology that are leading to the unification of broad scientific fields such as molecular biology, biochemistry, physics, mathematics and computer science are now known as systems biology. Here I present an overview, with an emphasis on eukaryotes, of how the postgenomics era is adopting comparative approaches that go beyond comparisons among model organisms to shape the nascent field of evolutionary systems biology.

  4. Eukaryotic RNA polymerase subunit RPB8 is a new relative of the OB family.

    Science.gov (United States)

    Krapp, S; Kelly, G; Reischl, J; Weinzierl, R O; Matthews, S

    1998-02-01

    RNA polymerase II subunit RPB8 is an essential subunit that is highly conserved throughout eukaryotic evolution and is present in all three types of nuclear RNA polymerases. We report the first high resolution structural insight into eukaryotic RNA polymerase architecture with the solution structure of RPB8 from Saccharomyces cerevisiae. It consists of an eight stranded, antiparallel beta-barrel, four short helical regions and a large, unstructured omega-loop. The strands are connected in classic Greek-key fashion. The overall topology is unusual and contains a striking C2 rotational symmetry. Furthermore, it is most likely a novel associate of the oligonucleotide/oligosaccharide (OB) binding protein class.

  5. Contributions to In Silico Genome Annotation

    KAUST Repository

    Kalkatawi, Manal M.

    2017-11-30

    Genome annotation is an important topic since it provides information for the foundation of downstream genomic and biological research. It is considered as a way of summarizing part of existing knowledge about the genomic characteristics of an organism. Annotating different regions of a genome sequence is known as structural annotation, while identifying functions of these regions is considered as a functional annotation. In silico approaches can facilitate both tasks that otherwise would be difficult and time-consuming. This study contributes to genome annotation by introducing several novel bioinformatics methods, some based on machine learning (ML) approaches. First, we present Dragon PolyA Spotter (DPS), a method for accurate identification of the polyadenylation signals (PAS) within human genomic DNA sequences. For this, we derived a novel feature-set able to characterize properties of the genomic region surrounding the PAS, enabling development of high accuracy optimized ML predictive models. DPS considerably outperformed the state-of-the-art results. The second contribution concerns developing generic models for structural annotation, i.e., the recognition of different genomic signals and regions (GSR) within eukaryotic DNA. We developed DeepGSR, a systematic framework that facilitates generating ML models to predict GSR with high accuracy. To the best of our knowledge, no available generic and automated method exists for such a task that could facilitate the studies of newly sequenced organisms. The prediction module of DeepGSR uses deep learning algorithms to derive highly abstract features that depend mainly on proper data representation and hyperparameters calibration. DeepGSR, which was evaluated on recognition of PAS and translation initiation sites (TIS) in different organisms, yields a simpler and more precise representation of the problem under study, compared to some other hand-tailored models, while producing high accuracy prediction results.
Finally

  6. The N-terminal region of eukaryotic translation initiation factor 5A signals to nuclear localization of the protein

    International Nuclear Information System (INIS)

    Parreiras-e-Silva, Lucas T.; Gomes, Marcelo D.; Oliveira, Eduardo B.; Costa-Neto, Claudio M.

    2007-01-01

    The eukaryotic translation initiation factor 5A (eIF5A) is a ubiquitous protein of eukaryotic and archaeal organisms which undergoes hypusination, a unique post-translational modification. We have generated a polyclonal antibody against murine eIF5A, which in immunocytochemical assays in B16-F10 cells revealed that the endogenous protein is preferentially localized to the nuclear region. We therefore analyzed possible structural features present in eIF5A proteins that could be responsible for that characteristic. Multiple sequence alignment analysis of eIF5A proteins from different eukaryotic and archaeal organisms showed that the former sequences have an extended N-terminal segment. We then performed in silico prediction analyses and constructed different truncated forms of murine eIF5A to verify any possible role that the N-terminal extension might have in determining the subcellular localization of eIF5A in eukaryotic organisms. Our results indicate that the N-terminal extension of eukaryotic eIF5A contributes to signaling this protein to nuclear localization, despite bearing no structural similarity to classical nuclear localization signals.

  7. Metaxa: a software tool for automated detection and discrimination among ribosomal small subunit (12S/16S/18S) sequences of archaea, bacteria, eukaryotes, mitochondria, and chloroplasts in metagenomes and environmental sequencing datasets.

    Science.gov (United States)

    Bengtsson, Johan; Eriksson, K Martin; Hartmann, Martin; Wang, Zheng; Shenoy, Belle Damodara; Grelet, Gwen-Aëlle; Abarenkov, Kessy; Petri, Anna; Rosenblad, Magnus Alm; Nilsson, R Henrik

    2011-10-01

    The ribosomal small subunit (SSU) rRNA gene has emerged as an important genetic marker for taxonomic identification in environmental sequencing datasets. In addition to being present in the nucleus of eukaryotes and the core genome of prokaryotes, the gene is also found in the mitochondria of eukaryotes and in the chloroplasts of photosynthetic eukaryotes. These three sets of genes are conceptually paralogous and should in most situations not be aligned and analyzed jointly. To identify the origin of SSU sequences in complex sequence datasets has hitherto been a time-consuming and largely manual undertaking. However, the present study introduces Metaxa ( http://microbiology.se/software/metaxa/ ), an automated software tool to extract full-length and partial SSU sequences from larger sequence datasets and assign them to an archaeal, bacterial, nuclear eukaryote, mitochondrial, or chloroplast origin. Using data from reference databases and from full-length organelle and organism genomes, we show that Metaxa detects and scores SSU sequences for origin with very low proportions of false positives and negatives. We believe that this tool will be useful in microbial and evolutionary ecology as well as in metagenomics.

  8. Three distinct modes of intron dynamics in the evolution of eukaryotes.

    Science.gov (United States)

    Carmel, Liran; Wolf, Yuri I; Rogozin, Igor B; Koonin, Eugene V

    2007-07-01

    Several contrasting scenarios have been proposed for the origin and evolution of spliceosomal introns, a hallmark of eukaryotic genes. A comprehensive probabilistic model to obtain a definitive reconstruction of intron evolution was developed and applied to 391 sets of conserved genes from 19 eukaryotic species. It is inferred that a relatively high intron density was reached early, i.e., the last common ancestor of eukaryotes contained >2.15 introns/kilobase, and the last common ancestor of multicellular life forms harbored approximately 3.4 introns/kilobase, a greater intron density than in most of the extant fungi and in some animals. The rates of intron gain and intron loss appear to have been dropping during the last approximately 1.3 billion years, with the decline in the gain rate being much steeper. Eukaryotic lineages exhibit three distinct modes of evolution of the intron-exon structure. The primary, balanced mode, apparently, operates in all lineages. In this mode, intron gain and loss are strongly and positively correlated, in contrast to previous reports on inverse correlation between these processes. The second mode involves an elevated rate of intron loss and is prevalent in several lineages, such as fungi and insects. The third mode, characterized by elevated rate of intron gain, is seen only in deep branches of the tree, indicating that bursts of intron invasion occurred at key points in eukaryotic evolution, such as the origin of animals. Intron dynamics could depend on multiple mechanisms, and in the balanced mode, gain and loss of introns might share common mechanistic features.

  9. Complete genome sequence of the myxobacterium Sorangium cellulosum

    DEFF Research Database (Denmark)

    Schneiker, S; Perlova, O; Kaiser, O

    2007-01-01

    The genus Sorangium synthesizes approximately half of the secondary metabolites isolated from myxobacteria, including the anti-cancer metabolite epothilone. We report the complete genome sequence of the model Sorangium strain S. cellulosum Soce56, which produces several natural products and has...... morphological and physiological properties typical of the genus. The circular genome, comprising 13,033,779 base pairs, is the largest bacterial genome sequenced to date. No global synteny with the genome of Myxococcus xanthus is apparent, revealing an unanticipated level of divergence between...... these myxobacteria. A large percentage of the genome is devoted to regulation, particularly post-translational phosphorylation, which probably supports the strain's complex, social lifestyle. This regulatory network includes the highest number of eukaryotic protein kinase-like kinases discovered in any organism...

  10. Morphological and ecological complexity in early eukaryotic ecosystems.

    Science.gov (United States)

    Javaux, E J; Knoll, A H; Walter, M R

    2001-07-05

    Molecular phylogeny and biogeochemistry indicate that eukaryotes differentiated early in Earth history. Sequence comparisons of small-subunit ribosomal RNA genes suggest a deep evolutionary divergence of Eukarya and Archaea; C27-C29 steranes (derived from sterols synthesized by eukaryotes) and strong depletion of 13C (a biogeochemical signature of methanogenic Archaea) in 2,700 Myr old kerogens independently place a minimum age on this split. Steranes, large spheroidal microfossils, and rare macrofossils of possible eukaryotic origin occur in Palaeoproterozoic rocks. Until now, however, evidence for morphological and taxonomic diversification within the domain has generally been restricted to very late Mesoproterozoic and Neoproterozoic successions. Here we show that the cytoskeletal and ecological prerequisites for eukaryotic diversification were already established in eukaryotic microorganisms fossilized nearly 1,500 Myr ago in shales of the early Mesoproterozoic Roper Group in northern Australia.

  11. Microsatellites in the Eukaryotic DNA Mismatch Repair Genes as Modulators of Evolutionary Mutation Rate

    Science.gov (United States)

    Chang, Dong Kyung; Metzgar, David; Wills, Christopher; Boland, C. Richard

    2003-01-01

    All "minor" components of the human DNA mismatch repair (MMR) system (MSH3, MSH6, PMS2, and the recently discovered MLH3) contain mononucleotide microsatellites in their coding sequences. This intriguing finding contrasts with the situation found in the major components of the DNA MMR system (MSH2 and MLH1) and, in fact, in most human genes. Although eukaryotic genomes are rich in microsatellites, non-triplet microsatellites are rare in coding regions. The recurring presence of exonic mononucleotide repeat sequences within a single family of human genes would therefore be considered exceptional.

  12. Comparing the Dictyostelium and Entamoeba genomes reveals an ancient split in the Conosa lineage.

    Directory of Open Access Journals (Sweden)

    Jie Song

    2005-12-01

    Full Text Available The Amoebozoa are a sister clade to the fungi and the animals, but are poorly sampled for completely sequenced genomes. The social amoeba Dictyostelium discoideum and amitochondriate pathogen Entamoeba histolytica are the first Amoebozoa with genomes completely sequenced. Both organisms are classified under the Conosa subphylum. To identify Amoebozoa-specific genomic elements, we compared these two genomes to each other and to other eukaryotic genomes. An expanded phylogenetic tree built from the complete predicted proteomes of 23 eukaryotes places the two amoebae in the same lineage, although the divergence is estimated to be greater than that between animals and fungi, and probably happened shortly after the Amoebozoa split from the opisthokont lineage. Most of the 1,500 orthologous gene families shared between the two amoebae are also shared with plant, animal, and fungal genomes. We found that only 42 gene families are distinct to the amoeba lineage; among these are a large number of proteins that contain repeats of the FNIP domain, and a putative transcription factor essential for proper cell type differentiation in D. discoideum. These Amoebozoa-specific genes may be useful in the design of novel diagnostics and therapies for amoebal pathologies.

  13. Ultrastructural diversity between centrioles of eukaryotes.

    Science.gov (United States)

    Gupta, Akshari; Kitagawa, Daiju

    2018-02-16

    Several decades of centriole research have revealed the beautiful symmetry present in these microtubule-based organelles, which are required to form centrosomes, cilia, and flagella in many eukaryotes. Centriole architecture is largely conserved across most organisms; however, individual centriolar features such as the central cartwheel or microtubule walls exhibit considerable variability when examined with finer resolution. Here, we review the ultrastructural characteristics of centrioles in commonly studied organisms, highlighting the subtle and not-so-subtle differences between specific structural components of these centrioles. Additionally, we survey some non-canonical centriole structures that have been discovered in various species, from the coaxial bicentrioles of protists and lower land plants to the giant irregular centrioles of the fungus gnat Sciara. Finally, we speculate on the functional significance of these differences between centrioles, and the contribution of individual structural elements such as the cartwheel or microtubules towards the stability of centrioles. Keywords: centriole structure, cartwheel, triplet microtubules, SAS-6, centrosome.

  14. Telomeres and genomic damage repair. Their implication in human pathology

    International Nuclear Information System (INIS)

    Perez, Maria del R.; Dubner, Diana; Michelin, Severino; Gisone, Pablo; Carosella, Edgardo D.

    2002-01-01

    Telomeres, functional complexes that protect eukaryotic chromosome ends, participate in the regulation of cell proliferation and could play a role in the stabilization of genomic regions in response to genotoxic stress. Their significance in human pathology becomes evident in several diseases sharing genomic instability as a common trait, in which alterations of the telomere metabolism have been demonstrated. Many of them are also associated with hypersensitivity to ionizing radiation and cancer susceptibility. Besides the specific proteins belonging to the telomeric complex, other proteins involved in the DNA repair machinery, such as ATM, BRCA1, BRCA2, the PARP/tankyrase system, DNA-PK and RAD50-MRE11-NBS1 complexes, are closely related to the telomere. This suggests that the telomere sequesters DNA repair proteins for its own structure maintenance, which could also be released toward damaged sites in the genomic DNA. This communication describes essential aspects of telomere structure and function and their links with homologous recombination, non-homologous end-joining (NHEJ), the V(D)J system and mismatch repair (MMR). Several pathological conditions exhibiting alterations in some of these mechanisms are also considered. The cell response to ionizing radiation and its relationship with the telomeric metabolism is particularly taken into account as a model for studying genotoxicity. (author)

  15. [Construction of the eukaryotic recombinant vector and expression of the outer membrane protein LipL32 gene from Leptospira serovar Lai].

    Science.gov (United States)

    Huang, Bi; Bao, Lang; Zhong, Qi; Shang, Zheng-ling; Zhang, Hui-dong; Zhang, Ying

    2008-02-01

    To construct the eukaryotic expression vector of the LipL32 gene from Leptospira serovar Lai and express the recombinant plasmid in COS-7 cells. The LipL32 gene was amplified from Leptospira strain 017 genomic DNA by PCR and cloned into pcDNA3.1 through restriction enzyme digestion. The recombinant plasmid was then transformed into E. coli DH5alpha. After identification by nuclease digestion, PCR and sequencing analysis, the recombinant vector was transfected into COS-7 cells with liposome. The expression of the target gene was detected by RT-PCR and Western blot. The eukaryotic expression vector pcDNA3.1-LipL32 was successfully constructed and stably expressed in COS-7 cells. The eukaryotic recombinant vector of the outer membrane protein LipL32 gene from Leptospira serovar Lai can be expressed in mammalian cells, which provides an experimental basis for the application of the Leptospira DNA vaccine.

  16. Introns: The Functional Benefits of Introns in Genomes

    Directory of Open Access Journals (Sweden)

    Bong-Seok Jo

    2015-12-01

    Full Text Available The intron has been a big biological mystery in several respects since it was first discovered. First, all of the completely sequenced eukaryotes harbor introns in the genomic structure, whereas no prokaryotes identified so far carry introns. Second, the amount of total introns varies in different species. Third, the length and number of introns vary in different genes, even within the same species genome. Fourth, all introns are copied into RNAs by transcription and DNAs by replication processes, but intron sequences do not participate in protein-coding sequences. The existence of introns in the genome should be a burden to some cells, because cells have to consume a great deal of energy to copy and excise them exactly at the correct positions with the help of complicated spliceosomal machineries. Their persistence throughout a long evolutionary history can be explained only if carrying introns is assumed to give cells selective advantages that outweigh the negative effect of introns. In that regard, we summarize previous research about the functional roles or benefits of introns. Additionally, several other studies strongly suggesting that introns should not be regarded as junk will be introduced.

  17. Introns: The Functional Benefits of Introns in Genomes.

    Science.gov (United States)

    Jo, Bong-Seok; Choi, Sun Shim

    2015-12-01

    The intron has been a big biological mystery in several respects since it was first discovered. First, all of the completely sequenced eukaryotes harbor introns in the genomic structure, whereas no prokaryotes identified so far carry introns. Second, the amount of total introns varies in different species. Third, the length and number of introns vary in different genes, even within the same species genome. Fourth, all introns are copied into RNAs by transcription and DNAs by replication processes, but intron sequences do not participate in protein-coding sequences. The existence of introns in the genome should be a burden to some cells, because cells have to consume a great deal of energy to copy and excise them exactly at the correct positions with the help of complicated spliceosomal machineries. Their persistence throughout a long evolutionary history can be explained only if carrying introns is assumed to give cells selective advantages that outweigh the negative effect of introns. In that regard, we summarize previous research about the functional roles or benefits of introns. Additionally, several other studies strongly suggesting that introns should not be regarded as junk will be introduced.

  18. Comprehensive evaluation of non-hybrid genome assembly tools for third-generation PacBio long-read sequence data.

    Science.gov (United States)

    Jayakumar, Vasanthan; Sakakibara, Yasubumi

    2017-11-03

    Long reads obtained from third-generation sequencing platforms can help overcome the long-standing challenge of the de novo assembly of sequences for the genomic analysis of non-model eukaryotic organisms. Numerous long-read-aided de novo assemblies have been published recently, which exhibited superior quality of the assembled genomes in comparison with those achieved using earlier second-generation sequencing technologies. Evaluating assemblies is important in guiding the appropriate choice for specific research needs. In this study, we evaluated 10 long-read assemblers using a variety of metrics on Pacific Biosciences (PacBio) data sets from different taxonomic categories with considerable differences in genome size. The results allowed us to narrow down the list to a few assemblers that can be effectively applied to eukaryotic assembly projects. Moreover, we highlight how best to use limited genomic resources for effectively evaluating the genome assemblies of non-model organisms. © The Author 2017. Published by Oxford University Press.
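
    The abstract above does not name the metrics used to compare assemblers. As an illustration only, the sketch below computes N50, a contiguity statistic commonly reported for genome assemblies; choosing N50 here is an assumption, and the function name and contig lengths are hypothetical.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly: the length L such that contigs of
    length >= L together cover at least half of the total assembled bases."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    raise ValueError("contig_lengths must contain at least one contig")


# Hypothetical contig lengths (in kilobases) for two made-up assemblies.
assembly_a = [80, 70, 50, 40, 30, 20]
assembly_b = [30, 30, 30, 30, 30, 30, 30, 30, 30, 20]

print(n50(assembly_a))  # larger N50 indicates a more contiguous assembly
print(n50(assembly_b))
```

    N50 alone says nothing about correctness, which is one reason assembly evaluations typically combine several metrics.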

  19. Distribution and Diversity of Microbial Eukaryotes in Bathypelagic Waters of the South China Sea.

    Science.gov (United States)

    Xu, Dapeng; Jiao, Nianzhi; Ren, Rui; Warren, Alan

    2017-05-01

    Little is known about the biodiversity of microbial eukaryotes in the South China Sea, especially in waters at bathyal depths. Here, we employed SSU rDNA gene sequencing to reveal the diversity and community structure across depth and distance gradients in the South China Sea. Vertically, the highest alpha diversity was found at 75-m depth. The communities of microbial eukaryotes were clustered into shallow-, middle-, and deep-water groups according to the depth from which they were collected, indicating a depth-related diversity and distribution pattern. Rhizaria sequences dominated the microeukaryote community and occurred in all samples except those from less than 50-m deep, being most abundant near the sea floor where they contributed ca. 64-97% and 40-74% of the total sequences and OTUs recovered, respectively. A large portion of rhizarian OTUs have neither a nearest named neighbor nor a nearest neighbor in the GenBank database, indicating the presence of new phylotypes in the South China Sea. Given their overwhelming abundance and richness, further phylogenetic analyses of rhizarians were performed and three new genetic clusters were revealed containing sequences retrieved from the deep waters of the South China Sea. Our results shed light on the diversity and community structure of microbial eukaryotes in this not yet fully explored area. © 2016 The Author(s) Journal of Eukaryotic Microbiology © 2016 International Society of Protistologists.

  20. Comparative genomics of the relationship between gene structure and expression

    NARCIS (Netherlands)

    Ren, X.

    2006-01-01

    The relationship between the structure of genes and their expression is a relatively new aspect of genome organization and regulation. With more genome sequences and expression data becoming available, bioinformatics approaches can help the further elucidation of the relationships between gene

  1. SoyTEdb: a comprehensive database of transposable elements in the soybean genome

    Directory of Open Access Journals (Sweden)

    Zhu Liucun

    2010-02-01

    Full Text Available Abstract Background Transposable elements are the most abundant components of all characterized genomes of higher eukaryotes. It has been documented that these elements not only contribute to the shaping and reshaping of their host genomes, but also play significant roles in regulating gene expression, altering gene function, and creating new genes. Thus, complete identification of transposable elements in sequenced genomes and construction of comprehensive transposable element databases are essential for accurate annotation of genes and other genomic components, for investigation of potential functional interaction between transposable elements and genes, and for study of genome evolution. The recent availability of the soybean genome sequence has provided an unprecedented opportunity for discovery, and structural and functional characterization of transposable elements in this economically important legume crop. Description Using a combination of structure-based and homology-based approaches, a total of 32,552 retrotransposons (Class I) and 6,029 DNA transposons (Class II) with clear boundaries and insertion sites were structurally annotated and clearly categorized, and a soybean transposable element database, SoyTEdb, was established. These transposable elements have been anchored in and integrated with the soybean physical map and genetic map, and are browsable and visualizable at any scale along the 20 soybean chromosomes, along with predicted genes and other sequence annotations. BLAST search and other infrastructure tools were implemented to facilitate annotation of transposable elements or fragments from soybean and other related legume species. The majority (> 95%) of these elements (particularly a few hundred low-copy-number families) are first described in this study. Conclusion SoyTEdb provides resources and information related to transposable elements in the soybean genome, representing the most comprehensive and the largest manually

  2. Structural and dynamic characterization of eukaryotic gene regulatory protein domains in solution

    Energy Technology Data Exchange (ETDEWEB)

    Lee, Andrew Loyd [Univ. of California, Berkeley, CA (United States). Dept. of Chemistry

    1996-05-01

    Solution NMR was primarily used to characterize structure and dynamics in two different eukaryotic protein systems: the δ-Al-ε activation domain from c-jun and the Drosophila RNA-binding protein Sex-lethal. The second system is the Drosophila Sex-lethal (Sxl) protein, an RNA-binding protein which is the "master switch" in sex determination. Sxl contains two adjacent RNA-binding domains (RBDs) of the RNP consensus-type. The NMR spectrum of the second RBD (Sxl-RBD2) was assigned using multidimensional heteronuclear NMR, and an intermediate-resolution family of structures was calculated from primarily NOE distance restraints. The overall fold was determined to be similar to other RBDs: a βαβ-βαβ pattern of secondary structure, with the two helices packed against a 4-stranded anti-parallel β-sheet. In addition, 15N T1, T2, and 15N/1H NOE relaxation measurements were carried out to characterize the backbone dynamics of Sxl-RBD2 in solution. RNA corresponding to the polypyrimidine tract of transformer pre-mRNA was generated and titrated into 3 different Sxl-RBD protein constructs. Combining Sxl-RBD1+2 (both RBDs) with this RNA formed a specific, high-affinity protein/RNA complex that is amenable to further NMR characterization. The backbone 1H, 13C, and 15N resonances of Sxl-RBD1+2 were assigned using a triple-resonance approach, and 15N relaxation experiments were carried out to characterize the backbone dynamics of this complex. The changes in chemical shift in Sxl-RBD1+2 upon binding RNA were observed using Sxl-RBD2 as a substitute for unbound Sxl-RBD1+2. This allowed the binding interface to be qualitatively mapped for the second domain.

  3. Eukaryotes first: how could that be?

    Science.gov (United States)

    Mariscal, Carlos; Doolittle, W Ford

    2015-09-26

    In the half century since the formulation of the prokaryote : eukaryote dichotomy, many authors have proposed that the former evolved from something resembling the latter, in defiance of common (and possibly common sense) views. In such 'eukaryotes first' (EF) scenarios, the last universal common ancestor is imagined to have possessed significantly many of the complex characteristics of contemporary eukaryotes, as relics of an earlier 'progenotic' period or RNA world. Bacteria and Archaea thus must have lost these complex features secondarily, through 'streamlining'. If the canonical three-domain tree in which Archaea and Eukarya are sisters is accepted, EF entails that Bacteria and Archaea are convergently prokaryotic. We ask what this means and how it might be tested. © 2015 The Author(s).

  4. Three-dimensional Structure of a Viral Genome-delivery Portal Vertex

    Energy Technology Data Exchange (ETDEWEB)

    A Olia; P Prevelige Jr.; J Johnson; G Cingolani

    2011-12-31

    DNA viruses such as bacteriophages and herpesviruses deliver their genome into and out of the capsid through large proteinaceous assemblies, known as portal proteins. Here, we report two snapshots of the dodecameric portal protein of bacteriophage P22. The 3.25-Å-resolution structure of the portal-protein core bound to 12 copies of gene product 4 (gp4) reveals a ~1.1-MDa assembly formed by 24 proteins. Unexpectedly, a lower-resolution structure of the full-length portal protein unveils the unique topology of the C-terminal domain, which forms a ~200-Å-long α-helical barrel. This domain inserts deeply into the virion and is highly conserved in the Podoviridae family. We propose that the barrel domain facilitates genome spooling onto the interior surface of the capsid during genome packaging and, in analogy to a rifle barrel, increases the accuracy of genome ejection into the host cell.

  5. BEACON: automated tool for Bacterial GEnome Annotation ComparisON

    KAUST Repository

    Kalkatawi, Manal M.

    2015-08-18

    Background Genome annotation is one way of summarizing the existing knowledge about genomic characteristics of an organism. There has been an increased interest during the last several decades in computer-based structural and functional genome annotation. Many methods for this purpose have been developed for eukaryotes and prokaryotes. Our study focuses on comparison of functional annotations of prokaryotic genomes. To the best of our knowledge there is no fully automated system for detailed comparison of functional genome annotations generated by different annotation methods (AMs). Results The presence of many AMs and the development of new ones create a need to: (a) compare different annotations for a single genome, and (b) generate annotation by combining individual ones. To address these issues we developed an Automated Tool for Bacterial GEnome Annotation ComparisON (BEACON) that benefits both AM developers and annotation analysers. BEACON provides detailed comparison of gene function annotations of prokaryotic genomes obtained by different AMs and generates extended annotations through combination of individual ones. To illustrate BEACON's utility, we provide a comparison analysis of multiple different annotations generated for four genomes and show with these examples that the extended annotation can increase the number of genes annotated with putative functions by up to 27%, while the number of genes without any function assignment is reduced. Conclusions We developed BEACON, a fast tool for automated and systematic comparison of different annotations of single genomes. The extended annotation assigns putative functions to many genes with unknown functions. BEACON is available under GNU General Public License version 3.0 and is accessible at: http://www.cbrc.kaust.edu.sa/BEACON/
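
    The idea of building an extended annotation by combining the output of several annotation methods can be illustrated with a minimal sketch. This is not BEACON's algorithm or interface: the function names, the set of labels treated as uninformative, and the gene identifiers below are all hypothetical, and the merge rule (a plain union of informative labels per gene) is an assumption chosen for simplicity.

```python
from collections import defaultdict

# Assumption: labels treated as "no function assigned" (hypothetical list).
UNINFORMATIVE = {"", "hypothetical protein", "unknown function"}


def combine_annotations(annotations):
    """Merge per-method annotations into one extended annotation.

    annotations: dict mapping method name -> dict of gene id -> function label.
    Returns a dict mapping gene id -> sorted list of informative labels.
    """
    extended = defaultdict(set)
    for genes in annotations.values():
        for gene_id, function in genes.items():
            extended[gene_id]  # make sure every gene appears, even if unannotated
            label = function.strip()
            if label.lower() not in UNINFORMATIVE:
                extended[gene_id].add(label)
    return {gene_id: sorted(labels) for gene_id, labels in extended.items()}


def annotated_fraction(gene_to_functions):
    """Fraction of genes that carry at least one informative function label."""
    if not gene_to_functions:
        return 0.0
    annotated = sum(1 for labels in gene_to_functions.values() if labels)
    return annotated / len(gene_to_functions)


# Made-up annotations of three genes by two hypothetical methods.
annotations = {
    "method_a": {"g1": "DNA polymerase III subunit", "g2": "hypothetical protein", "g3": ""},
    "method_b": {"g1": "DNA polymerase III subunit", "g2": "ABC transporter permease", "g3": "hypothetical protein"},
}

single = combine_annotations({"method_a": annotations["method_a"]})
extended = combine_annotations(annotations)
print(annotated_fraction(single), annotated_fraction(extended))
```

    In this toy case the combined annotation assigns a putative function to more genes than method_a alone, which is the kind of gain the abstract reports; a real tool would also need to reconcile conflicting labels and differing gene coordinates.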

  6. BEACON: automated tool for Bacterial GEnome Annotation ComparisON.

    Science.gov (United States)

    Kalkatawi, Manal; Alam, Intikhab; Bajic, Vladimir B

    2015-08-18

    Genome annotation is one way of summarizing the existing knowledge about genomic characteristics of an organism. There has been an increased interest during the last several decades in computer-based structural and functional genome annotation. Many methods for this purpose have been developed for eukaryotes and prokaryotes. Our study focuses on comparison of functional annotations of prokaryotic genomes. To the best of our knowledge there is no fully automated system for detailed comparison of functional genome annotations generated by different annotation methods (AMs). The presence of many AMs and the development of new ones create a need to: (a) compare different annotations for a single genome, and (b) generate annotation by combining individual ones. To address these issues we developed an Automated Tool for Bacterial GEnome Annotation ComparisON (BEACON) that benefits both AM developers and annotation analysers. BEACON provides detailed comparison of gene function annotations of prokaryotic genomes obtained by different AMs and generates extended annotations through combination of individual ones. To illustrate BEACON's utility, we provide a comparison analysis of multiple different annotations generated for four genomes and show with these examples that the extended annotation can increase the number of genes annotated with putative functions by up to 27%, while the number of genes without any function assignment is reduced. We developed BEACON, a fast tool for automated and systematic comparison of different annotations of single genomes. The extended annotation assigns putative functions to many genes with unknown functions. BEACON is available under GNU General Public License version 3.0 and is accessible at: http://www.cbrc.kaust.edu.sa/BEACON/ .

  7. Structural and biomechanical basis of mitochondrial movement in eukaryotic cells

    Directory of Open Access Journals (Sweden)

    Wu M

    2013-10-01

    Full Text Available Min Wu,1 Aruna Kalyanasundaram,2 Jie Zhu1 1Laboratory of Biomechanics and Engineering, Institute of Biophysics, College of Science, Northwest A&F University, Yangling, Shaanxi, People's Republic of China; 2College of Pharmacology, University of Illinois at Chicago, Chicago, IL, USA Abstract: Mitochondria serve as energy-producing organelles in eukaryotic cells. In addition to providing the energy supply for cells, the mitochondria are also involved in other processes, such as proliferation, differentiation, information transfer, and apoptosis, and play an important role in regulation of cell growth and the cell cycle. In order to achieve these functions, the mitochondria need to move to the corresponding location. Therefore, mitochondrial movement has a crucial role in normal physiologic activity, and any mitochondrial movement disorder will cause irreparable damage to the organism. For example, recent studies have shown that abnormal movement of the mitochondria is likely to be the reason for Charcot–Marie–Tooth disease, amyotrophic lateral sclerosis, Alzheimer's disease, Huntington's disease, Parkinson's disease, and schizophrenia. So, in the cell, especially in the particular polarized cell, the appropriate distribution of mitochondria is crucial to the function and survival of the cell. Mitochondrial movement is mainly associated with the cytoskeleton and related proteins. However, those components play different roles according to cell type. In this paper, we summarize the structural basis of mitochondrial movement, including microtubules, actin filaments, motor proteins, and adaptin, and review studies of the biomechanical mechanisms of mitochondrial movement in different types of cells. Keywords: mitochondrial movement, microtubules, actin filaments, motor proteins, adaptin

  8. PhytoREF: a reference database of the plastidial 16S rRNA gene of photosynthetic eukaryotes with curated taxonomy.

    Science.gov (United States)

    Decelle, Johan; Romac, Sarah; Stern, Rowena F; Bendif, El Mahdi; Zingone, Adriana; Audic, Stéphane; Guiry, Michael D; Guillou, Laure; Tessier, Désiré; Le Gall, Florence; Gourvil, Priscillia; Dos Santos, Adriana L; Probert, Ian; Vaulot, Daniel; de Vargas, Colomban; Christen, Richard

    2015-11-01

    Photosynthetic eukaryotes have a critical role as the main producers in most ecosystems of the biosphere. The ongoing environmental metabarcoding revolution opens the perspective for holistic ecosystems biological studies of these organisms, in particular the unicellular microalgae that often lack distinctive morphological characters and have complex life cycles. To interpret environmental sequences, metabarcoding necessarily relies on taxonomically curated databases containing reference sequences of the targeted gene (or barcode) from identified organisms. To date, no such reference framework exists for photosynthetic eukaryotes. In this study, we built the PhytoREF database that contains 6490 plastidial 16S rDNA reference sequences that originate from a large diversity of eukaryotes representing all known major photosynthetic lineages. We compiled 3333 amplicon sequences available from public databases and 879 sequences extracted from plastidial genomes, and generated 411 novel sequences from cultured marine microalgal strains belonging to different eukaryotic lineages. A total of 1867 environmental Sanger 16S rDNA sequences were also included in the database. Stringent quality filtering and a phylogeny-based taxonomic classification were applied for each 16S rDNA sequence. The database mainly focuses on marine microalgae, but sequences from land plants (representing half of the PhytoREF sequences) and freshwater taxa were also included to broaden the applicability of PhytoREF to different aquatic and terrestrial habitats. PhytoREF, accessible via a web interface (http://phytoref.fr), is a new resource in molecular ecology to foster the discovery, assessment and monitoring of the diversity of photosynthetic eukaryotes using high-throughput sequencing. © 2015 John Wiley & Sons Ltd.

  9. Elucidation of Operon Structures across Closely Related Bacterial Genomes

    Science.gov (United States)

    Li, Guojun

    2014-01-01

    About half of the protein-coding genes in prokaryotic genomes are organized into operons to facilitate co-regulation during transcription. With the evolution of genomes, operon structures are undergoing changes which could coordinate diverse gene expression patterns in response to various stimuli during the life cycle of a bacterial cell. Here we developed a graph-based model to elucidate the diversity of operon structures across a set of closely related bacterial genomes. In the constructed graph, each node represents one orthologous gene group (OGG) and a pair of nodes will be connected if any two genes, from the corresponding two OGGs respectively, are located in the same operon as immediate neighbors in any of the considered genomes. Through identifying the connected components in the above graph, we found that genes in a connected component are likely to be functionally related and these identified components tend to form treelike topology, such as paths and stars, corresponding to different biological mechanisms in transcriptional regulation as follows. Specifically, (i) a path-structure component integrates genes encoding a protein complex, such as ribosome; and (ii) a star-structure component not only groups related genes together, but also reflects the key functional roles of the central node of this component, such as the ABC transporter with a transporter permease and substrate-binding proteins surrounding it. Most interestingly, the genes from organisms with highly diverse living environments, i.e., biomass degraders and animal pathogens of clostridia in our study, can be clearly classified into different topological groups on some connected components. PMID:24959722
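
    The graph construction described in this abstract can be sketched in a few lines. The code below is an illustration under stated assumptions, not the authors' implementation: operons are assumed to be given as ordered lists of OGG identifiers pooled over all genomes, the identifiers are invented, and the path/star classification by node degree is a simplification.

```python
from typing import Dict, Iterable, List, Set


def build_ogg_graph(operons: Iterable[List[str]]) -> Dict[str, Set[str]]:
    """Connect two OGGs when their genes are immediate neighbours in an operon."""
    graph: Dict[str, Set[str]] = {}
    for operon in operons:
        for ogg in operon:
            graph.setdefault(ogg, set())  # keep single-gene operons as isolated nodes
        for left, right in zip(operon, operon[1:]):
            if left != right:
                graph[left].add(right)
                graph[right].add(left)
    return graph


def connected_components(graph: Dict[str, Set[str]]) -> List[Set[str]]:
    """Return the connected components using an iterative depth-first search."""
    seen: Set[str] = set()
    components: List[Set[str]] = []
    for start in graph:
        if start in seen:
            continue
        seen.add(start)
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            component.add(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(component)
    return components


def classify(component: Set[str], graph: Dict[str, Set[str]]) -> str:
    """Label a connected component as 'path', 'star', 'tree' or 'other'."""
    n = len(component)
    degrees = [len(graph[node]) for node in component]
    if sum(degrees) // 2 != n - 1:
        return "other"  # a connected graph with extra edges contains a cycle
    if max(degrees) <= 2:
        return "path"
    if max(degrees) == n - 1:
        return "star"
    return "tree"


# Hypothetical operons from two genomes (identifiers are made up).
operons = [
    ["OGG_rplA", "OGG_rplB", "OGG_rplC"],   # ribosomal genes, genome 1
    ["OGG_rplB", "OGG_rplC", "OGG_rplD"],   # ribosomal genes, genome 2
    ["OGG_sbp1", "OGG_perm"],               # transporter, genome 1
    ["OGG_sbp2", "OGG_perm"],               # transporter, genome 2
    ["OGG_perm", "OGG_atpase"],             # transporter, genome 2
]
graph = build_ogg_graph(operons)
labels = sorted(classify(c, graph) for c in connected_components(graph))
print(labels)
```

    Here the ribosomal OGGs chain into a path, while the permease sits at the centre of a star, mirroring the two topologies the abstract highlights.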

  10. Covering complete proteomes with X-ray structures: a current snapshot

    Energy Technology Data Exchange (ETDEWEB)

    Mizianty, Marcin J.; Fan, Xiao; Yan, Jing; Chalmers, Eric; Woloschuk, Christopher [University of Alberta, Edmonton, Alberta T6G 2V4 (Canada); Joachimiak, Andrzej, E-mail: andrzejj@anl.gov [Argonne National Laboratory, Argonne, IL 60439 (United States); Kurgan, Lukasz, E-mail: andrzejj@anl.gov [University of Alberta, Edmonton, Alberta T6G 2V4 (Canada)

    2014-11-01

    The current and the attainable coverage by X-ray structures of proteins and their functions on the scale of the ‘protein universe’ are estimated. A detailed analysis of the coverage across nearly 2000 proteomes from all superkingdoms of life and functional annotations is performed, with particular focus on the human proteome and the family of GPCR proteins. Structural genomics programs have developed and applied structure-determination pipelines to a wide range of protein targets, facilitating the visualization of macromolecular interactions and the understanding of their molecular and biochemical functions. The fundamental question of whether three-dimensional structures of all proteins and all functional annotations can be determined using X-ray crystallography is investigated. A first-of-its-kind large-scale analysis of crystallization propensity for all proteins encoded in 1953 fully sequenced genomes was performed. It is shown that current X-ray crystallographic knowhow combined with homology modeling can provide structures for 25% of modeling families (protein clusters for which structural models can be obtained through homology modeling), with at least one structural model produced for each Gene Ontology functional annotation. The coverage varies between superkingdoms, with 19% for eukaryotes, 35% for bacteria and 49% for archaea, and with those of viruses following the coverage values of their hosts. It is shown that the crystallization propensities of proteomes from the taxonomic superkingdoms are distinct. The use of knowledge-based target selection is shown to substantially increase the ability to produce X-ray structures. It is demonstrated that the human proteome has one of the highest attainable coverage values among eukaryotes, and GPCR membrane proteins suitable for X-ray structure determination were determined.

  11. Covering complete proteomes with X-ray structures: a current snapshot

    International Nuclear Information System (INIS)

    Mizianty, Marcin J.; Fan, Xiao; Yan, Jing; Chalmers, Eric; Woloschuk, Christopher; Joachimiak, Andrzej; Kurgan, Lukasz

    2014-01-01

    The current and the attainable coverage by X-ray structures of proteins and their functions on the scale of the ‘protein universe’ are estimated. A detailed analysis of the coverage across nearly 2000 proteomes from all superkingdoms of life and functional annotations is performed, with particular focus on the human proteome and the family of GPCR proteins. Structural genomics programs have developed and applied structure-determination pipelines to a wide range of protein targets, facilitating the visualization of macromolecular interactions and the understanding of their molecular and biochemical functions. The fundamental question of whether three-dimensional structures of all proteins and all functional annotations can be determined using X-ray crystallography is investigated. A first-of-its-kind large-scale analysis of crystallization propensity for all proteins encoded in 1953 fully sequenced genomes was performed. It is shown that current X-ray crystallographic knowhow combined with homology modeling can provide structures for 25% of modeling families (protein clusters for which structural models can be obtained through homology modeling), with at least one structural model produced for each Gene Ontology functional annotation. The coverage varies between superkingdoms, with 19% for eukaryotes, 35% for bacteria and 49% for archaea, and with those of viruses following the coverage values of their hosts. It is shown that the crystallization propensities of proteomes from the taxonomic superkingdoms are distinct. The use of knowledge-based target selection is shown to substantially increase the ability to produce X-ray structures. It is demonstrated that the human proteome has one of the highest attainable coverage values among eukaryotes, and GPCR membrane proteins suitable for X-ray structure determination were determined

  12. New families of human regulatory RNA structures identified by comparative analysis of vertebrate genomes

    DEFF Research Database (Denmark)

    Parker, Brian John; Moltke, Ida; Roth, Adam

    2011-01-01

    a comparative method, EvoFam, for genome-wide identification of families of regulatory RNA structures, based on primary sequence and secondary structure similarity. We apply EvoFam to a 41-way genomic vertebrate alignment. Genome-wide, we identify 220 human, high-confidence families outside protein...

  13. Initial genomics of the human nucleolus.

    Directory of Open Access Journals (Sweden)

    Attila Németh

    2010-03-01

    We report for the first time the genomics of a nuclear compartment of the eukaryotic cell. 454 sequencing and microarray analysis revealed the pattern of nucleolus-associated chromatin domains (NADs) in the linear human genome and identified different gene families and certain satellite repeats as the major building blocks of NADs, which constitute about 4% of the genome. Bioinformatic evaluation showed that NAD-localized genes take part in specific biological processes, like the response to other organisms, odor perception, and tissue development. 3D FISH and immunofluorescence experiments illustrated the spatial distribution of NAD-specific chromatin within interphase nuclei and its alteration upon transcriptional changes. Altogether, our findings describe the nature of DNA sequences associated with the human nucleolus and provide insights into the function of the nucleolus in genome organization and establishment of nuclear architecture.

  14. Initial Genomics of the Human Nucleolus

    Science.gov (United States)

    Németh, Attila; Conesa, Ana; Santoyo-Lopez, Javier; Medina, Ignacio; Montaner, David; Péterfia, Bálint; Solovei, Irina; Cremer, Thomas; Dopazo, Joaquin; Längst, Gernot

    2010-01-01

    We report for the first time the genomics of a nuclear compartment of the eukaryotic cell. 454 sequencing and microarray analysis revealed the pattern of nucleolus-associated chromatin domains (NADs) in the linear human genome and identified different gene families and certain satellite repeats as the major building blocks of NADs, which constitute about 4% of the genome. Bioinformatic evaluation showed that NAD–localized genes take part in specific biological processes, like the response to other organisms, odor perception, and tissue development. 3D FISH and immunofluorescence experiments illustrated the spatial distribution of NAD–specific chromatin within interphase nuclei and its alteration upon transcriptional changes. Altogether, our findings describe the nature of DNA sequences associated with the human nucleolus and provide insights into the function of the nucleolus in genome organization and establishment of nuclear architecture. PMID:20361057

  15. Polyploidy: adaptation to the genomic environment.

    Science.gov (United States)

    Hollister, Jesse D

    2015-02-01

    Genomic evidence of ancestral whole genome duplication (WGD) and polyploidy is widespread among eukaryotic species, and especially among plants. WGD is thought to provide the raw material for adaptation in the form of duplicated genes, and polyploids are thought to benefit from both physiological and genetic buffering. Comparatively little attention has focused on the genomic challenge of polyploidy, however, although much evidence exists that polyploidy severely perturbs important cellular functions. Here, I review recent progress in the study of the re-establishment of stable meiosis in recently evolved polyploids, focusing on four plant species. This work has yielded an insight into the mechanisms underlying stabilization of genome transmission in polyploids, and is revealing remarkable parallels among diverse taxa. Importantly, these studies also provide a road map for investigating how polyploids respond to the challenge of WGD.

  16. 2012 U.S. Department of Energy: Joint Genome Institute: Progress Report

    Energy Technology Data Exchange (ETDEWEB)

    Gilbert, David [DOE JGI Public Affairs Manager

    2013-01-01

    The mission of the U.S. Department of Energy Joint Genome Institute (DOE JGI) is to serve the diverse scientific community as a user facility, enabling the application of large-scale genomics and analysis of plants, microbes, and communities of microbes to address the DOE mission goals in bioenergy and the environment. The DOE JGI's sequencing efforts fall under the Eukaryote Super Program, which includes the Plant and Fungal Genomics Programs; and the Prokaryote Super Program, which includes the Microbial Genomics and Metagenomics Programs. In 2012, several projects made news for their contributions to energy and environment research.

  17. Potential of industrial biotechnology with cyanobacteria and eukaryotic microalgae.

    Science.gov (United States)

    Wijffels, René H; Kruse, Olaf; Hellingwerf, Klaas J

    2013-06-01

    Both cyanobacteria and eukaryotic microalgae are promising organisms for sustainable production of bulk products such as food, feed, materials, chemicals and fuels. In this review we will summarize the potential and current biotechnological developments. Cyanobacteria are promising host organisms for the production of small molecules that can be secreted such as ethanol, butanol, fatty acids and other organic acids. Eukaryotic microalgae are interesting for products for which cellular storage is important such as proteins, lipids, starch and alkanes. For the development of new and promising lines of production, strains of both cyanobacteria and eukaryotic microalgae have to be improved. Transformation systems have been much better developed in cyanobacteria. However, several products would be preferably produced with eukaryotic microalgae. In the case of cyanobacteria a synthetic-systems biology approach has a great potential to exploit cyanobacteria as cell factories. For eukaryotic microalgae transformation systems need to be further developed. A promising strategy is transformation of heterologous (prokaryotic and eukaryotic) genes in established eukaryotic hosts such as Chlamydomonas reinhardtii. Experimental outdoor pilots under containment for the production of genetically modified cyanobacteria and microalgae are in progress. For full scale production risks of release of genetically modified organisms need to be assessed. Copyright © 2013. Published by Elsevier Ltd.

  18. Functional role of a highly repetitive DNA sequence in anchorage of the mouse genome.

    Science.gov (United States)

    Neuer-Nitsche, B; Lu, X N; Werner, D

    1988-09-12

    The major portion of the eukaryotic genome consists of various categories of repetitive DNA sequences which have been studied with respect to their base compositions, organizations, copy numbers, transcription and species specificities; their biological roles, however, are still unclear. A novel quality of a highly repetitive mouse DNA sequence is described which points to a functional role: All copies (approximately 50,000 per haploid genome) of this DNA sequence reside on genomic Alu I DNA fragments each associated with nuclear polypeptides that are not released from DNA by proteinase K, SDS and phenol extraction. By this quality the repetitive DNA sequence is classified as a member of the sub-set of DNA sequences involved in tight DNA-polypeptide complexes which have been previously shown to be components of the subnuclear structure termed 'nuclear matrix'. From these results it has to be concluded that the repetitive DNA sequence characterized in this report represents or comprises a signal for a large number of site specific attachment points of the mouse genome in the nuclear matrix.

  19. Death of a dogma: eukaryotic mRNAs can code for more than one protein.

    Science.gov (United States)

    Mouilleron, Hélène; Delcourt, Vivian; Roucou, Xavier

    2016-01-08

    mRNAs carry the genetic information that is translated by ribosomes. The traditional view of a mature eukaryotic mRNA is a molecule with three main regions, the 5' UTR, the protein coding open reading frame (ORF) or coding sequence (CDS), and the 3' UTR. This concept assumes that ribosomes translate one ORF only, generally the longest one, and produce one protein. As a result, in the early days of genomics and bioinformatics, one CDS was associated with each protein-coding gene. This fundamental concept of a single CDS is being challenged by increasing experimental evidence indicating that annotated proteins are not the only proteins translated from mRNAs. In particular, mass spectrometry (MS)-based proteomics and ribosome profiling have detected productive translation of alternative open reading frames. In several cases, the alternative and annotated proteins interact. Thus, the expression of two or more proteins translated from the same mRNA may offer a mechanism to ensure the co-expression of proteins which have functional interactions. Translational mechanisms already described in eukaryotic cells indicate that the cellular machinery is able to translate different CDSs from a single viral or cellular mRNA. In addition to summarizing data showing that the protein coding potential of eukaryotic mRNAs has been underestimated, this review aims to challenge the single translated CDS dogma. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
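
    To make the notion of alternative open reading frames concrete, the following sketch scans the three forward reading frames of a toy sequence for ATG-to-stop ORFs. It is a deliberately naive illustration, not a method from the review: the function name, the minimum-length threshold and the example sequence are assumptions, and real alternative-ORF detection relies on experimental evidence such as ribosome profiling.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}  # stop codons of the standard genetic code


def find_orfs(mrna: str, min_codons: int = 2) -> List[Tuple[int, int, int]]:
    """Return (start, end, frame) for ATG-to-stop ORFs in the three forward frames.

    Coordinates are 0-based; `end` is exclusive and includes the stop codon.
    `min_codons` counts the start codon plus sense codons, excluding the stop.
    Only the first ATG of each open stretch is reported.
    """
    seq = mrna.upper().replace("U", "T")
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return sorted(orfs)


# Toy sequence: a main ORF in frame 0 and a short overlapping ORF in frame 2.
toy_mrna = "ATGGCATGCCCTAAGTGA"
orfs = find_orfs(toy_mrna)
print(orfs)
```

    The toy sequence yields two ORFs, one spanning the whole sequence in frame 0 and a shorter one nested inside it in frame 2, which is the kind of overlap that a one-CDS-per-mRNA annotation would miss.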

  20. Genome Sequence of the Biocontrol Strain Pseudomonas fluorescens F113

    Science.gov (United States)

    Redondo-Nieto, Miguel; Barret, Matthieu; Morrisey, John P.; Germaine, Kieran; Martínez-Granero, Francisco; Barahona, Emma; Navazo, Ana; Sánchez-Contreras, María; Moynihan, Jennifer A.; Giddens, Stephen R.; Coppoolse, Eric R.; Muriel, Candela; Stiekema, Willem J.; Rainey, Paul B.; Dowling, David; O'Gara, Fergal; Martín, Marta

    2012-01-01

    Pseudomonas fluorescens F113 is a plant growth-promoting rhizobacterium (PGPR) that has biocontrol activity against fungal plant pathogens and is a model for rhizosphere colonization. Here, we present its complete genome sequence, which shows that besides a core genome very similar to those of other strains sequenced within this species, F113 possesses a wide array of genes encoding specialized functions for thriving in the rhizosphere and interacting with eukaryotic organisms. PMID:22328765

  1. Tree decomposition based fast search of RNA structures including pseudoknots in genomes.

    Science.gov (United States)

    Song, Yinglei; Liu, Chunmei; Malmberg, Russell; Pan, Fangfang; Cai, Liming

    2005-01-01

    Searching genomes for RNA secondary structure with computational methods has become an important approach to the annotation of non-coding RNAs. However, due to the lack of efficient algorithms for accurate RNA structure-sequence alignment, computer programs capable of searching genomes for RNA secondary structures quickly and effectively have not been available. In this paper, a novel RNA structure profiling model is introduced based on the notion of a conformational graph to specify the consensus structure of an RNA family. Tree decomposition yields a small tree width t for such conformational graphs (e.g., t = 2 for stem loops and only a slight increase for pseudoknots). Within this modelling framework, the optimal alignment of a sequence to the structure model corresponds to finding a maximum valued isomorphic subgraph and consequently can be accomplished through dynamic programming on the tree decomposition of the conformational graph in time O(k^t N^2), where k is a small parameter and N is the size of the profiled RNA structure. Experiments show that the application of the alignment algorithm to search in genomes yields the same search accuracy as methods based on a covariance model with a significant reduction in computation time. In particular, very accurate searches of tmRNAs in bacterial genomes and of telomerase RNAs in yeast genomes can be accomplished in days, as opposed to the months required by other methods. The tree decomposition based searching tool is free upon request and can be downloaded at our site http://w.uga.edu/RNA-informatics/software/index.php.

  2. Comparative genomics of phylogenetically diverse unicellular eukaryotes provide new insights into the genetic basis for the evolution of the programmed cell death machinery.

    Science.gov (United States)

    Nedelcu, Aurora M

    2009-03-01

    Programmed cell death (PCD) represents a significant component of normal growth and development in multicellular organisms. Recently, PCD-like processes have been reported in single-celled eukaryotes, implying that some components of the PCD machinery existed early in eukaryotic evolution. This study provides a comparative analysis of PCD-related sequences across more than 50 unicellular genera from four eukaryotic supergroups: Unikonts, Excavata, Chromalveolata, and Plantae. A complex set of PCD-related sequences that correspond to domains or proteins associated with all main functional classes (from ligands and receptors to executors of PCD) was found in many unicellular lineages. Several PCD domains and proteins previously thought to be restricted to animals or land plants are also present in unicellular species. Notably, the yeast Saccharomyces cerevisiae, which is used as an experimental model system for PCD research, has a rather reduced set of PCD-related sequences relative to other unicellular species. The phylogenetic distribution of the PCD-related sequences identified in unicellular lineages suggests that the genetic basis for the evolution of the complex PCD machinery present in extant multicellular lineages was established early in the evolution of eukaryotes. The shaping of the PCD machinery in multicellular lineages involved the duplication, co-option, recruitment, and shuffling of domains already present in their unicellular ancestors.

  3. Identification and characterization of insect-specific proteins by genome data analysis

    DEFF Research Database (Denmark)

    Zhang, Guojie; Wang, Hongsheng; Shi, Junjie

    2007-01-01

    melanogaster, Anopheles gambiae, Bombyx mori, Tribolium castaneum, and Apis mellifera were compared to the complete genomes of three non-insect eukaryotes (opisthokonts) Homo sapiens, Caenorhabditis elegans and Saccharomyces cerevisiae. This operation yielded 154 groups of orthologous proteins in Drosophila...

  4. Characterizing a thermostable Cas9 for bacterial genome editing and silencing

    NARCIS (Netherlands)

    Mougiakos, Ioannis; Mohanraju, Prarthana; Bosma, Elleke F.; Vrouwe, Valentijn; Finger Bou, Max; Naduthodi, Mihris I.S.; Gussak, Alex; Brinkman, Rudolf B.L.; Kranenburg, Van Richard; Oost, Van Der John

    2017-01-01

    CRISPR-Cas9-based genome engineering tools have revolutionized fundamental research and biotechnological exploitation of both eukaryotes and prokaryotes. However, the mesophilic nature of the established Cas9 systems does not allow for applications that require enhanced stability, including

  5. Characterizing a thermostable Cas9 for bacterial genome editing and silencing

    DEFF Research Database (Denmark)

    Mougiakos, Ioannis; Mohanraju, Prarthana; Bosma, Elleke Fenna

    2017-01-01

    CRISPR-Cas9-based genome engineering tools have revolutionized fundamental research and biotechnological exploitation of both eukaryotes and prokaryotes. However, the mesophilic nature of the established Cas9 systems does not allow for applications that require enhanced stability, including...

  6. WebScipio: An online tool for the determination of gene structures using protein sequences

    Directory of Open Access Journals (Sweden)

    Waack Stephan

    2008-09-01

    Background: Obtaining the gene structure for a given protein-encoding gene is an important step in many analyses. Software suited for this task should be readily accessible, accurate, easy to handle and should provide the user with a coherent representation of the most probable gene structure. It should be rigorous enough to optimise features on the level of single bases and at the same time flexible enough to allow for cross-species searches. Results: WebScipio, a web interface to the Scipio software, allows a user to obtain the coding sequence structure corresponding to a given query protein sequence that belongs to an already assembled eukaryotic genome. The resulting gene structure is presented in various human-readable formats, such as a schematic representation and a detailed alignment of the query and the target sequence highlighting any discrepancies. WebScipio can also be used to identify and characterise the gene structures of homologs in related organisms. In addition, it offers a web service for integration with other programs. Conclusion: WebScipio is a tool that allows users to get a high-quality gene structure prediction from a protein query. It offers more than 250 eukaryotic genomes that can be searched and produces predictions that are close to what can be achieved by manual annotation, for in-species and cross-species searches alike. WebScipio is freely accessible at http://www.webscipio.org.

  7. Sequence analysis of the genome of carnation (Dianthus caryophyllus L.).

    Science.gov (United States)

    Yagi, Masafumi; Kosugi, Shunichi; Hirakawa, Hideki; Ohmiya, Akemi; Tanase, Koji; Harada, Taro; Kishimoto, Kyutaro; Nakayama, Masayoshi; Ichimura, Kazuo; Onozaki, Takashi; Yamaguchi, Hiroyasu; Sasaki, Nobuhiro; Miyahara, Taira; Nishizaki, Yuzo; Ozeki, Yoshihiro; Nakamura, Noriko; Suzuki, Takamasa; Tanaka, Yoshikazu; Sato, Shusei; Shirasawa, Kenta; Isobe, Sachiko; Miyamura, Yoshinori; Watanabe, Akiko; Nakayama, Shinobu; Kishida, Yoshie; Kohara, Mitsuyo; Tabata, Satoshi

    2014-06-01

    The whole-genome sequence of carnation (Dianthus caryophyllus L.) cv. 'Francesco' was determined using a combination of different new-generation multiplex sequencing platforms. The total length of the non-redundant sequences was 568,887,315 bp, consisting of 45,088 scaffolds, which covered 91% of the 622 Mb carnation genome estimated by k-mer analysis. The N50 values of contigs and scaffolds were 16,644 bp and 60,737 bp, respectively, and the longest scaffold was 1,287,144 bp. The average GC content of the contig sequences was 36%. A total of 1050, 13, 92 and 143 genes for tRNAs, rRNAs, snoRNAs and miRNAs, respectively, were identified in the assembled genomic sequences. For protein-encoding genes, 43,266 complete and partial gene structures, excluding those in transposable elements, were deduced. Gene coverage was ~98%, as deduced from the coverage of the core eukaryotic genes. Intensive characterization of the assigned carnation genes and comparison with those of other plant species revealed characteristic features of the carnation genome. The results of this study will serve as a valuable resource for fundamental and applied research of carnation, especially for breeding new carnation varieties. Further information on the genomic sequences is available at http://carnation.kazusa.or.jp. © The Author 2013. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
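
    The assembly statistics quoted here follow standard definitions that are easy to reproduce. The sketch below shows how an N50 value is conventionally computed and rechecks the reported genome coverage from the two figures given in the abstract; the toy scaffold lengths are invented for illustration and are not carnation data.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50: the length L such that sequences of length >= L
    together contain at least half of the total assembled bases."""
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")


# Invented scaffold lengths: total 290, half reached after 80 + 70 = 150.
toy_n50 = n50([80, 70, 50, 40, 30, 20])

# Coverage recomputed from the abstract: assembled bases / estimated genome size.
assembled_bp = 568_887_315
estimated_genome_bp = 622_000_000
coverage_percent = round(100 * assembled_bp / estimated_genome_bp)

print(toy_n50, coverage_percent)
```

    The recomputed coverage rounds to 91%, in agreement with the value stated in the abstract.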

  8. Solution Structure of Archaeoglobus fulgidis Peptidyl-tRNA Hydrolase(Pth2) Provides Evidence for an Extensive Conserved Family of Pth2 Enzymes in Archaea, Bacteria and Eukaryotes.

    Energy Technology Data Exchange (ETDEWEB)

    Powers, Robert; Mirkovic, Nebojsa; Goldsmith-Fischman, Sharon; Acton, Thomas; Chiang, Yiwen; Huang, Yuanpeng; Ma, LiChung; Rajan, Paranji K.; Cort, John R.; Kennedy, Michael A.; Liu, Jinfeng; Rost, Burkhard; Honig, Barry; Murray, Diana; Montelione, Gaetano

    2005-11-01

    The solution structure of protein AF2095 from the thermophilic archaeon Archaeoglobus fulgidus, a 123-residue (13.6 kDa) protein, has been determined by NMR methods. The structure of AF2095 comprises four α-helices and a mixed β-sheet consisting of four parallel and anti-parallel β-strands, where the α-helices sandwich the β-sheet. Sequence and structural comparison of AF2095 with proteins from Homo sapiens, Methanocaldococcus jannaschii and Sulfolobus solfataricus reveals that AF2095 is a peptidyl-tRNA hydrolase (Pth2). This structural comparison also identifies putative catalytic residues and a tRNA interaction region for AF2095. The structure of AF2095 is also similar to the structure of protein TA0108 from the archaeon Thermoplasma acidophilum, which is deposited in the Protein Database but not functionally annotated. The NMR structure of AF2095 has been further leveraged to obtain good quality structural models for 55 other proteins. Although earlier studies have proposed that the Pth2 protein family is restricted to archaeal and eukaryotic organisms, the similarity of the AF2095 structure to human Pth2, the conservation of key active-site residues, and the good quality of the resulting homology models demonstrate a large family of homologous Pth2 proteins that are conserved in eukaryotic, archaeal and bacterial organisms, providing novel insights into the evolution of the Pth and Pth2 enzyme families.

  9. A sequence-based survey of the complex structural organization of tumor genomes

    Energy Technology Data Exchange (ETDEWEB)

    Collins, Colin; Raphael, Benjamin J.; Volik, Stanislav; Yu, Peng; Wu, Chunxiao; Huang, Guiqing; Linardopoulou, Elena V.; Trask, Barbara J.; Waldman, Frederic; Costello, Joseph; Pienta, Kenneth J.; Mills, Gordon B.; Bajsarowicz, Krystyna; Kobayashi, Yasuko; Sridharan, Shivaranjani; Paris, Pamela; Tao, Quanzhou; Aerni, Sarah J.; Brown, Raymond P.; Bashir, Ali; Gray, Joe W.; Cheng, Jan-Fang; de Jong, Pieter; Nefedov, Mikhail; Ried, Thomas; Padilla-Nash, Hesed M.; Collins, Colin C.

    2008-04-03

    The genomes of many epithelial tumors exhibit extensive chromosomal rearrangements. All classes of genome rearrangements can be identified using End Sequencing Profiling (ESP), which relies on paired-end sequencing of cloned tumor genomes. In this study, brain, breast, ovary and prostate tumors along with three breast cancer cell lines were surveyed with ESP yielding the largest available collection of sequence-ready tumor genome breakpoints and providing evidence that some rearrangements may be recurrent. Sequencing and fluorescence in situ hybridization (FISH) confirmed translocations and complex tumor genome structures that include coamplification and packaging of disparate genomic loci with associated molecular heterogeneity. Comparison of the tumor genomes suggests recurrent rearrangements. Some are likely to be novel structural polymorphisms, whereas others may be bona fide somatic rearrangements. A recurrent fusion transcript in breast tumors and a constitutional fusion transcript resulting from a segmental duplication were identified. Analysis of end sequences for single nucleotide polymorphisms (SNPs) revealed candidate somatic mutations and an elevated rate of novel SNPs in an ovarian tumor. These results suggest that the genomes of many epithelial tumors may be far more dynamic and complex than previously appreciated and that genomic fusions including fusion transcripts and proteins may be common, possibly yielding tumor-specific biomarkers and therapeutic targets.

  10. Structured Matrix Completion with Applications to Genomic Data Integration.

    Science.gov (United States)

    Cai, Tianxi; Cai, T Tony; Zhang, Anru

    2016-01-01

    Matrix completion has attracted significant recent attention in many fields including statistics, applied mathematics and electrical engineering. Current literature on matrix completion focuses primarily on independent sampling models under which the individual observed entries are sampled independently. Motivated by applications in genomic data integration, we propose a new framework of structured matrix completion (SMC) to treat structured missingness by design. Specifically, our proposed method aims at efficient matrix recovery when a subset of the rows and columns of an approximately low-rank matrix are observed. We provide theoretical justification for the proposed SMC method and derive a lower bound for the estimation errors, which together establish the optimal rate of recovery over certain classes of approximately low-rank matrices. Simulation studies show that the method performs well in finite samples under a variety of configurations. The method is applied to integrate several ovarian cancer genomic studies with different extents of genomic measurements, which enables us to construct more accurate prediction rules for ovarian cancer survival.
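    The structured-missingness setting in this abstract (some rows and columns fully observed, one block missing) can be illustrated with a minimal Python sketch. It is not the authors' SMC estimator, which targets approximately low-rank matrices. It assumes an exactly low-rank, noise-free matrix whose observed corner block A11 is square and invertible, and applies the textbook identity A22 = A21 A11^(-1) A12. The function names and the toy 4x4 matrix are hypothetical.

```python
from fractions import Fraction


def matmul(x, y):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def inverse(m):
    """Gauss-Jordan inverse of a square matrix using exact Fraction arithmetic.

    Raises StopIteration if the matrix is singular (no pivot can be found).
    """
    n = len(m)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def recover_missing_block(a11, a12, a21):
    """Fill the unobserved block A22 = A21 * A11^(-1) * A12.

    Exact only when rank(A) == rank(A11), i.e. the idealized low-rank case.
    """
    return matmul(matmul(a21, inverse(a11)), a12)


# Toy rank-2 matrix (hypothetical data), split into four 2x2 blocks.
full = [
    [1, 2, 0, 1],
    [0, 1, 3, -1],
    [1, 3, 3, 0],
    [2, 3, -3, 3],
]
a11 = [row[:2] for row in full[:2]]
a12 = [row[2:] for row in full[:2]]
a21 = [row[:2] for row in full[2:]]
a22_true = [row[2:] for row in full[2:]]

a22_hat = recover_missing_block(a11, a12, a21)
```

    Exact rational arithmetic keeps the toy example free of rounding questions. With noisy or only approximately low-rank data this identity is unstable, which is the gap a dedicated method such as SMC is meant to address.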

  11. A search for H/ACA snoRNAs in yeast using MFE secondary structure prediction.

    Science.gov (United States)

    Edvardsson, Sverker; Gardner, Paul P; Poole, Anthony M; Hendy, Michael D; Penny, David; Moulton, Vincent

    2003-05-01

    Noncoding RNA genes produce functional RNA molecules rather than coding for proteins. One such family is the H/ACA snoRNAs. Unlike the related C/D snoRNAs these have resisted automated detection to date. We develop an algorithm to screen the yeast genome for novel H/ACA snoRNAs. To achieve this, we introduce some new methods for facilitating the search for noncoding RNAs in genomic sequences which are based on properties of predicted minimum free-energy (MFE) secondary structures. The algorithm has been implemented and can be generalized to enable screening of other eukaryote genomes. We find that use of primary sequence alone is insufficient for identifying novel H/ACA snoRNAs. Only the use of secondary structure filters reduces the number of candidates to a manageable size. From genomic context, we identify three strong H/ACA snoRNA candidates. These together with a further 47 candidates obtained by our analysis are being experimentally screened.

  12. Phylogenetic analysis of ferlin genes reveals ancient eukaryotic origins

    Directory of Open Access Journals (Sweden)

    Lek Monkol

    2010-07-01

    Full Text Available Abstract Background The ferlin gene family possesses a rare and identifying feature consisting of multiple tandem C2 domains and a C-terminal transmembrane domain. Much currently remains unknown about the fundamental function of this gene family; however, mutations in its two most well-characterised members, dysferlin and otoferlin, have been implicated in human disease. The availability of genome sequences from a wide range of species makes it possible to explore the evolution of the ferlin family, providing contextual insight into characteristic features that define the ferlin gene family in its present form in humans. Results Ferlin genes were detected from all species of representative phyla, with two ferlin subgroups partitioned within the ferlin phylogenetic tree based on the presence or absence of a DysF domain. Invertebrates generally possessed two ferlin genes (one with DysF and one without), with six ferlin genes in most vertebrates (three DysF, three non-DysF). Expansion of the ferlin gene family is evident between the divergence of lamprey (jawless vertebrates) and shark (cartilaginous fish). Common to almost all ferlins is an N-terminal C2-FerI-C2 sandwich, a FerB motif, and two C-terminal C2 domains (C2E and C2F) adjacent to the transmembrane domain. Preservation of these structural elements throughout eukaryotic evolution suggests a fundamental role of these motifs for ferlin function. In contrast, DysF, C2DE, and FerA are optional, giving rise to subtle differences in domain topologies of ferlin genes. Despite conservation of multiple C2 domains in all ferlins, the C-terminal C2 domains (C2E and C2F) displayed higher sequence conservation and greater conservation of putative calcium binding residues across paralogs and orthologs. Interestingly, the two most studied non-mammalian ferlins (Fer-1 and Misfire) in model organisms C. elegans and D. melanogaster, present as outgroups in the phylogenetic analysis, with results suggesting

  13. The MCM Helicase Motor of the Eukaryotic Replisome.

    Science.gov (United States)

    Abid Ali, Ferdos; Costa, Alessandro

    2016-05-08

    The MCM motor of the CMG helicase powers ahead of the eukaryotic replication machinery to unwind DNA, in a process that requires ATP hydrolysis. The reconstitution of DNA replication in vitro has established the succession of events that lead to replication origin activation by the MCM and recent studies have started to elucidate the structural basis of duplex DNA unwinding. Despite the exciting progress, how the MCM translocates on DNA remains a matter of debate. Copyright © 2016. Published by Elsevier Ltd.

  14. Eukaryotic Cell Panorama

    Science.gov (United States)

    Goodsell, David S.

    2011-01-01

    Diverse biological data may be used to create illustrations of molecules in their cellular context. This report describes the scientific results that support an illustration of a eukaryotic cell, enlarged by one million times to show the distribution and arrangement of macromolecules. The panoramic cross section includes eight panels that extend…

  15. Interaction of tRNA with Eukaryotic Ribosome

    Directory of Open Access Journals (Sweden)

    Dmitri Graifer

    2015-03-01

    Full Text Available This paper is a review of currently available data concerning interactions of tRNAs with the eukaryotic ribosome at various stages of translation. These data include the results obtained by means of cryo-electron microscopy and X-ray crystallography applied to various model ribosomal complexes, site-directed cross-linking with the use of tRNA derivatives bearing chemically or photochemically reactive groups in the CCA-terminal fragment and chemical probing of 28S rRNA in the region of the peptidyl transferase center. Similarities and differences in the interactions of tRNAs with prokaryotic and eukaryotic ribosomes are discussed with concomitant consideration of the extent of resemblance between molecular mechanisms of translation in eukaryotes and bacteria.

  16. Identification and characterization of insect-specific proteins by genome data analysis

    Directory of Open Access Journals (Sweden)

    Clark Terry

    2007-04-01

    Full Text Available Abstract Background Insects constitute the vast majority of known species with their importance including biodiversity, agricultural, and human health concerns. It is likely that the successful adaptation of the Insecta clade depends on specific components in its proteome that give rise to specialized features. However, proteome determination is an intensive undertaking. Here we present results from a computational method that uses genome analysis to characterize insect and eukaryote proteomes as an approximation complementary to experimental approaches. Results Homologs in common to Drosophila melanogaster, Anopheles gambiae, Bombyx mori, Tribolium castaneum, and Apis mellifera were compared to the complete genomes of three non-insect eukaryotes (opisthokonts Homo sapiens, Caenorhabditis elegans and Saccharomyces cerevisiae). This operation yielded 154 groups of orthologous proteins in Drosophila to be insect-specific homologs; 466 groups were determined to be common to eukaryotes (represented by three opisthokonts). ESTs from the hemimetabolous insect Locusta migratoria were also considered in order to approximate their corresponding genes in the insect-specific homologs. Stress and stimulus response proteins were found to constitute a higher fraction in the insect-specific homologs than in the homologs common to eukaryotes. Conclusion The significant representation of stress response and stimulus response proteins in proteins determined to be insect-specific, along with specific cuticle and pheromone/odorant binding proteins, suggests that communication and adaptation to environments may distinguish insect evolution relative to other eukaryotes. The tendency for low Ka/Ks ratios in the insect-specific protein set suggests purifying selection pressure. The generally larger number of paralogs in the insect-specific proteins may indicate adaptation to environment changes. Instances in our insect-specific protein set have been arrived at through
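    The comparison described in the Results can be pictured as a set operation over presence/absence of homolog groups. The Python sketch below is only an illustration under stated assumptions: the group identifiers and the toy presence table are invented, and the abstract does not say how the authors' pipeline called homologs. It labels a group "insect-specific" when every listed insect has a homolog and no outgroup does, and "common" when all eight species have one.

```python
def classify_groups(presence, insects, outgroups):
    """Split homolog groups into insect-specific and common-to-eukaryotes lists.

    presence maps a group id to the set of species in which a homolog was found.
    """
    insect_specific, common = [], []
    for group_id, species in sorted(presence.items()):
        if not insects <= species:
            continue  # require a homolog in every insect genome
        if not (outgroups & species):
            insect_specific.append(group_id)  # absent from every outgroup
        elif outgroups <= species:
            common.append(group_id)  # present in every outgroup as well
    return insect_specific, common


INSECTS = frozenset(
    {"D. melanogaster", "A. gambiae", "B. mori", "T. castaneum", "A. mellifera"}
)
OUTGROUPS = frozenset({"H. sapiens", "C. elegans", "S. cerevisiae"})

# Hypothetical presence table, for illustration only.
toy_presence = {
    "grp_cuticle": set(INSECTS),
    "grp_ribosomal": set(INSECTS | OUTGROUPS),
    "grp_partial": set(INSECTS) | {"H. sapiens"},
    "grp_missing_bee": set(INSECTS - {"A. mellifera"}),
}

insect_specific, common = classify_groups(toy_presence, INSECTS, OUTGROUPS)
```

    Groups found in only some outgroups, or missing from one insect, fall into neither list in this sketch; a real analysis would need an explicit rule for those cases.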

  17. Genome Defense Mechanisms in Neurospora and Associated Specialized Proteins

    Directory of Open Access Journals (Sweden)

    Ranjan Tamuli

    2010-06-01

    Full Text Available Neurospora crassa, a filamentous fungus, possesses the widest array of genome defense mechanisms known in any eukaryotic organism, including a process called repeat-induced point mutation (RIP). RIP is a genome defense mechanism that hypermutates repetitive DNA sequences, analogous to genomic imprinting in mammals. As a consequence of RIP, Neurospora possesses many fewer genes in multigene families than expected. A DNA methyltransferase homologue, RID, was shown to be essential for RIP. Recently, a variant catalytic subunit of translesion DNA polymerase zeta (Pol zeta) has been found to be essential for the dominant RIP suppressor phenotype. Meiotic silencing and quelling are two other genome defense mechanisms in Neurospora, and proteins required for these two processes have been identified through genetic screens.

  18. Whole genome duplication affects evolvability of flowering time in an autotetraploid plant.

    Directory of Open Access Journals (Sweden)

    Sara L Martin

    Full Text Available Whole genome duplications have occurred recurrently throughout the evolutionary history of eukaryotes. The resulting genetic and phenotypic changes can influence physiological and ecological responses to the environment; however, the impact of genome copy number on evolvability has rarely been examined experimentally. Here, we evaluate the effect of genome duplication on the ability to respond to selection for early flowering time in lines drawn from naturally occurring diploid and autotetraploid populations of the plant Chamerion angustifolium (fireweed). We contrast this with the result of four generations of selection on synthesized neoautotetraploids, whose genic variability is similar to diploids but genome copy number is similar to autotetraploids. In addition, we examine correlated responses to selection in all three groups. Diploid and both extant tetraploid and neoautotetraploid lines responded to selection with significant reductions in time to flowering. Evolvability, measured as realized heritability, was significantly lower in extant tetraploids (b̂_T = 0.31) than diploids (b̂_T = 0.40). Neotetraploids exhibited the highest evolutionary response (b̂_T = 0.55). The rapid shift in flowering time in neotetraploids was associated with an increase in phenotypic variability across generations, but not with change in genome size or phenotypic correlations among traits. Our results suggest that whole genome duplications, without hybridization, may initially alter evolutionary rate, and that the dynamic nature of neoautopolyploids may contribute to the prevalence of polyploidy throughout eukaryotes.
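    Realized heritability is commonly estimated as the slope of cumulative response to selection regressed on cumulative selection differential (from the breeder's equation, R = h² S). The Python sketch below shows that calculation. Whether the study used exactly this estimator is an assumption, and the per-generation numbers are invented for illustration rather than taken from the paper.

```python
from itertools import accumulate


def realized_heritability(responses, selection_differentials):
    """Least-squares slope of cumulative response on cumulative selection differential.

    responses[i] is the change in the trait mean in generation i, and
    selection_differentials[i] is the gap between the selected parents' mean
    and the population mean in that generation.
    """
    cum_r = list(accumulate(responses))
    cum_s = list(accumulate(selection_differentials))
    n = len(cum_s)
    mean_s = sum(cum_s) / n
    mean_r = sum(cum_r) / n
    sxx = sum((s - mean_s) ** 2 for s in cum_s)
    if sxx == 0:
        raise ValueError("cumulative selection differentials must vary")
    sxy = sum((s - mean_s) * (r - mean_r) for s, r in zip(cum_s, cum_r))
    return sxy / sxx


# Hypothetical data: four generations of selection for earlier flowering (days).
toy_selection_differentials = [-2.0, -2.0, -2.0, -2.0]
toy_responses = [-1.0, -1.0, -1.0, -1.0]

h2_realized = realized_heritability(toy_responses, toy_selection_differentials)
```

    In this toy series each generation recovers half of the applied selection differential, so the slope is 0.5.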

  19. Cytoplasmic ATR Activation Promotes Vaccinia Virus Genome Replication

    Directory of Open Access Journals (Sweden)

    Antonio Postigo

    2017-05-01

    Full Text Available In contrast to most DNA viruses, poxviruses replicate their genomes in the cytoplasm without host involvement. We find that vaccinia virus induces cytoplasmic activation of ATR early during infection, before genome uncoating, which is unexpected because ATR plays a fundamental nuclear role in maintaining host genome integrity. ATR, RPA, INTS7, and Chk1 are recruited to cytoplasmic DNA viral factories, suggesting canonical ATR pathway activation. Consistent with this, pharmacological and RNAi-mediated inhibition of canonical ATR signaling suppresses genome replication. RPA and the sliding clamp PCNA interact with the viral polymerase E9 and are required for DNA replication. Moreover, the ATR activator TOPBP1 promotes genome replication and associates with the viral replisome component H5. Our study suggests that, in contrast to long-held beliefs, vaccinia recruits conserved components of the eukaryote DNA replication and repair machinery to amplify its genome in the host cytoplasm.

  20. Absence of genome reduction in diverse, facultative endohyphal bacteria

    Energy Technology Data Exchange (ETDEWEB)

    Baltrus, David A. [Univ. of Arizona, Tucson, AZ (United States)]; Dougherty, Kevin [Univ. of Arizona, Tucson, AZ (United States)]; Arendt, Kayla R. [Univ. of Arizona, Tucson, AZ (United States)]; Huntemann, Marcel [Joint Genome Institute, Walnut Creek, CA (United States)]; Clum, Alicia [Joint Genome Institute, Walnut Creek, CA (United States)]; Pillay, Manoj [Joint Genome Institute, Walnut Creek, CA (United States)]; Palaniappan, Krishnaveni [Joint Genome Institute, Walnut Creek, CA (United States)]; Varghese, Neha [Joint Genome Institute, Walnut Creek, CA (United States)]; Mikhailova, Natalia [Joint Genome Institute, Walnut Creek, CA (United States)]; Stamatis, Dimitrios [Joint Genome Institute, Walnut Creek, CA (United States)]; Reddy, T. B. K. [Joint Genome Institute, Walnut Creek, CA (United States)]; Ngan, Chew Yee [Joint Genome Institute, Walnut Creek, CA (United States)]; Daum, Chris [Joint Genome Institute, Walnut Creek, CA (United States)]; Shapiro, Nicole [Joint Genome Institute, Walnut Creek, CA (United States)]; Markowitz, Victor [Joint Genome Institute, Walnut Creek, CA (United States)]; Ivanova, Natalia [Joint Genome Institute, Walnut Creek, CA (United States)]; Kyrpides, Nikos [Joint Genome Institute, Walnut Creek, CA (United States)]; Woyke, Tanja [Joint Genome Institute, Walnut Creek, CA (United States)]; Arnold, A. Elizabeth [Univ. of Arizona, Tucson, AZ (United States)]

    2017-02-28

    Fungi interact closely with bacteria, both on the surfaces of the hyphae and within their living tissues (i.e. endohyphal bacteria, EHB). These EHB can be obligate or facultative symbionts and can mediate diverse phenotypic traits in their hosts. Although EHB have been observed in many lineages of fungi, it remains unclear how widespread and general these associations are, and whether unifying ecological and genomic features can be found across EHB strains as a whole. We cultured 11 bacterial strains after they emerged from the hyphae of diverse Ascomycota that were isolated as foliar endophytes of cupressaceous trees, and generated nearly complete genome sequences for all. Unlike the genomes of largely obligate EHB, the genomes of these facultative EHB resembled those of closely related strains isolated from environmental sources. Although all analysed genomes encoded structures that could be used to interact with eukaryotic hosts, pathways previously implicated in maintenance and establishment of EHB symbiosis were not universally present across all strains. Independent isolation of two nearly identical pairs of strains from different classes of fungi, coupled with recent experimental evidence, suggests horizontal transfer of EHB across endophytic hosts. Given the potential for EHB to influence fungal phenotypes, these genomes could shed light on the mechanisms of plant growth promotion or stress mitigation by fungal endophytes during the symbiotic phase, as well as degradation of plant material during the saprotrophic phase. As such, these findings contribute to the illumination of a new dimension of functional biodiversity in fungi.

  1. An Evolutionary Framework for Understanding the Origin of Eukaryotes

    Directory of Open Access Journals (Sweden)

    Neil W. Blackstone

    2016-04-01

    Full Text Available Two major obstacles hinder the application of evolutionary theory to the origin of eukaryotes. The first is more apparent than real—the endosymbiosis that led to the mitochondrion is often described as “non-Darwinian” because it deviates from the incremental evolution championed by the modern synthesis. Nevertheless, endosymbiosis can be accommodated by a multi-level generalization of evolutionary theory, which Darwin himself pioneered. The second obstacle is more serious—all of the major features of eukaryotes were likely present in the last eukaryotic common ancestor thus rendering comparative methods ineffective. In addition to a multi-level theory, the development of rigorous, sequence-based phylogenetic and comparative methods represents the greatest achievement of modern evolutionary theory. Nevertheless, the rapid evolution of major features in the eukaryotic stem group requires the consideration of an alternative framework. Such a framework, based on the contingent nature of these evolutionary events, is developed and illustrated with three examples: the putative intron proliferation leading to the nucleus and the cell cycle; conflict and cooperation in the origin of eukaryotic bioenergetics; and the inter-relationship between aerobic metabolism, sterol synthesis, membranes, and sex. The modern synthesis thus provides sufficient scope to develop an evolutionary framework to understand the origin of eukaryotes.

  2. Structured RNAs in the ENCODE selected regions of the human genome

    DEFF Research Database (Denmark)

    Washietl, Stefan; Pedersen, Jakob Skou; Korbel, Jan O

    2007-01-01

    Functional RNA structures play an important role both in the context of noncoding RNA transcripts as well as regulatory elements in mRNAs. Here we present a computational study to detect functional RNA structures within the ENCODE regions of the human genome. Since structural RNAs in general lack...... with the GENCODE annotation points to functional RNAs in all genomic contexts, with a slightly increased density in 3'-UTRs. While we estimate a significant false discovery rate of approximately 50%-70% many of the predictions can be further substantiated by additional criteria: 248 loci are predicted by both RNAz...

  3. [MiRNA system in unicellular eukaryotes and its evolutionary implications].

    Science.gov (United States)

    Zhang, Yan-Qiong; Wen, Jian-Fan

    2010-02-01

    MicroRNAs (miRNAs) in higher multicellular eukaryotes have been extensively studied in recent years. Great progress has also been made on miRNAs in unicellular eukaryotes. These studies not only enrich our knowledge of the complex expression regulation system in diverse organisms, but also have evolutionary significance for understanding the origin of this system. In this review, the authors summarize recent advances in the study of miRNAs in unicellular eukaryotes, including the most primitive unicellular eukaryote, Giardia. The origin and evolution of the miRNA system are also discussed.

  4. Microbial taxonomy in the post-genomic era: Rebuilding from scratch?

    Energy Technology Data Exchange (ETDEWEB)

    Thompson, Cristiane C. [Univ. of Rio de Janeiro (UFRJ) (Brazil)]; Amaral, Gilda R. [Univ. of Rio de Janeiro (UFRJ) (Brazil)]; Campeão, Mariana [Univ. of Rio de Janeiro (UFRJ) (Brazil)]; Edwards, Robert A. [Univ. of Rio de Janeiro (UFRJ) (Brazil); San Diego State Univ., CA (United States); Argonne National Lab. (ANL), Argonne, IL (United States)]; Polz, Martin F. [Massachusetts Inst. of Technology (MIT), Cambridge, MA (United States)]; Dutilh, Bas E. [Univ. of Rio de Janeiro (UFRJ) (Brazil); Radboud Univ., Nijmegen (Netherlands)]; Ussery, David W. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)]; Sawabe, Tomoo [Hokkaido Univ., Hakodate (Japan)]; Swings, Jean [Univ. of Rio de Janeiro (UFRJ) (Brazil); Ghent Univ. (Belgium)]; Thompson, Fabiano L. [Univ. of Rio de Janeiro (UFRJ) (Brazil); Advanced Systems Laboratory Production Management COPPE / UFRJ, Rio de Janeiro (Brazil)]

    2014-12-23

    Microbial taxonomy should provide adequate descriptions of bacterial, archaeal, and eukaryotic microbial diversity in ecological, clinical, and industrial environments. We re-evaluated the prokaryote species twice. It is time to revisit polyphasic taxonomy, its principles, and its practice, including its underlying pragmatic species concept. We will be able to realize an old dream of our predecessor taxonomists and build a genomic-based microbial taxonomy, using standardized and automated curation of high-quality complete genome sequences as the new gold standard.

  5. Diverse circovirus-like genome architectures revealed by environmental metagenomics.

    Science.gov (United States)

    Rosario, Karyna; Duffy, Siobain; Breitbart, Mya

    2009-10-01

    Single-stranded DNA (ssDNA) viruses with circular genomes are the smallest viruses known to infect eukaryotes. The present study identified 10 novel genomes similar to ssDNA circoviruses through data-mining of public viral metagenomes. The metagenomic libraries included samples from reclaimed water and three different marine environments (Chesapeake Bay, British Columbia coastal waters and Sargasso Sea). All the genomes have similarities to the replication (Rep) protein of circoviruses; however, only half have genomic features consistent with known circoviruses. Some of the genomes exhibit a mixture of genomic features associated with different families of ssDNA viruses (i.e. circoviruses, geminiviruses and parvoviruses). Unique genome architectures and phylogenetic analysis of the Rep protein suggest that these viruses belong to novel genera and/or families. Investigating the complex community of ssDNA viruses in the environment can lead to the discovery of divergent species and help elucidate evolutionary links between ssDNA viruses.

  6. Suicidal autointegration of sleeping beauty and piggyBac transposons in eukaryotic cells.

    Directory of Open Access Journals (Sweden)

    Yongming Wang

    2014-03-01

    Full Text Available Transposons are discrete segments of DNA that have the distinctive ability to move and replicate within genomes across the tree of life. 'Cut and paste' DNA transposition involves excision from a donor locus and reintegration into a new locus in the genome. We studied molecular events following the excision steps of two eukaryotic DNA transposons, Sleeping Beauty (SB) and piggyBac (PB), that are widely used for genome manipulation in vertebrate species. SB originates from fish and PB from insects; thus, by introducing these transposons to human cells we aimed to monitor the process of establishing a transposon-host relationship in a naïve cellular environment. Similarly to retroviruses, neither SB nor PB is capable of self-avoidance because a significant portion of the excised transposons integrated back into its own genome in a suicidal process called autointegration. Barrier-to-autointegration factor (BANF1), a cellular co-factor of certain retroviruses, inhibited transposon autointegration, and was detected in higher-order protein complexes containing the SB transposase. Increasing size sensitized transposition for autointegration, consistent with elevated vulnerability of larger transposons. Both SB and PB were affected similarly by the size of the transposon in three different assays: excision, autointegration and productive transposition. Prior to reintegration, SB is completely separated from the donor molecule and follows an unbiased autointegration pattern, not associated with local hopping. Self-disruptive autointegration occurred at similar frequency for both transposons, while aberrant, pseudo-transposition events were more frequently observed for PB.

  7. Eukaryotic cell flattening

    Science.gov (United States)

    Bae, Albert; Westendorf, Christian; Erlenkamper, Christoph; Galland, Edouard; Franck, Carl; Bodenschatz, Eberhard; Beta, Carsten

    2010-03-01

    Eukaryotic cell flattening is valuable for improving microscopic observations, ranging from bright field to total internal reflection fluorescence microscopy. In this talk, we will discuss traditional overlay techniques, and more modern, microfluidic-based flattening, which provides a greater level of control. We demonstrate these techniques on the social amoeba Dictyostelium discoideum, comparing the advantages and disadvantages of each method.

  8. G2S: A web-service for annotating genomic variants on 3D protein structures.

    Science.gov (United States)

    Wang, Juexin; Sheridan, Robert; Sumer, S Onur; Schultz, Nikolaus; Xu, Dong; Gao, Jianjiong

    2018-01-27

    Accurately mapping and annotating genomic locations on 3D protein structures is a key step in structure-based analysis of genomic variants detected by recent large-scale sequencing efforts. There are several mapping resources currently available, but none of them provides a web API (Application Programming Interface) that supports programmatic access. We present G2S, a real-time web API that provides automated mapping of genomic variants on 3D protein structures. G2S can align genomic locations of variants, protein locations, or protein sequences to protein structures and retrieve the mapped residues from structures. The G2S API follows a REST-inspired design and can be used by various clients such as web browsers, command terminals, programming languages and other bioinformatics tools for bringing 3D structures into genomic variant analysis. The webserver and source code are freely available at https://g2s.genomenexus.org. Contact: g2s@genomenexus.org. Supplementary data are available at Bioinformatics online. © The Author (2018). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com

  9. The genome sequence of the model ascomycete fungus Podospora anserina

    NARCIS (Netherlands)

    Espagne, Eric; Lespinet, Olivier; Malagnac, Fabienne; Da Silva, Corinne; Jaillon, Olivier; Porcel, Betina M; Couloux, Arnaud; Aury, Jean-Marc; Ségurens, Béatrice; Poulain, Julie; Anthouard, Véronique; Grossetete, Sandrine; Khalili, Hamid; Coppin, Evelyne; Déquard-Chablat, Michelle; Picard, Marguerite; Contamine, Véronique; Arnaise, Sylvie; Bourdais, Anne; Berteaux-Lecellier, Véronique; Gautheret, Daniel; de Vries, Ronald P; Battaglia, Evy; Coutinho, Pedro M; Danchin, Etienne Gj; Henrissat, Bernard; Khoury, Riyad El; Sainsard-Chanet, Annie; Boivin, Antoine; Pinan-Lucarré, Bérangère; Sellem, Carole H; Debuchy, Robert; Wincker, Patrick; Weissenbach, Jean; Silar, Philippe

    2008-01-01

    BACKGROUND: The dung-inhabiting ascomycete fungus Podospora anserina is a model used to study various aspects of eukaryotic and fungal biology, such as ageing, prions and sexual development. RESULTS: We present a 10X draft sequence of P. anserina genome, linked to the sequences of a large expressed

  10. Genome-wide study of correlations between genomic features and their relationship with the regulation of gene expression.

    Science.gov (United States)

    Kravatsky, Yuri V; Chechetkin, Vladimir R; Tchurikov, Nikolai A; Kravatskaya, Galina I

    2015-02-01

    A broad class of tasks in genetics and epigenetics can be reduced to the study of various features that are distributed over the genome (genome tracks). The rapid and efficient processing of the huge amount of data stored in genome-scale databases cannot be achieved without software packages based on analytical criteria. However, strong inhomogeneity of genome tracks hampers the development of relevant statistics. We developed criteria for the assessment of genome track inhomogeneity and correlations between two genome tracks. We also developed a software package, Genome Track Analyzer, based on this theory. The theory and software were tested on simulated data and were applied to the study of correlations between CpG islands and transcription start sites in the Homo sapiens genome, between profiles of protein-binding sites in chromosomes of Drosophila melanogaster, and between DNA double-strand breaks and histone marks in the H. sapiens genome. Significant correlations between transcription start sites on the forward and the reverse strands were observed in genomes of D. melanogaster, Caenorhabditis elegans, Mus musculus, H. sapiens, and Danio rerio. The observed correlations may be related to the regulation of gene expression in eukaryotes. Genome Track Analyzer is freely available at http://ancorr.eimb.ru/. © The Author 2015. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
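    To make the idea of a correlation between two genome tracks concrete, the Python sketch below bins two lists of feature positions into fixed windows and computes a plain Pearson correlation of the per-window counts. This is a deliberately naive baseline and not the criteria implemented in Genome Track Analyzer, which the abstract says were designed to cope with track inhomogeneity. The function names, window size and positions are assumptions made for illustration.

```python
from math import sqrt


def bin_counts(positions, genome_length, bin_size):
    """Count features per fixed-size window (0-based positions < genome_length)."""
    n_bins = -(-genome_length // bin_size)  # ceiling division
    counts = [0] * n_bins
    for pos in positions:
        counts[pos // bin_size] += 1
    return counts


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a track with constant counts has no defined correlation")
    return sxy / sqrt(sxx * syy)


# Hypothetical feature positions on a 1,000 bp toy sequence.
track_a = [5, 15, 120, 510, 520, 530, 905]
track_b = [10, 130, 140, 505, 515, 525, 910]

counts_a = bin_counts(track_a, genome_length=1000, bin_size=100)
counts_b = bin_counts(track_b, genome_length=1000, bin_size=100)
track_correlation = pearson(counts_a, counts_b)
```

    The result depends strongly on the window size, and clustered features can inflate the coefficient, which is one reason a simple binned correlation is only a starting point.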

  11. Autophagy in unicellular eukaryotes

    NARCIS (Netherlands)

    Kiel, J.A.K.W.

    2010-01-01

    Cells need a constant supply of precursors to enable the production of macromolecules to sustain growth and survival. Unlike metazoans, unicellular eukaryotes depend exclusively on the extracellular medium for this supply. When environmental nutrients become depleted, existing cytoplasmic components

  12. Origin of phagotrophic eukaryotes as social cheaters in microbial biofilms

    Directory of Open Access Journals (Sweden)

    Jékely Gáspár

    2007-01-01

    Full Text Available Abstract Background The origin of eukaryotic cells was one of the most dramatic evolutionary transitions in the history of life. It is generally assumed that eukaryotes evolved later than prokaryotes by the transformation or fusion of prokaryotic lineages. However, as yet there is no consensus regarding the nature of the prokaryotic group(s) ancestral to eukaryotes. Regardless of this, a hardly debatable fundamental novel characteristic of the last eukaryotic common ancestor was the ability to exploit prokaryotic biomass by the ingestion of entire cells, i.e. phagocytosis. The recent advances in our understanding of the social life of prokaryotes may help to explain the origin of this form of total exploitation. Presentation of the hypothesis Here I propose that eukaryotic cells originated in a social environment, a differentiated microbial mat or biofilm that was maintained by the cooperative action of its members. Cooperation was costly (e.g. the production of developmental signals or an extracellular matrix) but yielded benefits that increased the overall fitness of the social group. I propose that eukaryotes originated as selfish cheaters that enjoyed the benefits of social aggregation but did not contribute to it themselves. The cheaters later evolved into predators that lysed other cells and eventually became professional phagotrophs. During several cycles of social aggregation and dispersal the number of cheaters was contained by a chicken game situation, i.e. reproductive success of cheaters was high when they were in low abundance but was reduced when they were over-represented. Radical changes in cell structure, including the loss of the rigid prokaryotic cell wall and the development of endomembranes, allowed the protoeukaryotes to avoid cheater control and to exploit nutrients more efficiently. Cellular changes were buffered by both the social benefits and the protective physico-chemical milieu of the interior of biofilms. Symbiosis

  13. Nitrate storage and dissimilatory nitrate reduction by eukaryotic microbes

    DEFF Research Database (Denmark)

    Kamp, Anja; Høgslund, Signe; Risgaard-Petersen, Nils

    2015-01-01

    The microbial nitrogen cycle is one of the most complex and environmentally important element cycles on Earth and has long been thought to be mediated exclusively by prokaryotic microbes. Rather recently, it was discovered that certain eukaryotic microbes are able to store nitrate intracellularly......, suggesting that eukaryotes may rival prokaryotes in terms of dissimilatory nitrate reduction. Finally, this review article sketches some evolutionary perspectives of eukaryotic nitrate metabolism and identifies open questions that need to be addressed in future investigations....... and use it for dissimilatory nitrate reduction in the absence of oxygen. The paradigm shift that this entailed is ecologically significant because the eukaryotes in question comprise global players like diatoms, foraminifers, and fungi. This review article provides an unprecedented overview of nitrate...

  14. Crystal Structure of the Human, FIC-Domain Containing Protein HYPE and Implications for Its Functions

    Science.gov (United States)

    Bunney, Tom D.; Cole, Ambrose R.; Broncel, Malgorzata; Esposito, Diego; Tate, Edward W.; Katan, Matilda

    2014-01-01

    Summary Protein AMPylation, the transfer of AMP from ATP to protein targets, has been recognized as a new mechanism of host-cell disruption by some bacterial effectors that typically contain a FIC-domain. Eukaryotic genomes also encode one FIC-domain protein, HYPE, which has remained poorly characterized. Here we describe the structure of human HYPE, solved by X-ray crystallography, representing the first structure of a eukaryotic FIC-domain protein. We demonstrate that HYPE forms stable dimers with structurally and functionally integrated FIC-domains and with TPR-motifs exposed for protein-protein interactions. As HYPE also uniquely possesses a transmembrane helix, dimerization is likely to affect its positioning and function in the membrane vicinity. The low rate of autoAMPylation of the wild-type HYPE could be due to autoinhibition, consistent with the mechanism proposed for a number of putative FIC AMPylators. Our findings also provide a basis to further consider possible alternative cofactors of HYPE and distinct modes of target-recognition. PMID:25435325

  15. Comparative genome analysis of three eukaryotic parasites with differing abilities to transform leukocytes reveals key mediators of Theileria-induced leukocyte transformation

    KAUST Repository

    Hayashida, Kyoko

    2012-09-04

    We sequenced the genome of Theileria orientalis, a tick-borne apicomplexan protozoan parasite of cattle. The focus of this study was a comparative genome analysis of T. orientalis relative to other highly pathogenic Theileria species, T. parva and T. annulata. T. parva and T. annulata induce transformation of infected cells of lymphocyte or macrophage/monocyte lineages; in contrast, T. orientalis does not induce uncontrolled proliferation of infected leukocytes and multiplies predominantly within infected erythrocytes. While synteny across homologous chromosomes of the three Theileria species was found to be well conserved overall, subtelomeric structures were found to differ substantially, as T. orientalis lacks the large tandemly arrayed subtelomere-encoded variable secreted protein-encoding gene family. Moreover, expansion of particular gene families by gene duplication was found in the genomes of the two transforming Theileria species, most notably, the TashAT/TpHN and Tar/Tpr gene families. Gene families that are present only in T. parva and T. annulata and not in T. orientalis, Babesia bovis, or Plasmodium were also identified. Identification of differences between the genome sequences of Theileria species with different abilities to transform and immortalize bovine leukocytes will provide insight into proteins and mechanisms that have evolved to induce and regulate this process. The T. orientalis genome database is available at http://totdb.czc.hokudai.ac.jp/.

  16. RNA 3D modules in genome-wide predictions of RNA 2D structure

    DEFF Research Database (Denmark)

    Theis, Corinna; Zirbel, Craig L; Zu Siederdissen, Christian Höner

    2015-01-01

    Recent experimental and computational progress has revealed a large potential for RNA structure in the genome. This has been driven by computational strategies that exploit multiple genomes of related organisms to identify common sequences and secondary structures. However, these computational approaches have two main challenges: they are computationally expensive and they have a relatively high false discovery rate (FDR). Simultaneously, RNA 3D structure analysis has revealed modules composed of non-canonical base pairs which occur in non-homologous positions, apparently by independent evolution. These modules can, for example, occur inside structural elements which in RNA 2D predictions appear as internal loops. Hence one question is if the use of such RNA 3D information can improve the prediction accuracy of RNA secondary structure at a genome-wide level. Here, we use RNAz in combination with 3D...

  17. The mitochondrial genomes of sponges provide evidence for multiple invasions by Repetitive Hairpin-forming Elements (RHE)

    Directory of Open Access Journals (Sweden)

    Lavrov Dennis V

    2009-12-01

    Full Text Available Abstract Background The mitochondrial (mt) genomes of sponges possess a variety of features, which appear to be intermediate between those of Eumetazoa and non-metazoan opisthokonts. Among these features is the presence of long intergenic regions, which are common in other eukaryotes, but generally absent in Eumetazoa. Here we analyse poriferan mitochondrial intergenic regions, paying particular attention to repetitive sequences within them. In this context we introduce the mitochondrial genome of Ircinia strobilina (Lamarck, 1816; Demospongiae: Dictyoceratida) and compare it with mtDNA of other sponges. Results Mt genomes of dictyoceratid sponges are identical in gene order and content but display major differences in size and organization of intergenic regions. An even higher degree of diversity in the structure of intergenic regions was found among different orders of demosponges. One interesting observation made from such comparisons was of what appears to be recurrent invasions of sponge mitochondrial genomes by repetitive hairpin-forming elements, which cause large genome size differences even among closely related taxa. These repetitive hairpin-forming elements are structurally and compositionally divergent and display a scattered distribution throughout various groups of demosponges. Conclusion Large intergenic regions of poriferan mt genomes are targets for insertions of repetitive hairpin-forming elements, similar to the ones found in non-metazoan opisthokonts. Such elements were likely present in some lineages early in animal mitochondrial genome evolution but were subsequently lost during the reduction of intergenic regions, which occurred in the Eumetazoa lineage after the split of Porifera. Porifera acquired their elements in several independent events. Patterns of their intra-genomic dispersal can be seen in the mt genome of Vaceletia sp.

  18. Insertion Sequence-Caused Large Scale-Rearrangements in the Genome of Escherichia coli

    Science.gov (United States)

    2016-07-18

    affordable approach to genome-wide characterization of genetic variation in bacterial and eukaryotic genomes (1–3). In addition to small-scale ... Paired-End Reads), that uses a graph-based algorithm (27) capable of detecting most large-scale variation involving repetitive regions, including novel ... Avila, P., Grinsted, J. and De La Cruz, F. (1988) Analysis of the variable endpoints generated by one-ended transposition of Tn21. J. Bacteriol., 170

  19. Crystal structure of the regulatory subunit of archaeal initiation factor 2B (aIF2B) from hyperthermophilic archaeon Pyrococcus horikoshii OT3: a proposed structure of the regulatory subcomplex of eukaryotic IF2B

    International Nuclear Information System (INIS)

    Kakuta, Yoshimitsu; Tahara, Maino; Maetani, Shigehiro; Yao, Min; Tanaka, Isao; Kimura, Makoto

    2004-01-01

    Eukaryotic translation initiation factor 2B (eIF2B) is the guanine-nucleotide exchange factor for eukaryotic initiation factor 2 (eIF2). eIF2B is a heteropentameric protein composed of α-ε subunits. The α, β, and δ subunits form a regulatory subcomplex, while the γ and ε form a catalytic subcomplex. Archaea possess homologues of α, β, and δ subunits of eIF2B. Here, we report the three-dimensional structure of an archaeal regulatory subunit (aIF2Bα) from the hyperthermophilic archaeon Pyrococcus horikoshii OT3 determined by X-ray crystallography at 2.2 Å resolution. aIF2Bα consists of two subdomains, an N-domain (residues 1-95) and a C-domain (residues 96-276), connected by a long α-helix (α5: 78-106). The N-domain contains a five-helix bundle structure, while the C-domain folds into the α/β structure, thus showing similarity to D-ribose-5-phosphate isomerase structure. The presence of two molecules in the crystallographic asymmetric unit and the gel filtration analysis suggest a dimeric structure of aIF2Bα in solution, interacting with each other by C-domains. Furthermore, the crystallographic 3-fold symmetry generates a homohexameric structure of aIF2Bα; the interaction is primarily mediated by the long α-helix at the N-domains. This structure suggests an architecture of the three subunits, α, β, and δ, in the regulatory subcomplex within eIF2B.

  20. TFIIS-Dependent Non-coding Transcription Regulates Developmental Genome Rearrangements.

    Directory of Open Access Journals (Sweden)

    Kamila Maliszewska-Olejniczak

    2015-07-01

    Full Text Available Because of their nuclear dimorphism, ciliates provide a unique opportunity to study the role of non-coding RNAs (ncRNAs) in the communication between germline and somatic lineages. In these unicellular eukaryotes, a new somatic nucleus develops at each sexual cycle from a copy of the zygotic (germline) nucleus, while the old somatic nucleus degenerates. In the ciliate Paramecium tetraurelia, the genome is massively rearranged during this process through the reproducible elimination of repeated sequences and the precise excision of over 45,000 short, single-copy Internal Eliminated Sequences (IESs). Different types of ncRNAs resulting from genome-wide transcription were shown to be involved in the epigenetic regulation of genome rearrangements. To understand how ncRNAs are produced from the entire genome, we have focused on a homolog of the TFIIS elongation factor, which regulates RNA polymerase II transcriptional pausing. Six TFIIS-paralogs, representing four distinct families, can be found in the P. tetraurelia genome. Using RNA interference, we showed that TFIIS4, which encodes a development-specific TFIIS protein, is essential for the formation of a functional somatic genome. Molecular analyses and high-throughput DNA sequencing upon TFIIS4 RNAi demonstrated that TFIIS4 is involved in all kinds of genome rearrangements, including excision of ~48% of IESs. Localization of a GFP-TFIIS4 fusion revealed that TFIIS4 appears specifically in the new somatic nucleus at an early developmental stage, before IES excision. RT-PCR experiments showed that TFIIS4 is necessary for the synthesis of IES-containing non-coding transcripts. We propose that these IES+ transcripts originate from the developing somatic nucleus and serve as pairing substrates for germline-specific short RNAs that target elimination of their homologous sequences. Our study, therefore, connects the onset of zygotic non-coding transcription to the control of genome plasticity in Paramecium.

  1. Genome size of 14 species of fireflies (Insecta, Coleoptera, Lampyridae)

    Directory of Open Access Journals (Sweden)

    Gui-Chun Liu

    2017-11-01

    Full Text Available Eukaryotic genome size data are important both as the basis for comparative research into genome evolution and as estimators of the cost and difficulty of genome sequencing programs for non-model organisms. In this study, the genome sizes of 14 species of fireflies (Lampyridae) (two genera in Lampyrinae, three genera in Luciolinae, and one genus in subfamily incertae sedis) were estimated by propidium iodide (PI)-based flow cytometry. The haploid genome sizes of Lampyridae ranged from 0.42 to 1.31 pg, a 3.1-fold span. Genome sizes of the fireflies varied within the tested subfamilies and genera. Lamprigera and Pyrocoelia species had large and small genome sizes, respectively. No correlation was found between genome size and morphological traits such as body length, body width, eye width, and antennal length. Our data provide additional information on genome size estimation of the firefly family Lampyridae. Furthermore, this study will help clarify the cost and difficulty of genome sequencing programs for non-model organisms and will help promote studies on firefly genome evolution.
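
    The 3.1-fold span quoted above follows directly from the two reported extremes. A minimal sketch of that arithmetic is given below; the conversion factor of about 978 Mb per picogram is a commonly used value and is an assumption here, not something stated in the abstract, and the function names are hypothetical.

```python
# Smallest and largest haploid genome sizes (pg) reported in the abstract.
MIN_PG = 0.42
MAX_PG = 1.31

# Assumption (not from the abstract): 1 pg of DNA corresponds to about 978 Mb.
MB_PER_PG = 978.0


def fold_span(smallest: float, largest: float) -> float:
    """Return the ratio of the largest to the smallest genome size."""
    if smallest <= 0:
        raise ValueError("genome size must be positive")
    return largest / smallest


def pg_to_mb(picograms: float) -> float:
    """Convert a genome size in picograms to megabases (approximate)."""
    return picograms * MB_PER_PG


if __name__ == "__main__":
    print(f"fold span: {fold_span(MIN_PG, MAX_PG):.1f}")
    print(f"smallest genome: about {pg_to_mb(MIN_PG):.0f} Mb")
    print(f"largest genome: about {pg_to_mb(MAX_PG):.0f} Mb")
```

    Under the assumed conversion, the reported range corresponds to roughly 0.4 to 1.3 Gb, which is the kind of figure a sequencing budget estimate would start from.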

  2. Synthesis of eukaryotic lipid biomarkers in the bacterial domain

    Science.gov (United States)

    Welander, P. V.; Banta, A. B.; Lee, A. K.; Wei, J. H.

    2017-12-01

    Lipid biomarkers are organic molecules preserved in sediments and sedimentary rocks that can function as geological proxies for certain microbial taxa or for specific environmental conditions. These molecular fossils provide a link between organisms and their environments in both modern and ancient settings and have afforded significant insight into ancient climatic events, mass extinctions, and various evolutionary transitions throughout Earth's history. However, the proper interpretation of lipid biomarkers is dependent on a broad understanding of their diagenetic precursors in modern systems. This includes understanding the taphonomic transformations that these molecules undergo, their biosynthetic pathways, and the ecological conditions that affect their cellular production. In this study, we focus on one group of lipid biomarkers - the sterols. These are polycyclic isoprenoidal lipids that have a high preservation potential and play a critical role in the physiology of most eukaryotes. However, the synthesis and function of these lipids in the bacterial domain has not been fully explored. Here we utilize a combination of bioinformatics, microbial genetics, and biochemistry to demonstrate that bacterial sterol producers are more prevalent in environmental metagenomic samples than in the genomic databases of cultured organisms and to identify novel proteins required to synthesize and modify sterols in bacteria. These proteins represent a distinct pathway for sterol synthesis exclusive to bacteria and indicate that sterol synthesis in bacteria may have evolved independently of eukaryotic sterol biosynthesis. Taken together, these results demonstrate how studies in extant bacteria can provide insight into the biological sources and the biosynthetic pathways of specific lipid biomarkers and in turn may allow for more robust interpretation of biomarker signatures.

  3. Crystal structure of an eIF4G-like protein from Danio rerio

    Energy Technology Data Exchange (ETDEWEB)

    Bae, Euiyoung; Bitto, Eduard; Bingman, Craig A.; McCoy, Jason G.; Wesenberg, Gary E.; Phillips, Jr., George N. (UW)

    2012-04-18

    The gene LOC 91917 Danio rerio (zebrafish) encodes a protein annotated in the UniProt knowledgebase as the middle domain of eukaryotic initiation factor 4G domain containing protein b (MIF4Gdb). Its molecular weight is 25.8 kDa, and it comprises 222 amino acid residues. BLAST searches revealed homologues of D. rerio MIF4Gdb in many eukaryotes including humans. The homologues and MIF4Gdb were identified as members of the Pfam family, MIF4G (PF2854), which is named after the middle domain of eukaryotic initiation factor 4G (eIF4G). eIF4G is a component of the eukaryotic translational initiation complex, and contains binding sites for other initiation factors, suggesting its critical role in translational initiation. The MIF4G domain also occurs in several other proteins involved in RNA metabolism, including the Nonsense-mediated mRNA decay 2 protein (NMD2/UPF2), and the nuclear cap-binding protein 80-kDa subunit (CBP80). Sequence and structure analyses of the MIF4G domains in many proteins indicate that the domain assumes an all-helical fold and has tandem repeated motifs. The zebrafish protein described here has homology to domains of other proteins variously referred to as NIC-containing proteins (NMD2, eIF4G, CBP80). The biological function of D. rerio MIF4Gdb has not yet been experimentally characterized, and the annotation is based on amino acid sequence comparison. D. rerio MIF4Gdb did not share more than 25% sequence identity with any protein for which the three-dimensional structure is known and was selected as a target for structure determination by the Center for Eukaryotic Structural Genomics (CESG). Here, they report the crystal structure of D. rerio MIF4Gdb (UniGene code Dr.79360, UniProt code Q5EAQ1, CESG target number GO.79294).

  4. Structural analyses of Legionella LepB reveal a new GAP fold that catalytically mimics eukaryotic RasGAP.

    Science.gov (United States)

    Yu, Qin; Hu, Liyan; Yao, Qing; Zhu, Yongqun; Dong, Na; Wang, Da-Cheng; Shao, Feng

    2013-06-01

    Rab GTPases are emerging targets of diverse bacterial pathogens. Here, we perform biochemical and structural analyses of LepB, a Rab GTPase-activating protein (GAP) effector from Legionella pneumophila. We map LepB GAP domain to residues 313-618 and show that the GAP domain is Rab1 specific with a catalytic activity higher than the canonical eukaryotic TBC GAP and the newly identified VirA/EspG family of bacterial RabGAP effectors. Exhaustive mutation analyses identify Arg444 as the arginine finger, but no catalytically essential glutamine residues. Crystal structures of LepB313-618 alone and the GAP domain of Legionella drancourtii LepB in complex with Rab1-GDP-AlF3 support the catalytic role of Arg444, and also further reveal a 3D architecture and a GTPase-binding mode distinct from all known GAPs. Glu449, structurally equivalent to TBC RabGAP glutamine finger in apo-LepB, undergoes a drastic movement upon Rab1 binding, which induces Rab1 Gln70 side-chain flipping towards GDP-AlF3 through a strong ionic interaction. This conformationally rearranged Gln70 acts as the catalytic cis-glutamine, therefore uncovering an unexpected RasGAP-like catalytic mechanism for LepB. Our studies highlight an extraordinary structural and catalytic diversity of RabGAPs, particularly those from bacterial pathogens.

  5. AUG is the only initiation codon in eukaryotes

    Energy Technology Data Exchange (ETDEWEB)

    Sherman, F; McKnight, G; Stewart, J W

    1980-01-01

    An analysis of mutants of the yeast Saccharomyces cerevisiae indicates that AUG is the sole codon capable of initiating translation of iso-1-cytochrome c. This result with yeast and the sequence results of numerous eukaryotic genes indicate that AUG is the only initiation codon in eukaryotes; in contrast, results with Escherichia coli and bacteriophages indicate that both AUG and GUG are initiation codons in prokaryotes. The difference can be explained by the lack of the t⁶A hypermodified nucleoside (N-[9-(β-D-ribofuranosyl)purin-6-ylcarbamoyl]threonine) in prokaryotic initiator tRNA and its presence in eukaryotic initiator tRNA.
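
    The practical consequence of an AUG-only rule can be illustrated with a toy reading-frame finder. The sketch below is a simplification under stated assumptions: it takes the first AUG in the sequence as the start (a simplified first-AUG scanning model that the abstract does not itself describe), reads in frame to the first stop codon, and uses a hypothetical function name.

```python
from typing import Optional

START_CODON = "AUG"  # sole initiation codon in eukaryotes, per the abstract
STOP_CODONS = {"UAA", "UAG", "UGA"}


def first_orf(mrna: str) -> Optional[str]:
    """Return the coding region from the first AUG up to (not including)
    the first in-frame stop codon, or None if no complete frame exists.

    Assumption: the first AUG encountered is used as the start site.
    """
    seq = mrna.upper().replace("T", "U")  # accept DNA-style input as well
    start = seq.find(START_CODON)
    if start == -1:
        return None  # no AUG, so no initiation under this rule
    codons = []
    for i in range(start, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon in STOP_CODONS:
            return "".join(codons)
        codons.append(codon)
    return None  # ran off the end without an in-frame stop


if __name__ == "__main__":
    print(first_orf("GGAUGGCCUAAGG"))  # frame starting at the AUG
    print(first_orf("GUGGCCUAA"))      # GUG alone does not initiate here
```

    A sequence that begins with GUG but contains no AUG yields no reading frame in this sketch, mirroring the eukaryote/prokaryote contrast drawn in the abstract.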

  6. Evidence that the intra-amoebal Legionella drancourtii acquired a sterol reductase gene from eukaryotes

    Directory of Open Access Journals (Sweden)

    Fournier Pierre-Edouard

    2009-03-01

    Full Text Available Abstract Background Free-living amoebae serve as a natural reservoir for some bacteria that have evolved into «amoeba-resistant» bacteria. Among these, some are strictly intra-amoebal, such as Candidatus "Protochlamydia amoebophila" (Candidatus "P. amoebophila"), whose genomic sequence is available. We sequenced the genome of Legionella drancourtii (L. drancourtii), another recently described intra-amoebal bacterium. By comparing these two genomes with those of their closely related species, we were able to study the genetic characteristics specific to their amoebal lifestyle. Findings We identified a sterol delta-7 reductase-encoding gene common to these two bacteria and absent in their relatives. This gene encodes an enzyme which catalyses the last step of cholesterol biosynthesis in eukaryotes, and is probably functional within L. drancourtii since it is transcribed. The phylogenetic analysis of this protein suggests that it was acquired horizontally by a few bacteria from viridiplantae. This gene was also found in the Acanthamoeba polyphaga Mimivirus genome, a virus that grows in amoebae and possesses the largest viral genome known to date. Conclusion L. drancourtii acquired a sterol delta-7 reductase-encoding gene of viridiplantae origin. The most parsimonious hypothesis is that this gene was initially acquired by a Chlamydiales ancestor parasite of plants. Subsequently, its descendants transmitted this gene in amoebae to other intra-amoebal microorganisms, including L. drancourtii and Coxiella burnetii. The role of the sterol delta-7 reductase in prokaryotes is as yet unknown but we speculate that it is involved in host cholesterol parasitism.

  7. Genome-scale analysis of positional clustering of mouse testis-specific genes

    Directory of Open Access Journals (Sweden)

    Lee Bernett TK

    2005-01-01

    Full Text Available Abstract Background Genes are not distributed on a chromosome as randomly as was once thought, even after removal of tandem repeats. The positional clustering of co-expressed genes is known in prokaryotes and has recently been reported in several eukaryotic organisms such as Caenorhabditis elegans, Drosophila melanogaster, and Homo sapiens. In order to further investigate the mode of tissue-specific gene clustering in higher eukaryotes, we have performed a genome-scale analysis of positional clustering of the mouse testis-specific genes. Results Our computational analysis shows that a large proportion of testis-specific genes are clustered in groups of 2 to 5 genes in the mouse genome. The number of clusters is much higher than expected by chance even after removal of tandem repeats. Conclusion Our result suggests that testis-specific genes tend to cluster on the mouse chromosomes. This provides another piece of evidence for the hypothesis that clusters of tissue-specific genes do exist.
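
    A comparison of an observed cluster count with the count "expected by chance" is often made with a label-permutation test. The sketch below shows one such test; it is not the authors' method, which the abstract does not describe. It assumes that a cluster is a run of at least two adjacent tissue-specific genes in chromosomal gene order, and all names are hypothetical.

```python
import random
from typing import Sequence


def count_clusters(flags: Sequence[bool], min_size: int = 2) -> int:
    """Count runs of at least `min_size` consecutive flagged genes.

    `flags[i]` is True when the i-th gene along the chromosome is
    tissue-specific (assumed definition of adjacency: gene order).
    """
    clusters = 0
    run = 0
    for flag in flags:
        if flag:
            run += 1
        else:
            if run >= min_size:
                clusters += 1
            run = 0
    if run >= min_size:  # close a run that reaches the chromosome end
        clusters += 1
    return clusters


def permutation_p_value(flags: Sequence[bool], n_perm: int = 1000,
                        seed: int = 0) -> float:
    """Estimate how often shuffled labels give at least as many clusters."""
    rng = random.Random(seed)
    observed = count_clusters(flags)
    shuffled = list(flags)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if count_clusters(shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    toy = [True, True, False] * 4 + [False] * 28  # made-up gene order
    print(count_clusters(toy), permutation_p_value(toy))
```

    Shuffling labels within a chromosome keeps the number of tissue-specific genes fixed, so any excess of clusters reflects their positions rather than their abundance. Removing tandem duplicates beforehand, as the abstract mentions, would be a separate preprocessing step not shown here.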

  8. Population Structure Analysis of Bull Genomes of European and Western Ancestry

    DEFF Research Database (Denmark)

    Chung, Neo Christopher; Szyda, Joanna; Frąszczak, Magdalena

    2017-01-01

    Since domestication, population bottlenecks, breed formation, and selective breeding have radically shaped the genealogy and genetics of Bos taurus. In turn, characterization of population structure among diverse bull (males of Bos taurus) genomes enables detailed assessment of genetic resources and origins. By analyzing 432 unrelated bull genomes from 13 breeds and 16 countries, we demonstrate genetic diversity and structural complexity among the European/Western cattle population. Importantly, we relaxed a strong assumption of discrete or admixed population, by adapting latent variable models... harboring largest genetic differentiation suggest positive selection underlying population structure. We carried out gene set analysis using SNP annotations to identify enriched functional categories such as energy-related processes and multiple development stages. Our population structure analysis of bull...

  9. Supplementary Material for: BEACON: automated tool for Bacterial GEnome Annotation ComparisON

    KAUST Repository

    Kalkatawi, Manal M.; Alam, Intikhab; Bajic, Vladimir B.

    2015-01-01

    Abstract Background Genome annotation is one way of summarizing the existing knowledge about genomic characteristics of an organism. There has been an increased interest during the last several decades in computer-based structural and functional genome annotation. Many methods for this purpose have been developed for eukaryotes and prokaryotes. Our study focuses on comparison of functional annotations of prokaryotic genomes. To the best of our knowledge there is no fully automated system for detailed comparison of functional genome annotations generated by different annotation methods (AMs). Results The presence of many AMs and development of new ones introduce needs to: a/ compare different annotations for a single genome, and b/ generate annotation by combining individual ones. To address these issues we developed an Automated Tool for Bacterial GEnome Annotation ComparisON (BEACON) that benefits both AM developers and annotation analysers. BEACON provides detailed comparison of gene function annotations of prokaryotic genomes obtained by different AMs and generates extended annotations through combination of individual ones. For the illustration of BEACON's utility, we provide a comparison analysis of multiple different annotations generated for four genomes and show on these examples that the extended annotation can increase the number of genes annotated by putative functions up to 27%, while the number of genes without any function assignment is reduced. Conclusions We developed BEACON, a fast tool for an automated and a systematic comparison of different annotations of single genomes. The extended annotation assigns putative functions to many genes with unknown functions. BEACON is available under GNU General Public License version 3.0 and is accessible at: http://www.cbrc.kaust.edu.sa/BEACON/ .

  10. Full-length RNA structure prediction of the HIV-1 genome reveals a conserved core domain

    DEFF Research Database (Denmark)

    Sükösd, Zsuzsanna; Andersen, Ebbe Sloth; Seemann, Ernst Stefan

    2015-01-01

    ...of the HIV-1 genome is highly variable in most regions, with a limited number of stable and conserved RNA secondary structures. Most interestingly, a set of long-distance interactions forms a core organizing structure (COS) that organizes the genome into three major structural domains. Despite overlapping...

  11. cDNA, genomic sequence cloning and overexpression of ribosomal ...

    African Journals Online (AJOL)

    RPS16 of eukaryote is a component of the 40S small ribosomal subunit encoded by RPS16 gene and is also a homolog of prokaryotic RPS9. The cDNA and genomic sequence of RPS16 was cloned successfully for the first time from the Giant Panda (Ailuropoda melanoleuca) using reverse transcription-polymerase chain ...

  12. Widespread of horizontal gene transfer in the human genome.

    Science.gov (United States)

    Huang, Wenze; Tsai, Lillian; Li, Yulong; Hua, Nan; Sun, Chen; Wei, Chaochun

    2017-04-04

    A fundamental concept in biology is that heritable material is passed from parents to offspring, a process called vertical gene transfer. An alternative mechanism of gene acquisition is through horizontal gene transfer (HGT), which involves movement of genetic material between different species. Horizontal gene transfer has been found to be prevalent in prokaryotes but very rare in eukaryotes. In this paper, we investigate horizontal gene transfer in the human genome. From the pair-wise alignments between the human genome and 53 vertebrate genomes, 1,467 human genome regions (2.6 M bases) from all chromosomes were found to be more conserved with non-mammals than with most mammals. These human genome regions involve 642 known genes, which are enriched with ion binding. Compared to known horizontal gene transfer regions in the human genome, there were few overlapping regions, which indicated that horizontal gene transfer is more common than we expected in the human genome. Horizontal gene transfer impacts hundreds of human genes and this study provided insight into potential mechanisms of HGT in the human genome.

  13. Eukaryotic DNA Replicases

    KAUST Repository

    Zaher, Manal S.; Oke, Muse; Hamdan, Samir

    2014-01-01

    The current model of the eukaryotic DNA replication fork includes three replicative DNA polymerases, polymerase α/primase complex (Pol α), polymerase δ (Pol δ), and polymerase ε (Pol ε). The primase synthesizes 8–12 nucleotide RNA primers that are extended by the DNA polymerization activity of Pol α into 30–35 nucleotide RNA-DNA primers. Replication factor C (RFC) opens the polymerase clamp-like processivity factor, proliferating cell nuclear antigen (PCNA), and loads it onto the primer-template. Pol δ utilizes PCNA to mediate highly processive DNA synthesis, while Pol ε has intrinsic high processivity that is modestly stimulated by PCNA. Pol ε replicates the leading strand and Pol δ replicates the lagging strand in a division of labor that is not strict. The three polymerases are comprised of multiple subunits and share unifying features in their large catalytic and B subunits. The remaining subunits are evolutionarily not related and perform diverse functions. The catalytic subunits are members of family B, which are distinguished by their larger sizes due to inserts in their N- and C-terminal regions. The sizes of these inserts vary among the three polymerases, and their functions remain largely unknown. Strikingly, the quaternary structures of Pol α, Pol δ, and Pol ε are arranged similarly. The catalytic subunits adopt a globular structure that is linked via its conserved C-terminal region to the B subunit. The remaining subunits are linked to the catalytic and B subunits in a highly flexible manner.

  14. Eukaryotic DNA Replicases

    KAUST Repository

    Zaher, Manal S.

    2014-11-21

    The current model of the eukaryotic DNA replication fork includes three replicative DNA polymerases, polymerase α/primase complex (Pol α), polymerase δ (Pol δ), and polymerase ε (Pol ε). The primase synthesizes 8–12 nucleotide RNA primers that are extended by the DNA polymerization activity of Pol α into 30–35 nucleotide RNA-DNA primers. Replication factor C (RFC) opens the polymerase clamp-like processivity factor, proliferating cell nuclear antigen (PCNA), and loads it onto the primer-template. Pol δ utilizes PCNA to mediate highly processive DNA synthesis, while Pol ε has intrinsic high processivity that is modestly stimulated by PCNA. Pol ε replicates the leading strand and Pol δ replicates the lagging strand in a division of labor that is not strict. The three polymerases are comprised of multiple subunits and share unifying features in their large catalytic and B subunits. The remaining subunits are evolutionarily not related and perform diverse functions. The catalytic subunits are members of family B, which are distinguished by their larger sizes due to inserts in their N- and C-terminal regions. The sizes of these inserts vary among the three polymerases, and their functions remain largely unknown. Strikingly, the quaternary structures of Pol α, Pol δ, and Pol ε are arranged similarly. The catalytic subunits adopt a globular structure that is linked via its conserved C-terminal region to the B subunit. The remaining subunits are linked to the catalytic and B subunits in a highly flexible manner.

  15. Identification of multiple distinct Snf2 subfamilies with conserved structural motifs.

    Science.gov (United States)

    Flaus, Andrew; Martin, David M A; Barton, Geoffrey J; Owen-Hughes, Tom

    2006-01-01

    The Snf2 family of helicase-related proteins includes the catalytic subunits of ATP-dependent chromatin remodelling complexes found in all eukaryotes. These act to regulate the structure and dynamic properties of chromatin and so influence a broad range of nuclear processes. We have exploited progress in genome sequencing to assemble a comprehensive catalogue of over 1300 Snf2 family members. Multiple sequence alignment of the helicase-related regions enables 24 distinct subfamilies to be identified, a considerable expansion over earlier surveys. Where information is known, there is a good correlation between biological or biochemical function and these assignments, suggesting Snf2 family motor domains are tuned for specific tasks. Scanning of complete genomes reveals all eukaryotes contain members of multiple subfamilies, whereas they are less common and not ubiquitous in eubacteria or archaea. The large sample of Snf2 proteins enables additional distinguishing conserved sequence blocks within the helicase-like motor to be identified. The establishment of a phylogeny for Snf2 proteins provides an opportunity to make informed assignments of function, and the identification of conserved motifs provides a framework for understanding the mechanisms by which these proteins function.

  16. The complete mitochondrial genome of Gossypium hirsutum and evolutionary analysis of higher plant mitochondrial genomes.

    Science.gov (United States)

    Liu, Guozheng; Cao, Dandan; Li, Shuangshuang; Su, Aiguo; Geng, Jianing; Grover, Corrinne E; Hu, Songnian; Hua, Jinping

    2013-01-01

    Mitochondria are the main manufacturers of cellular ATP in eukaryotes. The plant mitochondrial genome contains a large amount of foreign DNA and repeated sequences that undergo frequent intramolecular recombination. Upland cotton (Gossypium hirsutum L.) is one of the main natural fiber crops and also an important oil-producing plant. Sequencing of the cotton mitochondrial (mt) genome could aid research on the evolution of plant mt genomes. We sequenced the Gossypium hirsutum mt genome using 454 technology combined with screening of a fosmid library and sequencing of positive clones, and conducted a series of evolutionary analyses on the mt genomes of Cycas taitungensis and 24 angiosperms. After data assembly and contig joining, the complete mitochondrial genome sequence of G. hirsutum was obtained. The completed G. hirsutum mt genome is 621,884 bp in length and contains 68 genes, including 35 protein genes, four rRNA genes and 29 tRNA genes. Five gene clusters are conserved in all plant mt genomes; one and four clusters are specifically conserved in monocots and dicots, respectively. Homologous sequences are distributed along the plant mt genomes, and closely related species share the most homologous sequences. For species that have both mt and chloroplast genome sequences available, we checked the location of cp-like migration and found several fragments closely linked with mitochondrial genes. The G. hirsutum mt genome possesses most of the common characters of higher plant mt genomes. The existence of syntenic gene clusters, as well as the conservation of some intergenic sequences and genic content among the plant mt genomes, suggests that evolution of mt genomes is consistent with plant taxonomy but independent among different species.

  17. A structural model of the genome packaging process in a membrane-containing double stranded DNA virus.

    Directory of Open Access Journals (Sweden)

    Chuan Hong

    2014-12-01

    Two crucial steps in the virus life cycle are genome encapsidation to form an infective virion and genome exit to infect the next host cell. In most icosahedral double-stranded (ds) DNA viruses, the viral genome enters and exits the capsid through a unique vertex. Internal membrane-containing viruses possess additional complexity as the genome must be translocated through the viral membrane bilayer. Here, we report the structure of the genome packaging complex with a membrane conduit essential for viral genome encapsidation in the tailless icosahedral membrane-containing bacteriophage PRD1. We utilize single particle electron cryo-microscopy (cryo-EM) and symmetry-free image reconstruction to determine structures of PRD1 virion, procapsid, and packaging deficient mutant particles. At the unique vertex of PRD1, the packaging complex replaces the regular 5-fold structure and crosses the lipid bilayer. These structures reveal that the packaging ATPase P9 and the packaging efficiency factor P6 form a dodecameric portal complex external to the membrane moiety, surrounded by ten major capsid protein P3 trimers. The viral transmembrane density at the special vertex is assigned to be a hexamer of heterodimer of proteins P20 and P22. The hexamer functions as a membrane conduit for the DNA and as a nucleating site for the unique vertex assembly. Our structures show a conformational alteration in the lipid membrane after the P9 and P6 are recruited to the virion. The P8-genome complex is then packaged into the procapsid through the unique vertex while the genome terminal protein P8 functions as a valve that closes the channel once the genome is inside. Comparing mature virion, procapsid, and mutant particle structures led us to propose an assembly pathway for the genome packaging apparatus in the PRD1 virion.

  18. A structural model of the genome packaging process in a membrane-containing double stranded DNA virus.

    Science.gov (United States)

    Hong, Chuan; Oksanen, Hanna M; Liu, Xiangan; Jakana, Joanita; Bamford, Dennis H; Chiu, Wah

    2014-12-01

    Two crucial steps in the virus life cycle are genome encapsidation to form an infective virion and genome exit to infect the next host cell. In most icosahedral double-stranded (ds) DNA viruses, the viral genome enters and exits the capsid through a unique vertex. Internal membrane-containing viruses possess additional complexity as the genome must be translocated through the viral membrane bilayer. Here, we report the structure of the genome packaging complex with a membrane conduit essential for viral genome encapsidation in the tailless icosahedral membrane-containing bacteriophage PRD1. We utilize single particle electron cryo-microscopy (cryo-EM) and symmetry-free image reconstruction to determine structures of PRD1 virion, procapsid, and packaging deficient mutant particles. At the unique vertex of PRD1, the packaging complex replaces the regular 5-fold structure and crosses the lipid bilayer. These structures reveal that the packaging ATPase P9 and the packaging efficiency factor P6 form a dodecameric portal complex external to the membrane moiety, surrounded by ten major capsid protein P3 trimers. The viral transmembrane density at the special vertex is assigned to be a hexamer of heterodimer of proteins P20 and P22. The hexamer functions as a membrane conduit for the DNA and as a nucleating site for the unique vertex assembly. Our structures show a conformational alteration in the lipid membrane after the P9 and P6 are recruited to the virion. The P8-genome complex is then packaged into the procapsid through the unique vertex while the genome terminal protein P8 functions as a valve that closes the channel once the genome is inside. Comparing mature virion, procapsid, and mutant particle structures led us to propose an assembly pathway for the genome packaging apparatus in the PRD1 virion.

  19. Low-pass sequencing for microbial comparative genomics

    Directory of Open Access Journals (Sweden)

    Kennedy Sean

    2004-01-01

    Background: We studied four extremely halophilic archaea by low-pass shotgun sequencing: (1) the metabolically versatile Haloarcula marismortui; (2) the non-pigmented Natrialba asiatica; (3) the psychrophile Halorubrum lacusprofundi and (4) the Dead Sea isolate Halobaculum gomorrense. Approximately one thousand single pass genomic sequences per genome were obtained. The data were analyzed by comparative genomic analyses using the completed Halobacterium sp. NRC-1 genome as a reference. Low-pass shotgun sequencing is a simple, inexpensive, and rapid approach that can readily be performed on any cultured microbe. Results: As expected, the four archaeal halophiles analyzed exhibit both bacterial and eukaryotic characteristics as well as uniquely archaeal traits. All five halophiles exhibit greater than sixty percent GC content and low isoelectric points (pI) for their predicted proteins. Multiple insertion sequence (IS) elements, often involved in genome rearrangements, were identified in H. lacusprofundi and H. marismortui. The core biological functions that govern cellular and genetic mechanisms of H. sp. NRC-1 appear to be conserved in these four other halophiles. Multiple TATA box binding protein (TBP) and transcription factor IIB (TFB) homologs were identified from most of the four shotgunned halophiles. The reconstructed molecular tree of all five halophiles shows a large divergence between these species, but with the closest relationship being between H. sp. NRC-1 and H. lacusprofundi. Conclusion: Despite the diverse habitats of these species, all five halophiles share (1) high GC content and (2) low protein isoelectric points, which are characteristics associated with environmental exposure to UV radiation and hypersalinity, respectively. Identification of multiple IS elements in the genome of H. lacusprofundi and H. marismortui suggests that genome structure and dynamic genome reorganization might be similar to that previously observed in the

  20. Large-scale trends in the evolution of gene structures within 11 animal genomes.

    Directory of Open Access Journals (Sweden)

    Mark Yandell

    2006-03-01

    We have used the annotations of six animal genomes (Homo sapiens, Mus musculus, Ciona intestinalis, Drosophila melanogaster, Anopheles gambiae, and Caenorhabditis elegans) together with the sequences of five unannotated Drosophila genomes to survey changes in protein sequence and gene structure over a variety of timescales, from the less than 5 million years since the divergence of D. simulans and D. melanogaster to the more than 500 million years that have elapsed since the Cambrian explosion. To do so, we have developed a new open-source software library called CGL (for "Comparative Genomics Library"). Our results demonstrate that change in intron-exon structure is gradual, clock-like, and largely independent of coding-sequence evolution. This means that genome annotations can be used in new ways to inform, corroborate, and test conclusions drawn from comparative genomics analyses that are based upon protein and nucleotide sequence similarities.

  1. Characterization of bud emergence 46 (BEM46) protein: Sequence, structural, phylogenetic and subcellular localization analyses

    International Nuclear Information System (INIS)

    Kumar, Abhishek; Kollath-Leiß, Krisztina; Kempken, Frank

    2013-01-01

    Highlights: •All eukaryotes have at least a single copy of a bem46 ortholog. •The catalytic triad of BEM46 is illustrated using sequence and structural analysis. •We identified indels in the conserved domain of BEM46 protein. •Localization studies of BEM46 protein were carried out using GFP-fusion tagging. -- Abstract: The bud emergence 46 (BEM46) protein from Neurospora crassa belongs to the α/β-hydrolase superfamily. Recently, we have reported that the BEM46 protein is localized in the perinuclear ER and also forms spots close to the plasma membrane. The protein appears to be required for cell type-specific polarity formation in N. crassa. Furthermore, initial studies suggested that the BEM46 amino acid sequence is conserved in eukaryotes and is considered to be one of the widespread conserved “known unknown” eukaryotic genes. This warrants a comprehensive phylogenetic analysis of this superfamily to unravel the origin and molecular evolution of these genes in different eukaryotes. Herein, we observe that all eukaryotes have at least a single copy of a bem46 ortholog. Upon scanning of these proteins in various genomes, we find that there are expansions leading to several paralogs in vertebrates. Using comparative genomic analyses, we identified insertions/deletions (indels) in the conserved domain of BEM46 protein, which allow differentiation of fungal classes such as ascomycetes from basidiomycetes. We also find that exonic indels are able to differentiate BEM46 homologs of different eukaryotic lineages. Furthermore, we show, using GFP-fusion tagging experiments, that BEM46 protein from N. crassa possesses a novel endoplasmic-retention signal (PEKK). We propose that three residues, namely a serine 188S, a histidine 292H and an aspartic acid 262D, are the most critical residues, forming a catalytic triad in BEM46 protein from N. crassa. We carried out a comprehensive study on bem46 genes from a molecular evolution perspective with combination of functional

  2. Characterization of bud emergence 46 (BEM46) protein: Sequence, structural, phylogenetic and subcellular localization analyses

    Energy Technology Data Exchange (ETDEWEB)

    Kumar, Abhishek; Kollath-Leiß, Krisztina; Kempken, Frank, E-mail: fkempken@bot.uni-kiel.de

    2013-08-30

    Highlights: •All eukaryotes have at least a single copy of a bem46 ortholog. •The catalytic triad of BEM46 is illustrated using sequence and structural analysis. •We identified indels in the conserved domain of BEM46 protein. •Localization studies of BEM46 protein were carried out using GFP-fusion tagging. -- Abstract: The bud emergence 46 (BEM46) protein from Neurospora crassa belongs to the α/β-hydrolase superfamily. Recently, we have reported that the BEM46 protein is localized in the perinuclear ER and also forms spots close to the plasma membrane. The protein appears to be required for cell type-specific polarity formation in N. crassa. Furthermore, initial studies suggested that the BEM46 amino acid sequence is conserved in eukaryotes and is considered to be one of the widespread conserved “known unknown” eukaryotic genes. This warrants a comprehensive phylogenetic analysis of this superfamily to unravel the origin and molecular evolution of these genes in different eukaryotes. Herein, we observe that all eukaryotes have at least a single copy of a bem46 ortholog. Upon scanning of these proteins in various genomes, we find that there are expansions leading to several paralogs in vertebrates. Using comparative genomic analyses, we identified insertions/deletions (indels) in the conserved domain of BEM46 protein, which allow differentiation of fungal classes such as ascomycetes from basidiomycetes. We also find that exonic indels are able to differentiate BEM46 homologs of different eukaryotic lineages. Furthermore, we show, using GFP-fusion tagging experiments, that BEM46 protein from N. crassa possesses a novel endoplasmic-retention signal (PEKK). We propose that three residues, namely a serine 188S, a histidine 292H and an aspartic acid 262D, are the most critical residues, forming a catalytic triad in BEM46 protein from N. crassa. We carried out a comprehensive study on bem46 genes from a molecular evolution perspective with combination of functional

  3. In vivo genome-wide profiling of RNA secondary structure reveals novel regulatory features.

    Science.gov (United States)

    Ding, Yiliang; Tang, Yin; Kwok, Chun Kit; Zhang, Yu; Bevilacqua, Philip C; Assmann, Sarah M

    2014-01-30

    RNA structure has critical roles in processes ranging from ligand sensing to the regulation of translation, polyadenylation and splicing. However, a lack of genome-wide in vivo RNA structural data has limited our understanding of how RNA structure regulates gene expression in living cells. Here we present a high-throughput, genome-wide in vivo RNA structure probing method, structure-seq, in which dimethyl sulphate methylation of unprotected adenines and cytosines is identified by next-generation sequencing. Application of this method to Arabidopsis thaliana seedlings yielded the first in vivo genome-wide RNA structure map at nucleotide resolution for any organism, with quantitative structural information across more than 10,000 transcripts. Our analysis reveals a three-nucleotide periodic repeat pattern in the structure of coding regions, as well as a less-structured region immediately upstream of the start codon, and shows that these features are strongly correlated with translation efficiency. We also find patterns of strong and weak secondary structure at sites of alternative polyadenylation, as well as strong secondary structure at 5' splice sites that correlates with unspliced events. Notably, in vivo structures of messenger RNAs annotated for stress responses are poorly predicted in silico, whereas mRNA structures of genes related to cell function maintenance are well predicted. Global comparison of several structural features between these two categories shows that the mRNAs associated with stress responses tend to have more single-strandedness, longer maximal loop length and higher free energy per nucleotide, features that may allow these RNAs to undergo conformational changes in response to environmental conditions. Structure-seq allows the RNA structurome and its biological roles to be interrogated on a genome-wide scale and should be applicable to any organism.

  4. Does selection against transcriptional interference shape retroelement-free regions in mammalian genomes?

    DEFF Research Database (Denmark)

    Mourier, Tobias; Willerslev, Eske

    2008-01-01

    BACKGROUND: Eukaryotic genomes are scattered with retroelements that proliferate through retrotransposition. Although retroelements make up around 40 percent of the human genome, large regions are found to be completely devoid of retroelements. This has been hypothesised to be a result of genomic ... in generating and maintaining retroelement-free regions in the human genome. METHODOLOGY/PRINCIPAL FINDINGS: Based on the known transcriptional properties of retroelements, we expect long interspersed elements (LINEs) to be able to display a high degree of transcriptional interference. In contrast, we expect ... activity of LINEs has been identified previously. CONCLUSIONS/SIGNIFICANCE: Our observations are consistent with the notion that selection against transcriptional interference has contributed to the maintenance and/or generation of retroelement-free regions in the human genome.

  5. David and Goliath: chemical perturbation of eukaryotes by bacteria.

    Science.gov (United States)

    Ho, Louis K; Nodwell, Justin R

    2016-03-01

    Environmental microbes produce biologically active small molecules that have been mined extensively as antibiotics and a smaller number of drugs that act on eukaryotic cells. It is known that there are additional bioactives to be discovered from this source. While the discovery of new antibiotics is challenged by the frequent rediscovery of known compounds, we contend that the eukaryote-active compounds may be less saturated. Indeed, despite there being far fewer eukaryote-active natural products, these molecules interact with a far richer diversity of molecular and cellular targets.

  6. Bacteriophage T5 encodes a homolog of the eukaryotic transcription coactivator PC4 implicated in recombination-dependent DNA replication.

    Science.gov (United States)

    Steigemann, Birthe; Schulz, Annina; Werten, Sebastiaan

    2013-11-15

    The RNA polymerase II cofactor PC4 globally regulates transcription of protein-encoding genes through interactions with unwinding DNA, the basal transcription machinery and transcription activators. Here, we report the surprising identification of PC4 homologs in all sequenced representatives of the T5 family of bacteriophages, as well as in an archaeon and seven phyla of eubacteria. We have solved the crystal structure of the full-length T5 protein at 1.9 Å, revealing a striking resemblance to the characteristic single-stranded DNA (ssDNA)-binding core domain of PC4. Intriguing novel structural features include a potential regulatory region at the N-terminus and a C-terminal extension of the homodimerisation interface. The genome organisation of T5-related bacteriophages points at involvement of the PC4 homolog in recombination-dependent DNA replication, strongly suggesting that the protein corresponds to the hitherto elusive replicative ssDNA-binding protein of the T5 family. Our findings imply that PC4-like factors intervene in multiple unwinding-related processes by acting as versatile modifiers of nucleic acid conformation and raise the possibility that the eukaryotic transcription coactivator derives from ancestral DNA replication, recombination and repair factors. © 2013.

  7. Reproduction, symbiosis, and the eukaryotic cell

    Science.gov (United States)

    Godfrey-Smith, Peter

    2015-01-01

    This paper develops a conceptual framework for addressing questions about reproduction, individuality, and the units of selection in symbiotic associations, with special attention to the origin of the eukaryotic cell. Three kinds of reproduction are distinguished, and a possible evolutionary sequence giving rise to a mitochondrion-containing eukaryotic cell from an endosymbiotic partnership is analyzed as a series of transitions between each of the three forms of reproduction. The sequence of changes seen in this “egalitarian” evolutionary transition is compared with those that apply in “fraternal” transitions, such as the evolution of multicellularity in animals. PMID:26286983

  8. Scarless Cas9 Assisted Recombineering (no-SCAR) in Escherichia coli, an Easy-to-Use System for Genome Editing.

    Science.gov (United States)

    Reisch, Christopher R; Prather, Kristala L J

    2017-01-05

    The discovery and development of genome editing systems that leverage the site-specific DNA endonuclease system CRISPR/Cas9 has fundamentally changed the ease and speed of genome editing in many organisms. In eukaryotes, the CRISPR/Cas9 system utilizes a "guide" RNA to enable the Cas9 nuclease to make a double-strand break at a particular genome locus, which is repaired by non-homologous end joining (NHEJ) repair enzymes, often generating random mutations in the process. A specific alteration of the target genome can also be generated by supplying a DNA template in vivo with a desired mutation, which is incorporated by homology-directed repair. However, E. coli lacks robust systems for double-strand break repair. Thus, in contrast to eukaryotes, targeting E. coli chromosomal DNA with Cas9 causes cell death. However, Cas9-mediated killing of bacteria can be exploited to select against cells with a specified genotype within a mixed population. In combination with the well described λ-Red system for recombination in E. coli, we created a highly efficient system for marker-free and scarless genome editing. © 2017 by John Wiley & Sons, Inc.

  9. Structure, Function, and Evolution of Rice Centromeres

    Energy Technology Data Exchange (ETDEWEB)

    Jiang, Jiming

    2010-02-04

    The centromere is the most characteristic landmark of eukaryotic chromosomes. Centromeres function as the site for kinetochore assembly and spindle attachment, allowing for the faithful pairing and segregation of sister chromatids during cell division. Characterization of centromeric DNA is not only essential to understand the structure and organization of plant genomes, but it is also a critical step in the development of plant artificial chromosomes. The centromeres of most model eukaryotic species consist predominantly of long arrays of satellite DNA. Determining the precise DNA boundary of a centromere has proven to be a difficult task in multicellular eukaryotes. We have successfully cloned and sequenced the centromere of rice chromosome 8 (Cen8), representing the first fully sequenced centromere from any multicellular eukaryote. The functional core of Cen8 spans ~800 kb of DNA, which was determined by chromatin immunoprecipitation (ChIP) using an antibody against the rice centromere-specific H3 histone. We discovered 16 actively transcribed genes distributed throughout the Cen8 region. In addition to Cen8, we have characterized eight additional rice centromeres using next-generation sequencing technology. We discovered four subfamilies of the CRR retrotransposon that is highly enriched in rice centromeres. CRR elements are constitutively transcribed and different CRR subfamilies are differentially processed by RNAi. These results suggest that different CRR subfamilies may play different roles in the RNAi-mediated pathway for formation and maintenance of centromeric chromatin.

  10. Insight into eukaryotic topoisomerase II-inhibiting fused heterocyclic compounds in human cancer cell lines by molecular docking.

    Science.gov (United States)

    Taskin, T; Yilmaz, S; Yildiz, I; Yalcin, I; Aki, E

    2012-01-01

    Etoposide is effective as an anti-tumour drug by inhibiting eukaryotic DNA topoisomerase II via establishing a covalent complex with DNA. Unfortunately, its wide therapeutic application is often hindered by multidrug resistance (MDR), low water solubility and toxicity. In our previous study, new derivatives of benzoxazoles, benzimidazoles and related fused heterocyclic compounds, which exhibited significant eukaryotic DNA topoisomerase II inhibitory activity, were synthesized and exhibited better inhibitory activity compared with the drug etoposide itself. To expose the binding interactions between the eukaryotic topoisomerase II and the active heterocyclic compounds, docking studies were performed, using the software Discovery Studio 2.1, based on the crystal structure of the Topo IIA-bound G-segment DNA (PDB ID: 2RGR). The research was conducted on a selected set of 31 fused heterocyclic compounds with variation in structure and activity. The structural analyses indicate coordinate and hydrogen bonding interactions, van der Waals interactions and hydrophobic interactions between ligands and the protein, as Topo IIA-bound G-segment DNA are responsible for the preference of inhibition and potency. Collectively, the results demonstrate that the compounds 1a, 1c, 3b, 3c, 3e and 4a are significant anti-tumour drug candidates that should be further studied.

  11. A synergism between adaptive effects and evolvability drives whole genome duplication to fixation

    NARCIS (Netherlands)

    Cuypers, Thomas D; Hogeweg, Paulien; Hogeweg, P.

    Whole genome duplication has shaped eukaryotic evolutionary history and has been associated with drastic environmental change and species radiation. While the most common fate of WGD duplicates is a return to single copy, retained duplicates have been found enriched for highly interacting genes.

  12. If the cap fits, wear it: an overview of telomeric structures over evolution.

    Science.gov (United States)

    Fulcher, Nick; Derboven, Elisa; Valuchova, Sona; Riha, Karel

    2014-03-01

    Genome organization into linear chromosomes likely represents an important evolutionary innovation that has permitted the development of the sexual life cycle; this process has consequently advanced nuclear expansion and increased complexity of eukaryotic genomes. Chromosome linearity, however, poses a major challenge to the internal cellular machinery. The need to efficiently recognize and repair DNA double-strand breaks that occur as a consequence of DNA damage presents a constant threat to native chromosome ends known as telomeres. In this review, we present a comparative survey of various solutions to the end protection problem, maintaining an emphasis on DNA structure. This begins with telomeric structures derived from a subset of prokaryotes, mitochondria, and viruses, and will progress into the typical telomere structure exhibited by higher organisms containing TTAGG-like tandem sequences. We next examine non-canonical telomeres from Drosophila melanogaster, which comprise arrays of retrotransposons. Finally, we discuss telomeric structures in evolution and possible switches between canonical and non-canonical solutions to chromosome end protection.

  13. Expression and genomic analysis of midasin, a novel and highly conserved AAA protein distantly related to dynein

    Directory of Open Access Journals (Sweden)

    Gibbons I R

    2002-07-01

    Background: The largest open reading frame in the Saccharomyces genome encodes midasin (MDN1p, YLR106p), an AAA ATPase of 560 kDa that is essential for cell viability. Orthologs of midasin have been identified in the genome projects for Drosophila, Arabidopsis, and Schizosaccharomyces pombe. Results: Midasin is present as a single-copy gene encoding a well-conserved protein of ~600 kDa in all eukaryotes for which data are available. In humans, the gene maps to 6q15 and encodes a predicted protein of 5596 residues (632 kDa). Sequence alignments of midasin from humans, yeast, Giardia and Encephalitozoon indicate that its domain structure comprises an N-terminal domain (35 kDa), followed by an AAA domain containing six tandem AAA protomers (~30 kDa each), a linker domain (260 kDa), an acidic domain (~70 kDa) containing 35–40% aspartate and glutamate, and a carboxy-terminal M-domain (30 kDa) that possesses MIDAS sequence motifs and is homologous to the I-domain of integrins. Expression of hemagglutinin-tagged midasin in yeast demonstrates a polypeptide of the anticipated size that is localized principally in the nucleus. Conclusions: The highly conserved structure of midasin in eukaryotes, taken in conjunction with its nuclear localization in yeast, suggests that midasin may function as a nuclear chaperone and be involved in the assembly/disassembly of macromolecular complexes in the nucleus. The AAA domain of midasin is evolutionarily related to that of dynein, but it appears to lack a microtubule-binding site.

  14. Genomics and the making of yeast biodiversity.

    Science.gov (United States)

    Hittinger, Chris Todd; Rokas, Antonis; Bai, Feng-Yan; Boekhout, Teun; Gonçalves, Paula; Jeffries, Thomas W; Kominek, Jacek; Lachance, Marc-André; Libkind, Diego; Rosa, Carlos A; Sampaio, José Paulo; Kurtzman, Cletus P

    2015-12-01

    Yeasts are unicellular fungi that do not form fruiting bodies. Although the yeast lifestyle has evolved multiple times, most known species belong to the subphylum Saccharomycotina (syn. Hemiascomycota, hereafter yeasts). This diverse group includes the premier eukaryotic model system, Saccharomyces cerevisiae; the common human commensal and opportunistic pathogen, Candida albicans; and over 1000 other known species (with more continuing to be discovered). Yeasts are found in every biome and continent and are more genetically diverse than angiosperms or chordates. Ease of culture, simple life cycles, and small genomes (∼10-20Mbp) have made yeasts exceptional models for molecular genetics, biotechnology, and evolutionary genomics. Here we discuss recent developments in understanding the genomic underpinnings of the making of yeast biodiversity, comparing and contrasting natural and human-associated evolutionary processes. Only a tiny fraction of yeast biodiversity and metabolic capabilities has been tapped by industry and science. Expanding the taxonomic breadth of deep genomic investigations will further illuminate how genome function evolves to encode their diverse metabolisms and ecologies. Copyright © 2015 Elsevier Ltd. All rights reserved.

  15. The genome of the extremophile crucifer Thellungiella parvula

    KAUST Repository

    Dassanayake, Maheshi; Oh, Dongha; Haas, Jeffrey S.; Hernández, Álvaro Gonzalez; Hong, Hyewon; Ali, Shahjahan; Yun, Daejin; Bressan, Ray Anthony; Zhu, Jian-Kang; Bohnert, Hans Jürgen; Cheeseman, John McP

    2011-01-01

    Thellungiella parvula is related to Arabidopsis thaliana and is endemic to saline, resource-poor habitats, making it a model for the evolution of plant adaptation to extreme environments. Here we present the draft genome for this extremophile species. Exclusively by next generation sequencing, we obtained the de novo assembled genome in 1,496 gap-free contigs, closely approximating the estimated genome size of 140 Mb. We anchored these contigs to seven pseudo chromosomes without the use of maps. We show that short reads can be assembled to a near-complete chromosome level for a eukaryotic species lacking prior genetic information. The sequence identifies a number of tandem duplications that, by the nature of the duplicated genes, suggest a possible basis for T. parvula's extremophile lifestyle. Our results provide essential background for developing genomically influenced testable hypotheses for the evolution of environmental stress tolerance. © 2011 Nature America, Inc. All rights reserved.

  16. The Impact of Structural Genomics: Expectations and Outcomes

    Energy Technology Data Exchange (ETDEWEB)

    Chandonia, John-Marc; Brenner, Steven E.

    2005-12-21

    Structural Genomics (SG) projects aim to expand our structural knowledge of biological macromolecules, while lowering the average costs of structure determination. We quantitatively analyzed the novelty, cost, and impact of structures solved by SG centers, and contrasted these results with traditional structural biology. The first structure from a protein family is particularly important to reveal the fold and ancient relationships to other proteins. In the last year, approximately half of such structures were solved at an SG center rather than in a traditional laboratory. Furthermore, the cost of solving a structure at the most efficient U.S. center has now dropped to one-quarter the estimated cost of solving a structure by traditional methods. However, top structural biology laboratories are much more efficient than the average, and comparable to SG centers despite working on very challenging structures. Moreover, traditional structural biology papers are cited significantly more often, suggesting greater current impact.

  17. Hamiltonella defensa, genome evolution of protective bacterial endosymbiont from pathogenic ancestors.

    Science.gov (United States)

    Degnan, Patrick H; Yu, Yeisoo; Sisneros, Nicholas; Wing, Rod A; Moran, Nancy A

    2009-06-02

    Eukaryotes engage in a multitude of beneficial and deleterious interactions with bacteria. Hamiltonella defensa, an endosymbiont of aphids and other sap-feeding insects, protects its aphid host from attack by parasitoid wasps. Thus H. defensa is only conditionally beneficial to hosts, unlike ancient nutritional symbionts, such as Buchnera, that are obligate. Similar to pathogenic bacteria, H. defensa is able to invade naive hosts and circumvent host immune responses. We have sequenced the genome of H. defensa to identify possible mechanisms that underlie its persistence in healthy aphids and protection from parasitoids. The 2.1-Mb genome has undergone significant reduction in size relative to its closest free-living relatives, which include Yersinia and Serratia species (4.6-5.4 Mb). Auxotrophic for 8 of the 10 essential amino acids, H. defensa is reliant upon the essential amino acids produced by Buchnera. Despite these losses, the H. defensa genome retains more genes and pathways for a variety of cell structures and processes than do obligate symbionts, such as Buchnera. Furthermore, putative pathogenicity loci, encoding type-3 secretion systems, and toxin homologs, which are absent in obligate symbionts, are abundant in the H. defensa genome, as are regulatory genes that likely control the timing of their expression. The genome is also littered with mobile DNA, including phage-derived genes, plasmids, and insertion-sequence elements, highlighting its dynamic nature and the continued role horizontal gene transfer plays in shaping it.

  18. Exploring the role of genome and structural ions in preventing viral capsid collapse during dehydration

    Science.gov (United States)

    Martín-González, Natalia; Guérin Darvas, Sofía M.; Durana, Aritz; Marti, Gerardo A.; Guérin, Diego M. A.; de Pablo, Pedro J.

    2018-03-01

    Even though viruses evolve mainly in liquid milieu, their horizontal transmission routes often include episodes of dry environment. Along their life cycle, some insect viruses, such as viruses from the Dicistroviridae family, withstand dehydrated conditions with presently unknown consequences to their structural stability. Here, we use atomic force microscopy to monitor the structural changes of viral particles of Triatoma virus (TrV) after desiccation. Our results demonstrate that TrV capsids preserve their genome inside, conserving their height after exposure to dehydrating conditions, which is in stark contrast with other viruses that expel their genome when desiccated. Moreover, empty capsids (without genome) resulted in collapsed particles after desiccation. We also explored the role of structural ions in the dehydration process of the virions (capsid containing genome) by chelating the accessible cations from the external solvent milieu. We observed that ion suppression helps to keep the virus height upon desiccation. Our results show that under drying conditions, the genome of TrV prevents the capsid from collapsing during dehydration, while the structural ions are responsible for promoting solvent exchange through the virion wall.

  19. Large clusters of co-expressed genes in the Drosophila genome.

    Science.gov (United States)

    Boutanaev, Alexander M; Kalmykova, Alla I; Shevelyov, Yuri Y; Nurminsky, Dmitry I

    2002-12-12

    Clustering of co-expressed, non-homologous genes on chromosomes implies their co-regulation. In lower eukaryotes, co-expressed genes are often found in pairs. Clustering of genes that share aspects of transcriptional regulation has also been reported in higher eukaryotes. To advance our understanding of the mode of coordinated gene regulation in multicellular organisms, we performed a genome-wide analysis of the chromosomal distribution of co-expressed genes in Drosophila. We identified a total of 1,661 testes-specific genes, one-third of which are clustered on chromosomes. The number of clusters of three or more genes is much higher than expected by chance. We observed a similar trend for genes upregulated in the embryo and in the adult head, although the expression pattern of individual genes cannot be predicted on the basis of chromosomal position alone. Our data suggest that the prevalent mechanism of transcriptional co-regulation in higher eukaryotes operates with extensive chromatin domains that comprise multiple genes.

  20. Molecular Dynamics Investigation of Cl− and Water Transport through a Eukaryotic CLC Transporter

    OpenAIRE

    Cheng, Mary Hongying; Coalson, Rob D.

    2012-01-01

    Early crystal structures of prokaryotic CLC proteins identified three Cl– binding sites: internal (Sint), central (Scen), and external (Sext). A conserved external GLU (GLUex) residue acts as a gate competing for Sext. Recently, the first crystal structure of a eukaryotic transporter, CmCLC, revealed that in this transporter GLUex competes instead for Scen. Here, we use molecular dynamics simulations to investigate Cl– transport through CmCLC. The gating and Cl–/H+ transport cycle are inferre...

  1. Evolution of pH buffers and water homeostasis in eukaryotes: homology between humans and Acanthamoeba proteins.

    Science.gov (United States)

    Baig, Abdul M; Zohaib, R; Tariq, S; Ahmad, H R

    2018-02-01

    This study intended to trace the evolution of acid-base buffers and water homeostasis in eukaryotes. Acanthamoeba castellanii was selected as a model unicellular eukaryote for this purpose. Homologies of proteins involved in pH and water regulatory mechanisms at cellular levels were compared between humans and A. castellanii. Amino acid sequence homology, structural homology, 3D modeling and docking prediction were done to show the extent of similarities between carbonic anhydrase 1 (CA1), aquaporin (AQP), band-3 protein and H+ pump. Experimental assays were done with acetazolamide (AZM), brinzolamide and mannitol to observe their effects on the trophozoites of A. castellanii. The human CA1, AQP, band-3 protein and H+-transport proteins revealed similar proteins in Acanthamoeba. Docking showed the binding of AZM on amoebal AQP-like proteins. Acanthamoeba showed transient shape changes and encystation at differential doses of brinzolamide, mannitol and AZM. Conclusion: Water and pH regulating adapter proteins in Acanthamoeba and humans show significant homology; these mechanisms evolved early in the primitive unicellular eukaryotes and have remained conserved in multicellular eukaryotes.

  2. Arsenic and Antimony Transporters in Eukaryotes

    Directory of Open Access Journals (Sweden)

    Ewa Maciaszczyk-Dziubinska

    2012-03-01

    Arsenic and antimony are toxic metalloids, naturally present in the environment and all organisms have developed pathways for their detoxification. The most effective metalloid tolerance systems in eukaryotes include downregulation of metalloid uptake, efflux out of the cell, and complexation with phytochelatin or glutathione followed by sequestration into the vacuole. Understanding of arsenic and antimony transport system is of high importance due to the increasing usage of arsenic-based drugs in the treatment of certain types of cancer and diseases caused by protozoan parasites as well as for the development of bio- and phytoremediation strategies for metalloid polluted areas. However, in contrast to prokaryotes, the knowledge about specific transporters of arsenic and antimony and the mechanisms of metalloid transport in eukaryotes has been very limited for a long time. Here, we review the recent advances in understanding of arsenic and antimony transport pathways in eukaryotes, including a dual role of aquaglyceroporins in uptake and efflux of metalloids, elucidation of arsenic transport mechanism by the yeast Acr3 transporter and its role in arsenic hyperaccumulation in ferns, identification of vacuolar transporters of arsenic-phytochelatin complexes in plants and forms of arsenic substrates recognized by mammalian ABC transporters.

  3. Arsenic and Antimony Transporters in Eukaryotes

    Science.gov (United States)

    Maciaszczyk-Dziubinska, Ewa; Wawrzycka, Donata; Wysocki, Robert

    2012-01-01

    Arsenic and antimony are toxic metalloids, naturally present in the environment and all organisms have developed pathways for their detoxification. The most effective metalloid tolerance systems in eukaryotes include downregulation of metalloid uptake, efflux out of the cell, and complexation with phytochelatin or glutathione followed by sequestration into the vacuole. Understanding of arsenic and antimony transport system is of high importance due to the increasing usage of arsenic-based drugs in the treatment of certain types of cancer and diseases caused by protozoan parasites as well as for the development of bio- and phytoremediation strategies for metalloid polluted areas. However, in contrast to prokaryotes, the knowledge about specific transporters of arsenic and antimony and the mechanisms of metalloid transport in eukaryotes has been very limited for a long time. Here, we review the recent advances in understanding of arsenic and antimony transport pathways in eukaryotes, including a dual role of aquaglyceroporins in uptake and efflux of metalloids, elucidation of arsenic transport mechanism by the yeast Acr3 transporter and its role in arsenic hyperaccumulation in ferns, identification of vacuolar transporters of arsenic-phytochelatin complexes in plants and forms of arsenic substrates recognized by mammalian ABC transporters. PMID:22489166

  4. Eukaryotic evolutionary transitions are associated with extreme codon bias in functionally-related proteins.

    Directory of Open Access Journals (Sweden)

    Nicholas J Hudson

    Codon bias in the genome of an organism influences its phenome by changing the speed and efficiency of mRNA translation and hence protein abundance. We hypothesized that differences in codon bias, either between-species differences in orthologous genes, or within-species differences between genes, may play an evolutionary role. To explore this hypothesis, we compared the genome-wide codon bias in six species that occupy vital positions in the Eukaryotic Tree of Life. We acquired the entire protein coding sequences for these organisms, computed the codon bias for all genes in each organism and explored the output for relationships between codon bias and protein function, both within- and between-lineages. We discovered five notable coordinated patterns, with extreme codon bias most pronounced in traits considered highly characteristic of a given lineage. Firstly, the Homo sapiens genome had stronger codon bias for DNA-binding transcription factors than the Saccharomyces cerevisiae genome, whereas the opposite was true for ribosomal proteins--perhaps underscoring transcriptional regulation in the origin of complexity. Secondly, both mammalian species examined possessed extreme codon bias in genes relating to hair--a tissue unique to mammals. Thirdly, Arabidopsis thaliana showed extreme codon bias in genes implicated in cell wall formation and chloroplast function--which are unique to plants. Fourthly, Gallus gallus possessed strong codon bias in a subset of genes encoding mitochondrial proteins--perhaps reflecting the enhanced bioenergetic efficiency in birds that co-evolved with flight. And lastly, the G. gallus genome had extreme codon bias for the Ciliary Neurotrophic Factor--which may help to explain their spontaneous recovery from deafness. We propose that extreme codon bias in groups of genes that encode functionally related proteins has a pathway-level energetic explanation.
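
    The abstract above does not say which codon bias statistic was computed. As an illustration only, the sketch below uses relative synonymous codon usage (RSCU), a common measure chosen here as an assumption and not taken from the paper. The function name `rscu` and the toy sequence are hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds):
    """Relative synonymous codon usage for one in-frame coding sequence.

    RSCU(codon) = observed count / mean count over that amino acid's
    synonymous codons. A value of 1.0 means no bias. Stop codons and
    amino acids that never occur in the sequence are left out.
    """
    cds = cds.upper().replace("U", "T")
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    counts = Counter(c for c in codons if CODON_TABLE.get(c, "*") != "*")

    # Group synonymous codons by the amino acid they encode.
    families = {}
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families.setdefault(aa, []).append(codon)

    result = {}
    for aa, family in families.items():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid not observed in this sequence
        for c in family:
            result[c] = counts[c] * len(family) / total
    return result


if __name__ == "__main__":
    # Toy sequence: four leucine codons, three of them CTG.
    values = rscu("CTGCTGCTGTTA")
    for codon in ("CTG", "TTA", "CTC"):
        print(codon, values[codon])
```

    Any per-gene summary built on such values (for example, comparing genes within or between species) would be a further design choice that the abstract does not specify.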

  5. Systematic determination of the mosaic structure of bacterial genomes: species backbone versus strain-specific loops

    Directory of Open Access Journals (Sweden)

    Gendrault-Jacquemard A

    2005-07-01

    Background: Public databases now contain a multitude of complete bacterial genomes, including several genomes of the same species. The available data offers new opportunities to address questions about bacterial genome evolution, a task that requires reliable fine comparison data of closely related genomes. Recent analyses have shown, using pairwise whole genome alignments, that it is possible to segment bacterial genomes into a common conserved backbone and strain-specific sequences called loops. Results: Here, we generalize this approach and propose a strategy that allows systematic and non-biased genome segmentation based on multiple genome alignments. Segmentation analyses, as applied to 13 different bacterial species, confirmed the feasibility of our approach to discern the 'mosaic' organization of bacterial genomes. Segmentation results are available through a Web interface permitting functional analysis, extraction and visualization of the backbone/loops structure of documented genomes. To illustrate the potential of this approach, we performed a precise analysis of the mosaic organization of three E. coli strains and functional characterization of the loops. Conclusion: The segmentation results including the backbone/loops structure of 13 bacterial species genomes are new and available for use by the scientific community at the URL: http://genome.jouy.inra.fr/mosaic.

  6. A Synthetic Biology Framework for Programming Eukaryotic Transcription Functions

    Science.gov (United States)

    Khalil, Ahmad S.; Lu, Timothy K.; Bashor, Caleb J.; Ramirez, Cherie L.; Pyenson, Nora C.; Joung, J. Keith; Collins, James J.

    2013-01-01

    SUMMARY Eukaryotic transcription factors (TFs) perform complex and combinatorial functions within transcriptional networks. Here, we present a synthetic framework for systematically constructing eukaryotic transcription functions using artificial zinc fingers, modular DNA-binding domains found within many eukaryotic TFs. Utilizing this platform, we construct a library of orthogonal synthetic transcription factors (sTFs) and use these to wire synthetic transcriptional circuits in yeast. We engineer complex functions, such as tunable output strength and transcriptional cooperativity, by rationally adjusting a decomposed set of key component properties, e.g., DNA specificity, affinity, promoter design, protein-protein interactions. We show that subtle perturbations to these properties can transform an individual sTF between distinct roles (activator, cooperative factor, inhibitory factor) within a transcriptional complex, thus drastically altering the signal processing behavior of multi-input systems. This platform provides new genetic components for synthetic biology and enables bottom-up approaches to understanding the design principles of eukaryotic transcriptional complexes and networks. PMID:22863014

  7. Clustered Regularly Interspaced Short Palindromic Repeats (CRISPRi) plasmids | Office of Cancer Genomics

    Science.gov (United States)

    CTD2 researchers at the University of California in San Francisco developed a modified Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR)/dCas9 system. Catalytically inactive dCas9 enables modular and programmable RNA-guided genome regulation in eukaryotes.

  8. Algal genomes reveal evolutionary mosaicism and the fate of nucleomorphs

    Czech Academy of Sciences Publication Activity Database

    Oborník, Miroslav; Kořený, Luděk

    2012-01-01

    Vol. 492, No. 7427 (2012), pp. 59-65. ISSN 0028-0836. Institutional support: RVO:60077344. Keywords: GENE-TRANSFER * BIGELOWIELLA-NATANS * EUKARYOTIC GENOMES * GUILLARDIA-THETA * NUCLEUS * CHLORARACHNIOPHYTE * PROTEINS * SEQUENCE * ORIGIN * CRYPTOPHYTES. Subject RIV: EB - Genetics; Molecular Biology. Impact factor: 38.597, year: 2012. http://www.nature.com/nature/journal/v492/n7427/full/nature11681.html

  9. MIPS: a database for protein sequences and complete genomes.

    Science.gov (United States)

    Mewes, H W; Hani, J; Pfeiffer, F; Frishman, D

    1998-01-01

    The MIPS group [Munich Information Center for Protein Sequences of the German National Center for Environment and Health (GSF)] at the Max-Planck-Institute for Biochemistry, Martinsried near Munich, Germany, is involved in a number of data collection activities, including a comprehensive database of the yeast genome, a database reflecting the progress in sequencing the Arabidopsis thaliana genome, the systematic analysis of other small genomes and the collection of protein sequence data within the framework of the PIR-International Protein Sequence Database (described elsewhere in this volume). Through its WWW server (http://www.mips.biochem.mpg.de) MIPS provides access to a variety of generic databases, including a database of protein families as well as automatically generated data by the systematic application of sequence analysis algorithms. The yeast genome sequence and its related information were also compiled on CD-ROM to provide dynamic interactive access to the 16 chromosomes of the first eukaryotic genome unraveled. PMID:9399795

  10. How MCM loading and spreading specify eukaryotic DNA replication initiation sites [version 1; referees: 4 approved]

    Directory of Open Access Journals (Sweden)

    Olivier Hyrien

    2016-08-01

    DNA replication origins strikingly differ between eukaryotic species and cell types. Origins are localized and can be highly efficient in budding yeast, are randomly located in early fly and frog embryos, which do not transcribe their genomes, and are clustered in broad (10-100 kb) non-transcribed zones, frequently abutting transcribed genes, in mammalian cells. Nonetheless, in all cases, origins are established during the G1-phase of the cell cycle by the loading of double hexamers of the Mcm 2-7 proteins (MCM DHs), the core of the replicative helicase. MCM DH activation in S-phase leads to origin unwinding, polymerase recruitment, and initiation of bidirectional DNA synthesis. Although MCM DHs are initially loaded at sites defined by the binding of the origin recognition complex (ORC), they ultimately bind chromatin in much greater numbers than ORC and only a fraction are activated in any one S-phase. Data suggest that the multiplicity and functional redundancy of MCM DHs provide robustness to the replication process and affect replication time and that MCM DHs can slide along the DNA and spread over large distances around the ORC. Recent studies further show that MCM DHs are displaced along the DNA by collision with transcription complexes but remain functional for initiation after displacement. Therefore, eukaryotic DNA replication relies on intrinsically mobile and flexible origins, a strategy fundamentally different from bacteria but conserved from yeast to human. These properties of MCM DHs likely contribute to the establishment of broad, intergenic replication initiation zones in higher eukaryotes.

  11. High-throughput SHAPE analysis reveals structures in HIV-1 genomic RNA strongly conserved across distinct biological states.

    Directory of Open Access Journals (Sweden)

    Kevin A Wilkinson

    2008-04-01

    Replication and pathogenesis of the human immunodeficiency virus (HIV) is tightly linked to the structure of its RNA genome, but genome structure in infectious virions is poorly understood. We invent high-throughput SHAPE (selective 2'-hydroxyl acylation analyzed by primer extension) technology, which uses many of the same tools as DNA sequencing, to quantify RNA backbone flexibility at single-nucleotide resolution and from which robust structural information can be immediately derived. We analyze the structure of HIV-1 genomic RNA in four biologically instructive states, including the authentic viral genome inside native particles. Remarkably, given the large number of plausible local structures, the first 10% of the HIV-1 genome exists in a single, predominant conformation in all four states. We also discover that noncoding regions functioning in a regulatory role have significantly lower (p-value < 0.0001) SHAPE reactivities, and hence more structure, than do viral coding regions that function as the template for protein synthesis. By directly monitoring protein binding inside virions, we identify the RNA recognition motif for the viral nucleocapsid protein. Seven structurally homologous binding sites occur in a well-defined domain in the genome, consistent with a role in directing specific packaging of genomic RNA into nascent virions. In addition, we identify two distinct motifs that are targets for the duplex destabilizing activity of this same protein. The nucleocapsid protein destabilizes local HIV-1 RNA structure in ways likely to facilitate initial movement both of the retroviral reverse transcriptase from its tRNA primer and of the ribosome in coding regions. Each of the three nucleocapsid interaction motifs falls in a specific genome domain, indicating that local protein interactions can be organized by the long-range architecture of an RNA. High-throughput SHAPE reveals a comprehensive view of HIV-1 RNA genome structure, and further

  12. Evolutionary genomics of miniature inverted-repeat transposable elements (MITEs) in Brassica.

    Science.gov (United States)

    Nouroz, Faisal; Noreen, Shumaila; Heslop-Harrison, J S

    2015-12-01

    Miniature inverted-repeat transposable elements (MITEs) are truncated derivatives of autonomous DNA transposons, and are dispersed abundantly in most eukaryotic genomes. We aimed to characterize various MITE families in Brassica in terms of their presence, sequence characteristics and evolutionary activity. Dot plot analyses involving comparison of homoeologous bacterial artificial chromosome (BAC) sequences allowed identification of 15 novel families of mobile MITEs. Of these, 5 were Stowaway-like with TA Target Site Duplications (TSDs), 4 Tourist-like with TAA/TTA TSDs, 5 Mutator-like with 9-10 bp TSDs and 1 novel MITE (BoXMITE1) flanked by 3 bp TSDs. Our data suggested that there are about 30,000 MITE-related sequences in Brassica rapa and B. oleracea genomes. In situ hybridization showed one abundant family was dispersed in the A-genome, while another was located near 45S rDNA sites. PCR analysis using primers flanking sequences of MITE elements detected MITE insertion polymorphisms between and within the three Brassica (AA, BB, CC) genomes, with many insertions being specific to single genomes and others showing evidence of more recent evolutionary insertions. Our BAC sequence comparison strategy enables identification of evolutionarily active MITEs with no prior knowledge of MITE sequences. The details of MITE families reported in Brassica enable their identification, characterization and annotation. Insertion polymorphisms of MITEs and their transposition activity indicate an important mechanism of genome evolution and diversification. MITE families derived from known Mariner, Harbinger and Mutator DNA transposons were discovered, as well as some novel structures. The identification of Brassica MITEs will have broad applications in Brassica genomics, breeding, hybridization and phylogeny through their use as DNA markers.

  13. Quantitation of base substitutions in eukaryotic 5S rRNA: selection for the maintenance of RNA secondary structure.

    Science.gov (United States)

    Curtiss, W C; Vournakis, J N

    1984-01-01

    Eukaryotic 5S rRNA sequences from 34 diverse species were compared by the following method: (1) The sequences were aligned; (2) the positions of substitutions were located by comparison of all possible pairs of sequences; (3) the substitution sites were mapped to an assumed general base pairing model; and (4) the R-Y model of base stacking was used to study stacking pattern relationships in the structure. An analysis of the sequence and structure variability in each region of the molecule is presented. It was found that the degree of base substitution varies over a wide range, from absolute conservation to occurrence of over 90% of the possible observable substitutions. The substitutions are located primarily in stem regions of the 5S rRNA secondary structure. More than 88% of the substitutions in helical regions maintain base pairing. The disruptive substitutions are primarily located at the edges of helical regions, resulting in shortening of the helical regions and lengthening of the adjacent nonpaired regions. Base stacking patterns determined by the R-Y model are mapped onto the general secondary structure. Intrastrand and interstrand stacking could stabilize alternative coaxial structures and limit the conformational flexibility of nonpaired regions. Two short contiguous regions are 100% conserved in all species. This may reflect evolutionary constraints imposed at the DNA level by the requirement for binding of a 5S gene transcription initiation factor during gene expression.
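
    Steps (2) and (3) of the method described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the toy alignment, the pair list, the function names, and the decision to count G-U wobble pairs as paired are all assumptions.

```python
from itertools import combinations

# Pairs treated as "base paired": Watson-Crick plus G-U wobble (an assumption).
PAIRED = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def substitution_sites(aligned):
    """Alignment columns where at least one pair of sequences differs.

    Gap characters ('-') are ignored, so only true substitutions count.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")
    sites = set()
    for a, b in combinations(aligned, 2):
        for i, (x, y) in enumerate(zip(a, b)):
            if x != y and "-" not in (x, y):
                sites.add(i)
    return sorted(sites)


def pairing_maintained(aligned, pairs):
    """For each model pair (i, j), report whether every sequence keeps it paired."""
    return {
        (i, j): all((seq[i], seq[j]) in PAIRED for seq in aligned)
        for i, j in pairs
    }


if __name__ == "__main__":
    # Hypothetical 6-nt alignment with one helix pair between columns 0 and 5.
    alignment = ["GCAUGC", "GCAUGC", "ACAUGU"]
    model_pairs = [(0, 5)]
    print(substitution_sites(alignment))               # columns that vary
    print(pairing_maintained(alignment, model_pairs))  # compensatory change keeps the pair
```

    In the toy data the G-C pair becomes A-U in the third sequence, so both columns are substitution sites while the pair itself is maintained, which is the kind of compensatory change the abstract quantifies.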

  14. Genome Segregation and Packaging Machinery in Acanthamoeba polyphaga Mimivirus Is Reminiscent of Bacterial Apparatus

    Science.gov (United States)

    Chelikani, Venkata; Ranjan, Tushar; Zade, Amrutraj; Shukla, Avi

    2014-01-01

    ABSTRACT Genome packaging is a critical step in the virion assembly process. The putative ATP-driven genome packaging motor of Acanthamoeba polyphaga mimivirus (APMV) and other nucleocytoplasmic large DNA viruses (NCLDVs) is a distant ortholog of prokaryotic chromosome segregation motors, such as FtsK and HerA, rather than other viral packaging motors, such as large terminase. Intriguingly, APMV also encodes other components, i.e., three putative serine recombinases and a putative type II topoisomerase, all of which are essential for chromosome segregation in prokaryotes. Based on our analyses of these components and taking the limited available literature into account, here we propose for the first time a model for genome segregation and packaging in APMV that can possibly be extended to NCLDV subfamilies, except perhaps Poxviridae and Ascoviridae. This model might represent a unique variation of the prokaryotic system acquired and contrived by the large DNA viruses of eukaryotes. It is also consistent with previous observations that unicellular eukaryotes, such as amoebae, are melting pots for the advent of chimeric organisms with novel mechanisms. IMPORTANCE Extremely large viruses with DNA genomes infect a wide range of eukaryotes, from human beings to amoebae and from crocodiles to algae. These large DNA viruses, unlike their much smaller cousins, have the capability of making most of the protein components required for their multiplication. Once they infect the cell, these viruses set up viral replication centers, known as viral factories, to carry out their multiplication with very little help from the host. Our sequence analyses show that there is remarkable similarity between prokaryotes (bacteria and archaea) and large DNA viruses, such as mimivirus, vaccinia virus, and pandoravirus, in the way that they process their newly synthesized genetic material to make sure that only one copy of the complete genome is generated and is meticulously placed inside

  15. Mathematical model of reproductive death of irradiated eukaryotic cells, which considers saturation of DNA reparation system

    International Nuclear Information System (INIS)

    Knyigavko, V.G.; Ponomarenko, N.S.; Meshcheryakova, O.P.; Protasenya, S.Yu.

    2009-01-01

    A mathematical model of the processes determining reproductive death of the exposed cells was built. The model takes into account the phenomenon of saturation of the system of DNA radiation lesion reparation and structural functional peculiarities of chromatin structure in eukaryotes. The problem of assessment of the model parameters using experimental data was discussed.

  16. Recognizing genes and other components of genomic structure

    Energy Technology Data Exchange (ETDEWEB)

    Burks, C. (Los Alamos National Lab., NM (USA)); Myers, E. (Arizona Univ., Tucson, AZ (USA). Dept. of Computer Science); Stormo, G.D. (Colorado Univ., Boulder, CO (USA). Dept. of Molecular, Cellular and Developmental Biology)

    1991-01-01

    The Aspen Center for Physics (ACP) sponsored a three-week workshop, with 26 scientists participating, from 28 May to 15 June, 1990. The workshop, entitled Recognizing Genes and Other Components of Genomic Structure, focussed on discussion of current needs and future strategies for developing the ability to identify and predict the presence of complex functional units on sequenced, but otherwise uncharacterized, genomic DNA. We addressed the need for computationally-based, automatic tools for synthesizing available data about individual consensus sequences and local compositional patterns into the composite objects (e.g., genes) that are -- as composite entities -- the true object of interest when scanning DNA sequences. The workshop was structured to promote sustained informal contact and exchange of expertise between molecular biologists, computer scientists, and mathematicians. No participant stayed for less than one week, and most attended for two or three weeks. Computers, software, and databases were available for use as "electronic blackboards" and as the basis for collaborative exploration of ideas being discussed and developed at the workshop. 23 refs., 2 tabs.

  17. Genome Size Diversity and Its Impact on the Evolution of Land Plants

    Directory of Open Access Journals (Sweden)

    Jaume Pellicer

    2018-02-01

    Genome size is a biodiversity trait that shows staggering diversity across eukaryotes, varying over 64,000-fold. Of all major taxonomic groups, land plants stand out due to their staggering genome size diversity, ranging ca. 2400-fold. As our understanding of the implications and significance of this remarkable genome size diversity in land plants grows, it is becoming increasingly evident that this trait plays not only an important role in shaping the evolution of plant genomes, but also in influencing plant community assemblages at the ecosystem level. Recent advances and improvements in novel sequencing technologies, as well as analytical tools, make it possible to gain critical insights into the genomic and epigenetic mechanisms underpinning genome size changes. In this review we provide an overview of our current understanding of genome size diversity across the different land plant groups, its implications on the biology of the genome and what future directions need to be addressed to fill key knowledge gaps.

  18. Recent advances in the genome-wide study of DNA replication origins in yeast

    Directory of Open Access Journals (Sweden)

    Chong Peng

    2015-02-01

    DNA replication, one of the central events in the cell cycle, is the basis of biological inheritance. In order to be duplicated, a DNA double helix must be opened at defined sites, which are called DNA replication origins (ORIs). Unlike in bacteria, where replication initiates from a single replication origin, multiple origins are utilized in the eukaryotic genome. Among them, the ORIs in budding yeast Saccharomyces cerevisiae and the fission yeast Schizosaccharomyces pombe have been best characterized. In recent years, advances in DNA microarray and next-generation sequencing technologies have increased the number of yeast species involved in ORI research dramatically. The ORIs in some nonconventional yeast species such as Kluyveromyces lactis and Pichia pastoris have also been identified genome-wide. Relevant databases of replication origins in yeast have been constructed, so comparative genomic analysis can now be carried out. Here, we review several experimental approaches that have been used to map replication origins in yeast and some of the available web resources related to yeast ORIs. We also discuss the sequence characteristics and chromosome structures of ORIs in the four yeast species, which can be utilized to improve replication origin prediction.

  19. Recent advances in the genome-wide study of DNA replication origins in yeast

    Science.gov (United States)

    Peng, Chong; Luo, Hao; Zhang, Xi; Gao, Feng

    2015-01-01

    DNA replication, one of the central events in the cell cycle, is the basis of biological inheritance. In order to be duplicated, a DNA double helix must be opened at defined sites, which are called DNA replication origins (ORIs). Unlike in bacteria, where replication initiates from a single replication origin, multiple origins are utilized in eukaryotic genomes. Among them, the ORIs in the budding yeast Saccharomyces cerevisiae and the fission yeast Schizosaccharomyces pombe have been best characterized. In recent years, advances in DNA microarray and next-generation sequencing technologies have dramatically increased the number of yeast species involved in ORI research. The ORIs in some non-conventional yeast species such as Kluyveromyces lactis and Pichia pastoris have also been identified genome-wide. Relevant databases of replication origins in yeast have been constructed, so that comparative genomic analysis can be carried out. Here, we review several experimental approaches that have been used to map replication origins in yeast and some of the available web resources related to yeast ORIs. We also discuss the sequence characteristics and chromosome structures of ORIs in the four yeast species, which can be utilized to improve the prediction of yeast replication origins. PMID:25745419

  20. KGCAK: a K-mer based database for genome-wide phylogeny and complexity evaluation.

    Science.gov (United States)

    Wang, Dapeng; Xu, Jiayue; Yu, Jun

    2015-09-16

    The K-mer approach, which treats genomic sequences as simple character strings and counts the relative abundance of each string for a fixed K, has been extensively applied to phylogeny inference and to genome assembly, annotation, and comparison. To meet increasing demands for comparing large genome sequences and to promote the use of the K-mer approach, we developed a versatile database, KGCAK (http://kgcak.big.ac.cn/KGCAK/), containing ~8,000 genomes that include genome sequences of diverse life forms (viruses, prokaryotes, protists, animals, and plants) and cellular organelles of eukaryotic lineages. It builds phylogeny based on genomic elements in an alignment-free fashion and provides in-depth data processing, enabling users to compare the complexity of genome sequences based on K-mer distribution. We hope that KGCAK becomes a powerful tool for exploring relationships within and among groups of species in a tree of life based on genomic data.
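
    The K-mer counting idea described in this abstract can be illustrated with a minimal Python sketch. It is not KGCAK's implementation: the function names are hypothetical, and the Euclidean distance between frequency profiles is an assumed stand-in, since the abstract does not say which alignment-free measure the database uses.

```python
from collections import Counter
from math import sqrt


def kmer_frequencies(seq, k):
    """Return the relative abundance of every overlapping k-mer in seq."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    if total == 0:  # sequence shorter than k: no k-mers to count
        return {}
    return {kmer: n / total for kmer, n in counts.items()}


def kmer_distance(seq_a, seq_b, k):
    """Euclidean distance between two k-mer frequency profiles.

    Assumption: chosen only for illustration; other alignment-free
    measures could be substituted here.
    """
    fa = kmer_frequencies(seq_a, k)
    fb = kmer_frequencies(seq_b, k)
    kmers = set(fa) | set(fb)
    return sqrt(sum((fa.get(m, 0.0) - fb.get(m, 0.0)) ** 2 for m in kmers))
```

    A matrix of such pairwise distances could then be passed to any standard tree-building method; that step is omitted here.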

  1. HIV-1 Replication and the Cellular Eukaryotic Translation Apparatus

    Directory of Open Access Journals (Sweden)

    Santiago Guerrero

    2015-01-01

    Eukaryotic translation is a complex process composed of three main steps: initiation, elongation, and termination. During infections by RNA and DNA viruses, the eukaryotic translation machinery is used to assure optimal viral protein synthesis. Human immunodeficiency virus type 1 (HIV-1) uses several non-canonical pathways to translate its own proteins, such as leaky scanning, frameshifting, shunt, and cap-independent mechanisms. Moreover, HIV-1 modulates the host translation machinery by targeting key translation factors and overcomes different cellular obstacles that affect protein translation. In this review, we describe how HIV-1 proteins target several components of the eukaryotic translation machinery, which consequently improves viral translation and replication.

  2. Next-Generation Genomics Facility at C-CAMP: Accelerating Genomic Research in India

    Science.gov (United States)

    S, Chandana; Russiachand, Heikham; H, Pradeep; S, Shilpa; M, Ashwini; S, Sahana; B, Jayanth; Atla, Goutham; Jain, Smita; Arunkumar, Nandini; Gowda, Malali

    2014-01-01

    Next-Generation Sequencing (NGS; http://www.genome.gov/12513162) is a recent technological revolution in the life sciences that allows scientists to decode genomes or transcriptomes at a much faster rate and a lower cost. Genomics-based studies are progressing relatively slowly in India due to the non-availability of genomics experts, trained personnel and dedicated service providers. NGS offers a lot of potential to study India's national diversity (of all kinds). We at the Centre for Cellular and Molecular Platforms (C-CAMP) have launched the Next Generation Genomics Facility (NGGF) to provide genomics services to scientists, to train researchers and also to work on national and international genomic projects. We have a HiSeq1000 from Illumina and a GS-FLX Plus from Roche454. The long reads from the GS FLX Plus and the high sequence depth from the HiSeq1000 make an ideal hybrid approach for de novo sequencing and re-sequencing of genomes and transcriptomes. At our facility, we have sequenced around 70 different organisms comprising more than 388 genomes and 615 transcriptomes, spanning prokaryotes and eukaryotes (fungi, plants and animals). In addition, we have optimized other unique applications such as small RNA (miRNA, siRNA etc.), long mate-pair sequencing (2 to 20 kb), coding sequences (exome), methylome (ChIP-Seq), restriction mapping (RAD-Seq), Human Leukocyte Antigen (HLA) typing, mixed genomes (metagenomes) and target amplicons. Translating DNA sequence data from an NGS sequencer into meaningful information is an important exercise. Under NGGF, we have bioinformatics experts and high-end computing resources to analyze NGS data, covering genome assembly and annotation, gene expression, target enrichment, variant calling (SSR or SNP), comparative analysis etc. Our services (sequencing and bioinformatics) have been utilized by more than 45 organizations (academia and industry) both within India and outside, resulting in several publications in peer-reviewed journals and several genomic

  3. Spectral entropy criteria for structural segmentation in genomic DNA sequences

    International Nuclear Information System (INIS)

    Chechetkin, V.R.; Lobzin, V.V.

    2004-01-01

    The spectral entropy is calculated with Fourier structure factors and characterizes the level of structural ordering in a sequence of symbols. It may efficiently be applied to the assessment and reconstruction of the modular structure in genomic DNA sequences. We present the relevant spectral entropy criteria for local and non-local structural segmentation in DNA sequences. The results are illustrated with model examples and with the analysis of intervening exon-intron segments in the protein-coding regions.
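
    As a rough illustration of the quantity named in this abstract, the sketch below computes a spectral entropy for a short DNA string in Python. It assumes one common definition (the Shannon entropy of the normalized Fourier power spectrum, summed over the four nucleotide indicator sequences); the exact normalization and segmentation criteria used by the authors are not given in the abstract and may differ. The function names are hypothetical.

```python
import cmath
import math


def structure_factors(seq, base):
    """Squared DFT amplitudes (divided by M) of the 0/1 indicator
    sequence for one nucleotide, at harmonics n = 1 .. M-1."""
    m = len(seq)
    indicator = [1.0 if c == base else 0.0 for c in seq.upper()]
    factors = []
    for n in range(1, m):  # n = 0 only reflects base composition, so it is skipped
        q = 2 * math.pi * n / m
        rho = sum(x * cmath.exp(-1j * q * pos) for pos, x in enumerate(indicator))
        factors.append(abs(rho) ** 2 / m)
    return factors


def spectral_entropy(seq):
    """Shannon entropy of the normalized total power spectrum.

    Assumed definition: low values mean power concentrated in a few
    harmonics (strong periodic ordering), high values mean a flat spectrum.
    """
    per_base = [structure_factors(seq, b) for b in "ACGT"]
    total = [sum(vals) for vals in zip(*per_base)]
    norm = sum(total)
    if norm < 1e-12:  # homogeneous or too-short sequence: no spectral power
        return 0.0
    return -sum((f / norm) * math.log(f / norm) for f in total if f > 0)
```

    For a perfectly periodic sequence such as "ACGT" repeated four times, the power falls on three harmonics only, so this definition gives ln 3, well below the maximum of ln(M-1) for a flat spectrum. This naive transform costs O(M^2) and is meant for short examples only.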

  4. Eukaryotic community diversity and spatial variation during drinking water production (by seawater desalination) and distribution in a full-scale network

    KAUST Repository

    Belila, Abdelaziz

    2016-12-01

    Eukaryotic microorganisms are naturally present in many water resources and can enter, grow and colonize water treatment and transport systems, including reservoirs, pipes and premise plumbing. In this study, we explored the eukaryotic microbial community structure in water during the (i) production of drinking water in a seawater desalination plant and (ii) transport of the drinking water in the distribution network. The desalination plant treatment involved pre-treatment (e.g. spruce filters), reverse osmosis (RO) membrane filtration and post-treatment steps (e.g. remineralization). 454 pyrosequencing analysis of the 18S rRNA gene revealed a highly diverse (35 phyla) and spatially variable eukaryotic community during water treatment and distribution. The desalination plant feed water contained a typical marine picoeukaryotic community dominated by Stramenopiles, Alveolates and Porifera. In the desalination plant Ascomycota was the most dominant phylum (15.5% relative abundance), followed by Alveolata (11.9%), unclassified fungi clade (10.9%) and Porifera (10.7%). In the drinking water distribution network, an uncultured fungi phylum was the major group (44.0%), followed by Chordata (17.0%), Ascomycota (11.0%) and Arthropoda (8.0%). Fungi constituted 40% of the total eukaryotic community in the treatment plant and the distribution network and their taxonomic composition was dominated by an uncultured fungi clade (55%). Comparing the plant effluent to the network samples, 84 OTUs (2.1%) formed the core eukaryotic community while 35 (8.4%) and 299 (71.5%) constituted unique OTUs in the produced water at the plant and combined tap water samples from the network, respectively. RO membrane filtration treatment significantly changed the water eukaryotic community composition and structure, highlighting the fact that (i) RO produced water is not sterile and (ii) the microbial community in the final tap water is influenced by the downstream distribution system. 
The study

  5. Homology-based annotation of non-coding RNAs in the genomes of Schistosoma mansoni and Schistosoma japonicum

    Directory of Open Access Journals (Sweden)

    Santana Clara

    2009-10-01

    Background: Schistosomes are trematode parasites of the phylum Platyhelminthes. They are considered the most important of the human helminth parasites in terms of morbidity and mortality. Draft genome sequences are now available for Schistosoma mansoni and Schistosoma japonicum. Non-coding RNA (ncRNA) plays a crucial role in gene expression regulation, cellular function and defense, homeostasis, and pathogenesis. The genome-wide annotation of ncRNAs is a non-trivial task unless well-annotated genomes of closely related species are already available. Results: A homology search for structured ncRNA in the genome of S. mansoni resulted in 23 types of ncRNAs with conserved primary and secondary structure. Among these, we identified rRNA, snRNA, SL RNA, SRP, tRNAs and RNase P, and also possibly MRP and 7SK RNAs. In addition, we confirmed five miRNAs that have recently been reported in S. japonicum and found two additional homologs of known miRNAs. The tRNA complement of S. mansoni is comparable to that of the free-living planarian Schmidtea mediterranea, although for some amino acids differences of more than a factor of two are observed: Leu, Ser, and His are overrepresented, while Cys, Met, and Ile are underrepresented in S. mansoni. On the other hand, the number of tRNAs in the genome of S. japonicum is reduced by more than a factor of four. Both schistosomes have a complete set of minor spliceosomal snRNAs. Several ncRNAs that are expected to exist in the S. mansoni genome were not found, among them the telomerase RNA, vault RNAs, and Y RNAs. Conclusion: The ncRNA sequences and structures presented here represent the most complete dataset of ncRNA from any lophotrochozoan reported so far. This data set provides an important reference for further analysis of the genomes of schistosomes and indeed eukaryotic genomes at large.

  6. Detection of RNA structures in porcine EST data and related mammals

    DEFF Research Database (Denmark)

    Seemann, Ernst Stefan; Gilchrist, Michael J.; Hofacker, Ivo L.

    2007-01-01

    BACKGROUND: Non-coding RNAs (ncRNAs) are involved in a wide spectrum of regulatory functions. Within recent years, there have been increasing reports of observed polyadenylated ncRNAs and mRNA-like ncRNAs in eukaryotes. To investigate this further, we examined the large data set in the Sino... ...% porcine coding transcripts (of 18,600 identified) as well as less than one-third ORF-free transcripts are conserved at least in the closely related bovine genome. Approximately one percent of the coding and 10% of the remaining matches are unique between the PigEST data and cow genome. Based on the pig-cow alignments, we searched for similarities to 16 other organisms by UCSC available alignments, which resulted, for instance, in 87% coverage by the human genome. CONCLUSION: Besides recovering several of the already annotated functional RNA structures, we predicted a large number of high confidence conserved...

  7. The genome of the extremophile crucifer Thellungiella parvula

    KAUST Repository

    Dassanayake, Maheshi

    2011-08-07

    Thellungiella parvula is related to Arabidopsis thaliana and is endemic to saline, resource-poor habitats, making it a model for the evolution of plant adaptation to extreme environments. Here we present the draft genome for this extremophile species. Exclusively by next generation sequencing, we obtained the de novo assembled genome in 1,496 gap-free contigs, closely approximating the estimated genome size of 140 Mb. We anchored these contigs to seven pseudo chromosomes without the use of maps. We show that short reads can be assembled to a near-complete chromosome level for a eukaryotic species lacking prior genetic information. The sequence identifies a number of tandem duplications that, by the nature of the duplicated genes, suggest a possible basis for T. parvula's extremophile lifestyle. Our results provide essential background for developing genomically influenced testable hypotheses for the evolution of environmental stress tolerance. © 2011 Nature America, Inc. All rights reserved.

  8. Remembrance of things past retrieved from the Paramecium genome.

    Science.gov (United States)