WorldWideScience

Sample records for genome joho kaidoku

  1. Technologies and techniques for analysis and use of genome information, 1997; Genome joho kaidoku riyo gijutsu no chosa kenkyu

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1998-03-01

    The paper clarified the whole image of cell functions by elucidating the function and manifestation control mechanism of genes existing in genomes, and the network of their interactions, and surveyed applicability of the useful functions obtained of cells and proteins to the industrial field. The survey was made from a viewpoint of the fields of both biology and information science. Especially, based on the function-known DNA base sequence database, the following technologies were surveyed: technology to predict the function of the function-unknown DNA base sequence, search/separation technology to acquire the genes to be functionally elucidated in a state of being suitable for manifestation, technology to get perfect proteins by effectively manifesting the genes to be functionally elucidated, and technology to analyze the function of the proteins obtained by manifestation of genes. Further, the International Symposium was held which is titled `Genome Research Opens a New World to Bioindustry (New Developments in Genome Informatics Technologies). With the future progress of technology to decipher and use genome information, the construction of much newer genome industry is anticipated. 165 refs., 44 figs., 10 tabs.

  2. Research study on analysis/use technologies of genome information; Genome joho kaidoku riyo gijutsu no chosa kenkyu

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1997-03-01

    For wide use of genome information in the industrial field, the required R and D was surveyed from the standpoints of biology and information science. To clarify the present state and issues of the international research on genome analysis, the genome map as well as sequence and function information are first surveyed. The current analysis/use technologies of genome information are analyzed, and the following are summarized: prediction and identification of gene regions in genome sequences, techniques for searching and selecting useful genes, and techniques for predicting the expression of gene functions and the gene-product structure and functions. It is recommended that R and D and data collection/interpretation necessary to clarify inter-gene interactions and information networks should be promoted by integrating Japanese advanced know-how and technologies. As examples of the impact of the research results on industry and society, the present state and future expected effect are summarized for medicines, diagnosis/analysis instruments, chemicals, foods, agriculture, fishery, animal husbandry, electronics, environment and information. 278 refs., 42 figs., 5 tabs.

  3. Genomes

    National Research Council Canada - National Science Library

    Brown, T. A. (Terence A.)

    2002-01-01

    ... of genome expression and replication processes, and transcriptomics and proteomics. This text is richly illustrated with clear, easy-to-follow, full color diagrams, which are downloadable from the book's website...

  4. Extreme genomes

    OpenAIRE

    DeLong, Edward F

    2000-01-01

    The complete genome sequence of Thermoplasma acidophilum, an acid- and heat-loving archaeon, has recently been reported. Comparative genomic analysis of this 'extremophile' is providing new insights into the metabolic machinery, ecology and evolution of thermophilic archaea.

  5. Grass genomes

    OpenAIRE

    Bennetzen, Jeffrey L.; SanMiguel, Phillip; Chen, Mingsheng; Tikhonov, Alexander; Francki, Michael; Avramova, Zoya

    1998-01-01

    For the most part, studies of grass genome structure have been limited to the generation of whole-genome genetic maps or the fine structure and sequence analysis of single genes or gene clusters. We have investigated large contiguous segments of the genomes of maize, sorghum, and rice, primarily focusing on intergenic spaces. Our data indicate that much (>50%) of the maize genome is composed of interspersed repetitive DNAs, primarily nested retrotransposons that in...

  6. Cancer genomics

    DEFF Research Database (Denmark)

    Norrild, Bodil; Guldberg, Per; Ralfkiær, Elisabeth Methner

    2007-01-01

    Almost all cells in the human body contain a complete copy of the genome with an estimated number of 25,000 genes. The sequences of these genes make up about three percent of the genome and comprise the inherited set of genetic information. The genome also contains information that determines whe...

  7. Genome Imprinting

    Indian Academy of Sciences (India)

    the cell nucleus (mitochondrial and chloroplast genomes), and. (3) traits governed ... tively good embryonic development but very poor development of membranes and ... Human homologies for the type of situation described above are naturally ..... imprint; (b) New modifications of the paternal genome in germ cells of each ...

  8. Baculovirus Genomics

    NARCIS (Netherlands)

    Oers, van M.M.; Vlak, J.M.

    2007-01-01

    Baculovirus genomes are covalently closed circles of double stranded-DNA varying in size between 80 and 180 kilobase-pair. The genomes of more than fourty-one baculoviruses have been sequenced to date. The majority of these (37) are pathogenic to lepidopteran hosts; three infect sawflies

  9. Genomic Testing

    Science.gov (United States)

    ... this database. Top of Page Evaluation of Genomic Applications in Practice and Prevention (EGAPP™) In 2004, the Centers for Disease Control and Prevention launched the EGAPP initiative to establish and test a ... and other applications of genomic technology that are in transition from ...

  10. Ancient genomes

    OpenAIRE

    Hoelzel, A Rus

    2005-01-01

    Ever since its invention, the polymerase chain reaction has been the method of choice for work with ancient DNA. In an application of modern genomic methods to material from the Pleistocene, a recent study has instead undertaken to clone and sequence a portion of the ancient genome of the cave bear.

  11. Herbarium genomics

    DEFF Research Database (Denmark)

    Bakker, Freek T.; Lei, Di; Yu, Jiaying

    2016-01-01

    Herbarium genomics is proving promising as next-generation sequencing approaches are well suited to deal with the usually fragmented nature of archival DNA. We show that routine assembly of partial plastome sequences from herbarium specimens is feasible, from total DNA extracts and with specimens...... up to 146 years old. We use genome skimming and an automated assembly pipeline, Iterative Organelle Genome Assembly, that assembles paired-end reads into a series of candidate assemblies, the best one of which is selected based on likelihood estimation. We used 93 specimens from 12 different...... correlation between plastome coverage and nuclear genome size (C value) in our samples, but the range of C values included is limited. Finally, we conclude that routine plastome sequencing from herbarium specimens is feasible and cost-effective (compared with Sanger sequencing or plastome...

  12. Cephalopod genomics

    DEFF Research Database (Denmark)

    Albertin, Caroline B.; Bonnaud, Laure; Brown, C. Titus

    2012-01-01

    The Cephalopod Sequencing Consortium (CephSeq Consortium) was established at a NESCent Catalysis Group Meeting, ``Paths to Cephalopod Genomics-Strategies, Choices, Organization,'' held in Durham, North Carolina, USA on May 24-27, 2012. Twenty-eight participants representing nine countries (Austria......, Australia, China, Denmark, France, Italy, Japan, Spain and the USA) met to address the pressing need for genome sequencing of cephalopod mollusks. This group, drawn from cephalopod biologists, neuroscientists, developmental and evolutionary biologists, materials scientists, bioinformaticians and researchers...... active in sequencing, assembling and annotating genomes, agreed on a set of cephalopod species of particular importance for initial sequencing and developed strategies and an organization (CephSeq Consortium) to promote this sequencing. The conclusions and recommendations of this meeting are described...

  13. Genome Sequencing

    DEFF Research Database (Denmark)

    Sato, Shusei; Andersen, Stig Uggerhøj

    2014-01-01

    The current Lotus japonicus reference genome sequence is based on a hybrid assembly of Sanger TAC/BAC, Sanger shotgun and Illumina shotgun sequencing data generated from the Miyakojima-MG20 accession. It covers nearly all expressed L. japonicus genes and has been annotated mainly based on transcr......The current Lotus japonicus reference genome sequence is based on a hybrid assembly of Sanger TAC/BAC, Sanger shotgun and Illumina shotgun sequencing data generated from the Miyakojima-MG20 accession. It covers nearly all expressed L. japonicus genes and has been annotated mainly based...

  14. Comparative Genomics

    Indian Academy of Sciences (India)

    Home; Journals; Resonance – Journal of Science Education; Volume 11; Issue 8. Comparative Genomics - A Powerful New Tool in Biology. Anand K Bachhawat. General Article Volume 11 Issue 8 August 2006 pp 22-40. Fulltext. Click here to view fulltext PDF. Permanent link:

  15. Fiscal 1998 research achievement report. Research and development for acceleration of biological resource information infrastructure construction; 1998 nendo seibutsu shigen joho kiban seibi kasokuka kenkyu kaihatsu seika hokokusho

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    2000-03-01

    To fulfill the above-mentioned purpose, research and development of technologies were carried out involving genetic information measurement, genetic information analysis and utilization, analysis and utilization of information on proteins, and the measurement of biomolecular interaction. In a chapter 'Research and development of genetic information analysis using next-generation genome chip,' gene amplification is effected using RNA-DNA molecules and a method for efficiently building UPD (unilateral protruding DNA) was developed. In a chapter 'Development of international standard DNA chip and analyzing system for analysis of massive gene expression,' a relational database program for activating a system for gene preservation, purification, multiple injection, and management was completed. In a chapter '3D-1D modelling of all the proteins coded on genomes,' this fiscal year, analyses were conducted using Escherichia coli and Bacillus subtilis as model microbes. In a chapter 'Analysis of gene expression clustering,' a software program was developed for measuring and analyzing the budding yeast gene expression profile. (NEDO)

  16. Fiscal 1998 research achievement report. Research and development for acceleration of biological resource information infrastructure construction; 1998 nendo seibutsu shigen joho kiban seibi kasokuka kenkyu kaihatsu seika hokokusho

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    2000-03-01

    To fulfill the above-mentioned purpose, research and development of technologies were carried out involving genetic information measurement, genetic information analysis and utilization, analysis and utilization of information on proteins, and the measurement of biomolecular interaction. In a chapter 'Research and development of genetic information analysis using next-generation genome chip,' gene amplification is effected using RNA-DNA molecules and a method for efficiently building UPD (unilateral protruding DNA) was developed. In a chapter 'Development of international standard DNA chip and analyzing system for analysis of massive gene expression,' a relational database program for activating a system for gene preservation, purification, multiple injection, and management was completed. In a chapter '3D-1D modelling of all the proteins coded on genomes,' this fiscal year, analyses were conducted using Escherichia coli and Bacillus subtilis as model microbes. In a chapter 'Analysis of gene expression clustering,' a software program was developed for measuring and analyzing the budding yeast gene expression profile. (NEDO)

  17. Personal genomics services: whose genomes?

    Science.gov (United States)

    Gurwitz, David; Bregman-Eschet, Yael

    2009-07-01

    New companies offering personal whole-genome information services over the internet are dynamic and highly visible players in the personal genomics field. For fees currently ranging from US$399 to US$2500 and a vial of saliva, individuals can now purchase online access to their individual genetic information regarding susceptibility to a range of chronic diseases and phenotypic traits based on a genome-wide SNP scan. Most of the companies offering such services are based in the United States, but their clients may come from nearly anywhere in the world. Although the scientific validity, clinical utility and potential future implications of such services are being hotly debated, several ethical and regulatory questions related to direct-to-consumer (DTC) marketing strategies of genetic tests have not yet received sufficient attention. For example, how can we minimize the risk of unauthorized third parties from submitting other people's DNA for testing? Another pressing question concerns the ownership of (genotypic and phenotypic) information, as well as the unclear legal status of customers regarding their own personal information. Current legislation in the US and Europe falls short of providing clear answers to these questions. Until the regulation of personal genomics services catches up with the technology, we call upon commercial providers to self-regulate and coordinate their activities to minimize potential risks to individual privacy. We also point out some specific steps, along the trustee model, that providers of DTC personal genomics services as well as regulators and policy makers could consider for addressing some of the concerns raised below.

  18. Visualization for genomics: the Microbial Genome Viewer.

    NARCIS (Netherlands)

    Kerkhoven, R.; Enckevort, F.H.J. van; Boekhorst, J.; Molenaar, D; Siezen, R.J.

    2004-01-01

    SUMMARY: A Web-based visualization tool, the Microbial Genome Viewer, is presented that allows the user to combine complex genomic data in a highly interactive way. This Web tool enables the interactive generation of chromosome wheels and linear genome maps from genome annotation data stored in a

  19. Ancient genomics

    DEFF Research Database (Denmark)

    Der Sarkissian, Clio; Allentoft, Morten Erik; Avila Arcos, Maria del Carmen

    2015-01-01

    throughput of next generation sequencing platforms and the ability to target short and degraded DNA molecules. Many ancient specimens previously unsuitable for DNA analyses because of extensive degradation can now successfully be used as source materials. Additionally, the analytical power obtained...... by increasing the number of sequence reads to billions effectively means that contamination issues that have haunted aDNA research for decades, particularly in human studies, can now be efficiently and confidently quantified. At present, whole genomes have been sequenced from ancient anatomically modern humans...

  20. Marine genomics

    DEFF Research Database (Denmark)

    Oliveira Ribeiro, Ângela Maria; Foote, Andrew David; Kupczok, Anne

    2017-01-01

    Marine ecosystems occupy 71% of the surface of our planet, yet we know little about their diversity. Although the inventory of species is continually increasing, as registered by the Census of Marine Life program, only about 10% of the estimated two million marine species are known. This lag......-throughput sequencing approaches have been helping to improve our knowledge of marine biodiversity, from the rich microbial biota that forms the base of the tree of life to a wealth of plant and animal species. In this review, we present an overview of the applications of genomics to the study of marine life, from...

  1. Ensembl Genomes 2016: more genomes, more complexity.

    Science.gov (United States)

    Kersey, Paul Julian; Allen, James E; Armean, Irina; Boddu, Sanjay; Bolt, Bruce J; Carvalho-Silva, Denise; Christensen, Mikkel; Davis, Paul; Falin, Lee J; Grabmueller, Christoph; Humphrey, Jay; Kerhornou, Arnaud; Khobova, Julia; Aranganathan, Naveen K; Langridge, Nicholas; Lowy, Ernesto; McDowall, Mark D; Maheswari, Uma; Nuhn, Michael; Ong, Chuang Kee; Overduin, Bert; Paulini, Michael; Pedro, Helder; Perry, Emily; Spudich, Giulietta; Tapanari, Electra; Walts, Brandon; Williams, Gareth; Tello-Ruiz, Marcela; Stein, Joshua; Wei, Sharon; Ware, Doreen; Bolser, Daniel M; Howe, Kevin L; Kulesha, Eugene; Lawson, Daniel; Maslen, Gareth; Staines, Daniel M

    2016-01-04

    Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species, complementing the resources for vertebrate genomics developed in the context of the Ensembl project (http://www.ensembl.org). Together, the two resources provide a consistent set of programmatic and interactive interfaces to a rich range of data including reference sequence, gene models, transcriptional data, genetic variation and comparative analysis. This paper provides an update to the previous publications about the resource, with a focus on recent developments. These include the development of new analyses and views to represent polyploid genomes (of which bread wheat is the primary exemplar); and the continued up-scaling of the resource, which now includes over 23 000 bacterial genomes, 400 fungal genomes and 100 protist genomes, in addition to 55 genomes from invertebrate metazoa and 39 genomes from plants. This dramatic increase in the number of included genomes is one part of a broader effort to automate the integration of archival data (genome sequence, but also associated RNA sequence data and variant calls) within the context of reference genomes and make it available through the Ensembl user interfaces. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  2. Rodent malaria parasites : genome organization & comparative genomics

    NARCIS (Netherlands)

    Kooij, Taco W.A.

    2006-01-01

    The aim of the studies described in this thesis was to investigate the genome organization of rodent malaria parasites (RMPs) and compare the organization and gene content of the genomes of RMPs and the human malaria parasite P. falciparum. The release of the complete genome sequence of P.

  3. Funding Opportunity: Genomic Data Centers

    Science.gov (United States)

    Funding Opportunity CCG, Funding Opportunity Center for Cancer Genomics, CCG, Center for Cancer Genomics, CCG RFA, Center for cancer genomics rfa, genomic data analysis network, genomic data analysis network centers,

  4. Exploring Other Genomes: Bacteria.

    Science.gov (United States)

    Flannery, Maura C.

    2001-01-01

    Points out the importance of genomes other than the human genome project and provides information on the identified bacterial genomes Pseudomonas aeuroginosa, Leprosy, Cholera, Meningitis, Tuberculosis, Bubonic Plague, and plant pathogens. Considers the computer's use in genome studies. (Contains 14 references.) (YDS)

  5. Genomics With Cloud Computing

    OpenAIRE

    Sukhamrit Kaur; Sandeep Kaur

    2015-01-01

    Abstract Genomics is study of genome which provides large amount of data for which large storage and computation power is needed. These issues are solved by cloud computing that provides various cloud platforms for genomics. These platforms provides many services to user like easy access to data easy sharing and transfer providing storage in hundreds of terabytes more computational power. Some cloud platforms are Google genomics DNAnexus and Globus genomics. Various features of cloud computin...

  6. Genome Maps, a new generation genome browser.

    Science.gov (United States)

    Medina, Ignacio; Salavert, Francisco; Sanchez, Rubén; de Maria, Alejandro; Alonso, Roberto; Escobar, Pablo; Bleda, Marta; Dopazo, Joaquín

    2013-07-01

    Genome browsers have gained importance as more genomes and related genomic information become available. However, the increase of information brought about by new generation sequencing technologies is, at the same time, causing a subtle but continuous decrease in the efficiency of conventional genome browsers. Here, we present Genome Maps, a genome browser that implements an innovative model of data transfer and management. The program uses highly efficient technologies from the new HTML5 standard, such as scalable vector graphics, that optimize workloads at both server and client sides and ensure future scalability. Thus, data management and representation are entirely carried out by the browser, without the need of any Java Applet, Flash or other plug-in technology installation. Relevant biological data on genes, transcripts, exons, regulatory features, single-nucleotide polymorphisms, karyotype and so forth, are imported from web services and are available as tracks. In addition, several DAS servers are already included in Genome Maps. As a novelty, this web-based genome browser allows the local upload of huge genomic data files (e.g. VCF or BAM) that can be dynamically visualized in real time at the client side, thus facilitating the management of medical data affected by privacy restrictions. Finally, Genome Maps can easily be integrated in any web application by including only a few lines of code. Genome Maps is an open source collaborative initiative available in the GitHub repository (https://github.com/compbio-bigdata-viz/genome-maps). Genome Maps is available at: http://www.genomemaps.org.

  7. JGI Fungal Genomics Program

    Energy Technology Data Exchange (ETDEWEB)

    Grigoriev, Igor V.

    2011-03-14

    Genomes of energy and environment fungi are in focus of the Fungal Genomic Program at the US Department of Energy Joint Genome Institute (JGI). Its key project, the Genomics Encyclopedia of Fungi, targets fungi related to plant health (symbionts, pathogens, and biocontrol agents) and biorefinery processes (cellulose degradation, sugar fermentation, industrial hosts), and explores fungal diversity by means of genome sequencing and analysis. Over 50 fungal genomes have been sequenced by JGI to date and released through MycoCosm (www.jgi.doe.gov/fungi), a fungal web-portal, which integrates sequence and functional data with genome analysis tools for user community. Sequence analysis supported by functional genomics leads to developing parts list for complex systems ranging from ecosystems of biofuel crops to biorefineries. Recent examples of such 'parts' suggested by comparative genomics and functional analysis in these areas are presented here

  8. Genomic Encyclopedia of Fungi

    Energy Technology Data Exchange (ETDEWEB)

    Grigoriev, Igor

    2012-08-10

    Genomes of fungi relevant to energy and environment are in focus of the Fungal Genomic Program at the US Department of Energy Joint Genome Institute (JGI). Its key project, the Genomics Encyclopedia of Fungi, targets fungi related to plant health (symbionts, pathogens, and biocontrol agents) and biorefinery processes (cellulose degradation, sugar fermentation, industrial hosts), and explores fungal diversity by means of genome sequencing and analysis. Over 150 fungal genomes have been sequenced by JGI to date and released through MycoCosm (www.jgi.doe.gov/fungi), a fungal web-portal, which integrates sequence and functional data with genome analysis tools for user community. Sequence analysis supported by functional genomics leads to developing parts list for complex systems ranging from ecosystems of biofuel crops to biorefineries. Recent examples of such parts suggested by comparative genomics and functional analysis in these areas are presented here.

  9. Genomics With Cloud Computing

    Directory of Open Access Journals (Sweden)

    Sukhamrit Kaur

    2015-04-01

    Full Text Available Abstract Genomics is study of genome which provides large amount of data for which large storage and computation power is needed. These issues are solved by cloud computing that provides various cloud platforms for genomics. These platforms provides many services to user like easy access to data easy sharing and transfer providing storage in hundreds of terabytes more computational power. Some cloud platforms are Google genomics DNAnexus and Globus genomics. Various features of cloud computing to genomics are like easy access and sharing of data security of data less cost to pay for resources but still there are some demerits like large time needed to transfer data less network bandwidth.

  10. Report on achievements in fiscal 1998. Research on acceleration of improving the base for biological resource information, and development of a technology to measure gene information (development of the DNA measuring technology using bead arrays); 1998 nendo seibutsu shigen joho kiban seibi kasokuka kenkyu idenshi joho no keisoku gijutsu no kaihatsu seika hokokusho. Bizu array wo mochiita DNA keisoku gijutsu no kaihatsu

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    2000-03-01

    It is necessary in the field of post-genoms to know statistical correlation between DNA orientation information and clinical data, which helped growth of DNA probe arrays (DNA chips). However, it is difficult in patent point of view to develop chips in Japan. On the other hand, a movement has begun to use beads fixed with DNA probes in place of DNA chips demarcated on the surface of solids. This is a method to investigate hybridized DNA by means of fluorescent detection, in which each bead retaining the DNA probes is colored to make identification of the retained probes possible, and hybridizing reaction is performed in aqueous solution. Hitachi has developed a DNA measuring technology using bead arrays. The bead array has probe fixing beads of about 100 {mu} m laid sequentially inside a capillary, wherein the array can be used to inspect a large number of genes. Thus, this method can be a DNA measuring technology which is inexpensive, and high in reproducibility. These features lead to a belief that the technology is suitable for gene inspection devices used in the process of medicine development and at the clinical sites. (NEDO)

  11. Comparative Genome Analysis and Genome Evolution

    NARCIS (Netherlands)

    Snel, Berend

    2002-01-01

    This thesis described a collection of bioinformatic analyses on complete genome sequence data. We have studied the evolution of gene content and find that vertical inheritance dominates over horizontal gene trasnfer, even to the extent that we can use the gene content to make genome phylogenies.

  12. Genomic Data Commons launches

    Science.gov (United States)

    The Genomic Data Commons (GDC), a unified data system that promotes sharing of genomic and clinical data between researchers, launched today with a visit from Vice President Joe Biden to the operations center at the University of Chicago.

  13. Rat Genome Database (RGD)

    Data.gov (United States)

    U.S. Department of Health & Human Services — The Rat Genome Database (RGD) is a collaborative effort between leading research institutions involved in rat genetic and genomic research to collect, consolidate,...

  14. Visualization for genomics: the Microbial Genome Viewer.

    Science.gov (United States)

    Kerkhoven, Robert; van Enckevort, Frank H J; Boekhorst, Jos; Molenaar, Douwe; Siezen, Roland J

    2004-07-22

    A Web-based visualization tool, the Microbial Genome Viewer, is presented that allows the user to combine complex genomic data in a highly interactive way. This Web tool enables the interactive generation of chromosome wheels and linear genome maps from genome annotation data stored in a MySQL database. The generated images are in scalable vector graphics (SVG) format, which is suitable for creating high-quality scalable images and dynamic Web representations. Gene-related data such as transcriptome and time-course microarray experiments can be superimposed on the maps for visual inspection. The Microbial Genome Viewer 1.0 is freely available at http://www.cmbi.kun.nl/MGV

  15. Genomic prediction using subsampling

    OpenAIRE

    Xavier, Alencar; Xu, Shizhong; Muir, William; Rainey, Katy Martin

    2017-01-01

    Background Genome-wide assisted selection is a critical tool for the?genetic improvement of plants and animals. Whole-genome regression models in Bayesian framework represent the main family of prediction methods. Fitting such models with a large number of observations involves a prohibitive computational burden. We propose the use of subsampling bootstrap Markov chain in genomic prediction. Such method consists of fitting whole-genome regression models by subsampling observations in each rou...

  16. Ebolavirus comparative genomics

    DEFF Research Database (Denmark)

    Jun, Se-Ran; Leuze, Michael R.; Nookaew, Intawat

    2015-01-01

    The 2014 Ebola outbreak in West Africa is the largest documented for this virus. To examine the dynamics of this genome, we compare more than 100 currently available ebolavirus genomes to each other and to other viral genomes. Based on oligomer frequency analysis, the family Filoviridae forms...

  17. Social information solution; Shakai joho solution

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    2000-01-10

    An information system for government offices is developed, a system that integrally supports operations inside government offices and the staff service operations by combining Intra Net as the basis of an information system with Internet. The objective of the system is as follows: (1) Information sharing in the place of work and utilization of information resources. (2) Improvement in administrative services and vitalization of an interchange of residents through the preparation of Internet environment. (3) Rationalization of staff operations through groupeware. In addition, by building a network system for the entire region, information communication service is to be provided as a solution between the residents and the administration in the occurrence of a disaster as well as for home care, medical and nursing assistance in the health, medical and welfare fields. (translated by NEDO)

  18. The Sequenced Angiosperm Genomes and Genome Databases.

    Science.gov (United States)

    Chen, Fei; Dong, Wei; Zhang, Jiawei; Guo, Xinyue; Chen, Junhao; Wang, Zhengjia; Lin, Zhenguo; Tang, Haibao; Zhang, Liangsheng

    2018-01-01

    Angiosperms, the flowering plants, provide the essential resources for human life, such as food, energy, oxygen, and materials. They also promoted the evolution of human, animals, and the planet earth. Despite the numerous advances in genome reports or sequencing technologies, no review covers all the released angiosperm genomes and the genome databases for data sharing. Based on the rapid advances and innovations in the database reconstruction in the last few years, here we provide a comprehensive review for three major types of angiosperm genome databases, including databases for a single species, for a specific angiosperm clade, and for multiple angiosperm species. The scope, tools, and data of each type of databases and their features are concisely discussed. The genome databases for a single species or a clade of species are especially popular for specific group of researchers, while a timely-updated comprehensive database is more powerful for address of major scientific mysteries at the genome scale. Considering the low coverage of flowering plants in any available database, we propose construction of a comprehensive database to facilitate large-scale comparative studies of angiosperm genomes and to promote the collaborative studies of important questions in plant biology.

  19. Bioinformatics decoding the genome

    CERN Multimedia

    CERN. Geneva; Deutsch, Sam; Michielin, Olivier; Thomas, Arthur; Descombes, Patrick

    2006-01-01

    Extracting the fundamental genomic sequence from the DNA From Genome to Sequence : Biology in the early 21st century has been radically transformed by the availability of the full genome sequences of an ever increasing number of life forms, from bacteria to major crop plants and to humans. The lecture will concentrate on the computational challenges associated with the production, storage and analysis of genome sequence data, with an emphasis on mammalian genomes. The quality and usability of genome sequences is increasingly conditioned by the careful integration of strategies for data collection and computational analysis, from the construction of maps and libraries to the assembly of raw data into sequence contigs and chromosome-sized scaffolds. Once the sequence is assembled, a major challenge is the mapping of biologically relevant information onto this sequence: promoters, introns and exons of protein-encoding genes, regulatory elements, functional RNAs, pseudogenes, transposons, etc. The methodological ...

  20. Genomic research in Eucalyptus.

    Science.gov (United States)

    Poke, Fiona S; Vaillancourt, René E; Potts, Brad M; Reid, James B

    2005-09-01

    Eucalyptus L'Hérit. is a genus comprised of more than 700 species that is of vital importance ecologically to Australia and to the forestry industry world-wide, being grown in plantations for the production of solid wood products as well as pulp for paper. With the sequencing of the genomes of Arabidopsis thaliana and Oryza sativa and the recent completion of the first tree genome sequence, Populus trichocarpa, attention has turned to the current status of genomic research in Eucalyptus. For several eucalypt species, large segregating families have been established, high-resolution genetic maps constructed and large EST databases generated. Collaborative efforts have been initiated for the integration of diverse genomic projects and will provide the framework for future research including exploiting the sequence of the entire eucalypt genome which is currently being sequenced. This review summarises the current position of genomic research in Eucalyptus and discusses the direction of future research.

  1. Genome packaging in viruses

    OpenAIRE

    Sun, Siyang; Rao, Venigalla B.; Rossmann, Michael G.

    2010-01-01

    Genome packaging is a fundamental process in a viral life cycle. Many viruses assemble preformed capsids into which the genomic material is subsequently packaged. These viruses use a packaging motor protein that is driven by the hydrolysis of ATP to condense the nucleic acids into a confined space. How these motor proteins package viral genomes had been poorly understood until recently, when a few X-ray crystal structures and cryo-electron microscopy structures became available. Here we discu...

  2. Between Two Fern Genomes

    Science.gov (United States)

    2014-01-01

    Ferns are the only major lineage of vascular plants not represented by a sequenced nuclear genome. This lack of genome sequence information significantly impedes our ability to understand and reconstruct genome evolution not only in ferns, but across all land plants. Azolla and Ceratopteris are ideal and complementary candidates to be the first ferns to have their nuclear genomes sequenced. They differ dramatically in genome size, life history, and habit, and thus represent the immense diversity of extant ferns. Together, this pair of genomes will facilitate myriad large-scale comparative analyses across ferns and all land plants. Here we review the unique biological characteristics of ferns and describe a number of outstanding questions in plant biology that will benefit from the addition of ferns to the set of taxa with sequenced nuclear genomes. We explain why the fern clade is pivotal for understanding genome evolution across land plants, and we provide a rationale for how knowledge of fern genomes will enable progress in research beyond the ferns themselves. PMID:25324969

  3. Causes of genome instability

    DEFF Research Database (Denmark)

    Langie, Sabine A S; Koppen, Gudrun; Desaulniers, Daniel

    2015-01-01

    function, chromosome segregation, telomere length). The purpose of this review is to describe the crucial aspects of genome instability, to outline the ways in which environmental chemicals can affect this cancer hallmark and to identify candidate chemicals for further study. The overall aim is to make......Genome instability is a prerequisite for the development of cancer. It occurs when genome maintenance systems fail to safeguard the genome's integrity, whether as a consequence of inherited defects or induced via exposure to environmental agents (chemicals, biological agents and radiation). Thus...

  4. Fungal Genomics Program

    Energy Technology Data Exchange (ETDEWEB)

    Grigoriev, Igor

    2012-03-12

    The JGI Fungal Genomics Program aims to scale up sequencing and analysis of fungal genomes to explore the diversity of fungi important for energy and the environment, and to promote functional studies on a system level. Combining new sequencing technologies and comparative genomics tools, JGI is now leading the world in fungal genome sequencing and analysis. Over 120 sequenced fungal genomes with analytical tools are available via MycoCosm (www.jgi.doe.gov/fungi), a web-portal for fungal biologists. Our model of interacting with user communities, unique among other sequencing centers, helps organize these communities, improves genome annotation and analysis work, and facilitates new larger-scale genomic projects. This resulted in 20 high-profile papers published in 2011 alone and contributing to the Genomics Encyclopedia of Fungi, which targets fungi related to plant health (symbionts, pathogens, and biocontrol agents) and biorefinery processes (cellulose degradation, sugar fermentation, industrial hosts). Our next grand challenges include larger scale exploration of fungal diversity (1000 fungal genomes), developing molecular tools for DOE-relevant model organisms, and analysis of complex systems and metagenomes.

  5. MIPS plant genome information resources.

    Science.gov (United States)

    Spannagl, Manuel; Haberer, Georg; Ernst, Rebecca; Schoof, Heiko; Mayer, Klaus F X

    2007-01-01

    The Munich Institute for Protein Sequences (MIPS) has been involved in maintaining plant genome databases since the Arabidopsis thaliana genome project. Genome databases and analysis resources have focused on individual genomes and aim to provide flexible and maintainable data sets for model plant genomes as a backbone against which experimental data, for example from high-throughput functional genomics, can be organized and evaluated. In addition, model genomes also form a scaffold for comparative genomics, and much can be learned from genome-wide evolutionary studies.

  6. Computational genomics of hyperthermophiles

    NARCIS (Netherlands)

    Werken, van de H.J.G.

    2008-01-01

    With the ever increasing number of completely sequenced prokaryotic genomes and the subsequent use of functional genomics tools, e.g. DNA microarray and proteomics, computational data analysis and the integration of microbial and molecular data is inevitable. This thesis describes the computational

  7. Safeguarding genome integrity

    DEFF Research Database (Denmark)

    Sørensen, Claus Storgaard; Syljuåsen, Randi G

    2012-01-01

    Mechanisms that preserve genome integrity are highly important during the normal life cycle of human cells. Loss of genome protective mechanisms can lead to the development of diseases such as cancer. Checkpoint kinases function in the cellular surveillance pathways that help cells to cope with D...

  8. Human genome I

    International Nuclear Information System (INIS)

    Anon.

    1989-01-01

    An international conference, Human Genome I, was held Oct. 2-4, 1989 in San Diego, Calif. Selected speakers discussed: Current Status of the Genome Project; Technique Innovations; Interesting regions; Applications; and Organization - Different Views of Current and Future Science and Procedures. Posters, consisting of 119 presentations, were displayed during the sessions. 119 were indexed for inclusion to the Energy Data Base

  9. Rumen microbial genomics

    International Nuclear Information System (INIS)

    Morrison, M.; Nelson, K.E.

    2005-01-01

    Improving microbial degradation of plant cell wall polysaccharides remains one of the highest priority goals for all livestock enterprises, including the cattle herds and draught animals of developing countries. The North American Consortium for Genomics of Fibrolytic Ruminal Bacteria was created to promote the sequencing and comparative analysis of rumen microbial genomes, offering the potential to fully assess the genetic potential in a functional and comparative fashion. It has been found that the Fibrobacter succinogenes genome encodes many more endoglucanases and cellodextrinases than previously isolated, and several new processive endoglucanases have been identified by genome and proteomic analysis of Ruminococcus albus, in addition to a variety of strategies for its adhesion to fibre. The ramifications of acquiring genome sequence data for rumen microorganisms are profound, including the potential to elucidate and overcome the biochemical, ecological or physiological processes that are rate limiting for ruminal fibre degradation. (author)

  10. Microbial Genomes Multiply

    Science.gov (United States)

    Doolittle, Russell F.

    2002-01-01

    The publication of the first complete sequence of a bacterial genome in 1995 was a signal event, underscored by the fact that the article has been cited more than 2,100 times during the intervening seven years. It was a marvelous technical achievement, made possible by automatic DNA-sequencing machines. The feat is the more impressive in that complete genome sequencing has now been adopted in many different laboratories around the world. Four years ago in these columns I examined the situation after a dozen microbial genomes had been completed. Now, with upwards of 60 microbial genome sequences determined and twice that many in progress, it seems reasonable to assess just what is being learned. Are new concepts emerging about how cells work? Have there been practical benefits in the fields of medicine and agriculture? Is it feasible to determine the genomic sequence of every bacterial species on Earth? The answers to these questions maybe Yes, Perhaps, and No, respectively.

  11. Musa sebagai Model Genom

    Directory of Open Access Journals (Sweden)

    RITA MEGIA

    2005-12-01

    Full Text Available During the meeting in Arlington, USA in 2001, the scientists grouped in PROMUSA agreed with the launching of the Global Musa Genomics Consortium. The Consortium aims to apply genomics technologies to the improvement of this important crop. These genome projects put banana as the third model species after Arabidopsis and rice that will be analyzed and sequenced. Comparing to Arabidopsis and rice, banana genome provides a unique and powerful insight into structural and in functional genomics that could not be found in those two species. This paper discussed these subjects-including the importance of banana as the fourth main food in the world, the evolution and biodiversity of this genetic resource and its parasite.

  12. The genome editing revolution

    DEFF Research Database (Denmark)

    Stella, Stefano; Montoya, Guillermo

    2016-01-01

    -Cas system has become the main tool for genome editing in many laboratories. Currently the targeted genome editing technology has been used in many fields and may be a possible approach for human gene therapy. Furthermore, it can also be used to modifying the genomes of model organisms for studying human......In the last 10 years, we have witnessed a blooming of targeted genome editing systems and applications. The area was revolutionized by the discovery and characterization of the transcription activator-like effector proteins, which are easier to engineer to target new DNA sequences than...... sequence). This ribonucleoprotein complex protects bacteria from invading DNAs, and it was adapted to be used in genome editing. The CRISPR ribonucleic acid (RNA) molecule guides to the specific DNA site the Cas9 nuclease to cleave the DNA target. Two years and more than 1000 publications later, the CRISPR...

  13. Phytozome Comparative Plant Genomics Portal

    Energy Technology Data Exchange (ETDEWEB)

    Goodstein, David; Batra, Sajeev; Carlson, Joseph; Hayes, Richard; Phillips, Jeremy; Shu, Shengqiang; Schmutz, Jeremy; Rokhsar, Daniel

    2014-09-09

    The Dept. of Energy Joint Genome Institute is a genomics user facility supporting DOE mission science in the areas of Bioenergy, Carbon Cycling, and Biogeochemistry. The Plant Program at the JGI applies genomic, analytical, computational and informatics platforms and methods to: 1. Understand and accelerate the improvement (domestication) of bioenergy crops 2. Characterize and moderate plant response to climate change 3. Use comparative genomics to identify constrained elements and infer gene function 4. Build high quality genomic resource platforms of JGI Plant Flagship genomes for functional and experimental work 5. Expand functional genomic resources for Plant Flagship genomes

  14. Genome-derived vaccines.

    Science.gov (United States)

    De Groot, Anne S; Rappuoli, Rino

    2004-02-01

    Vaccine research entered a new era when the complete genome of a pathogenic bacterium was published in 1995. Since then, more than 97 bacterial pathogens have been sequenced and at least 110 additional projects are now in progress. Genome sequencing has also dramatically accelerated: high-throughput facilities can draft the sequence of an entire microbe (two to four megabases) in 1 to 2 days. Vaccine developers are using microarrays, immunoinformatics, proteomics and high-throughput immunology assays to reduce the truly unmanageable volume of information available in genome databases to a manageable size. Vaccines composed by novel antigens discovered from genome mining are already in clinical trials. Within 5 years we can expect to see a novel class of vaccines composed by genome-predicted, assembled and engineered T- and Bcell epitopes. This article addresses the convergence of three forces--microbial genome sequencing, computational immunology and new vaccine technologies--that are shifting genome mining for vaccines onto the forefront of immunology research.

  15. The Banana Genome Hub

    Science.gov (United States)

    Droc, Gaëtan; Larivière, Delphine; Guignon, Valentin; Yahiaoui, Nabila; This, Dominique; Garsmeur, Olivier; Dereeper, Alexis; Hamelin, Chantal; Argout, Xavier; Dufayard, Jean-François; Lengelle, Juliette; Baurens, Franc-Christophe; Cenci, Alberto; Pitollat, Bertrand; D’Hont, Angélique; Ruiz, Manuel; Rouard, Mathieu; Bocs, Stéphanie

    2013-01-01

    Banana is one of the world’s favorite fruits and one of the most important crops for developing countries. The banana reference genome sequence (Musa acuminata) was recently released. Given the taxonomic position of Musa, the completed genomic sequence has particular comparative value to provide fresh insights about the evolution of the monocotyledons. The study of the banana genome has been enhanced by a number of tools and resources that allows harnessing its sequence. First, we set up essential tools such as a Community Annotation System, phylogenomics resources and metabolic pathways. Then, to support post-genomic efforts, we improved banana existing systems (e.g. web front end, query builder), we integrated available Musa data into generic systems (e.g. markers and genetic maps, synteny blocks), we have made interoperable with the banana hub, other existing systems containing Musa data (e.g. transcriptomics, rice reference genome, workflow manager) and finally, we generated new results from sequence analyses (e.g. SNP and polymorphism analysis). Several uses cases illustrate how the Banana Genome Hub can be used to study gene families. Overall, with this collaborative effort, we discuss the importance of the interoperability toward data integration between existing information systems. Database URL: http://banana-genome.cirad.fr/ PMID:23707967

  16. Genomic instability following irradiation

    International Nuclear Information System (INIS)

    Hacker-Klom, U.B.; Goehde, W.

    2001-01-01

    Ionising irradiation may induce genomic instability. The broad spectrum of stress reactions in eukaryontic cells to irradiation complicates the discovery of cellular targets and pathways inducing genomic instability. Irradiation may initiate genomic instability by deletion of genes controlling stability, by induction of genes stimulating instability and/or by activating endogeneous cellular viruses. Alternatively or additionally it is discussed that the initiation of genomic instability may be a consequence of radiation or other agents independently of DNA damage implying non nuclear targets, e.g. signal cascades. As a further mechanism possibly involved our own results may suggest radiation-induced changes in chromatin structure. Once initiated the process of genomic instability probably is perpetuated by endogeneous processes necessary for proliferation. Genomic instability may be a cause or a consequence of the neoplastic phenotype. As a conclusion from the data available up to now a new interpretation of low level radiation effects for radiation protection and in radiotherapy appears useful. The detection of the molecular mechanisms of genomic instability will be important in this context and may contribute to a better understanding of phenomenons occurring at low doses <10 cSv which are not well understood up to now. (orig.)

  17. Traditional medicine and genomics

    Directory of Open Access Journals (Sweden)

    Kalpana Joshi

    2010-01-01

    Full Text Available ′Omics′ developments in the form of genomics, proteomics and metabolomics have increased the impetus of traditional medicine research. Studies exploring the genomic, proteomic and metabolomic basis of human constitutional types based on Ayurveda and other systems of oriental medicine are becoming popular. Such studies remain important to developing better understanding of human variations and individual differences. Countries like India, Korea, China and Japan are investing in research on evidence-based traditional medicines and scientific validation of fundamental principles. This review provides an account of studies addressing relationships between traditional medicine and genomics.

  18. Traditional medicine and genomics.

    Science.gov (United States)

    Joshi, Kalpana; Ghodke, Yogita; Shintre, Pooja

    2010-01-01

    'Omics' developments in the form of genomics, proteomics and metabolomics have increased the impetus of traditional medicine research. Studies exploring the genomic, proteomic and metabolomic basis of human constitutional types based on Ayurveda and other systems of oriental medicine are becoming popular. Such studies remain important to developing better understanding of human variations and individual differences. Countries like India, Korea, China and Japan are investing in research on evidence-based traditional medicines and scientific validation of fundamental principles. This review provides an account of studies addressing relationships between traditional medicine and genomics.

  19. Bacillus subtilis genome diversity.

    Science.gov (United States)

    Earl, Ashlee M; Losick, Richard; Kolter, Roberto

    2007-02-01

    Microarray-based comparative genomic hybridization (M-CGH) is a powerful method for rapidly identifying regions of genome diversity among closely related organisms. We used M-CGH to examine the genome diversity of 17 strains belonging to the nonpathogenic species Bacillus subtilis. Our M-CGH results indicate that there is considerable genetic heterogeneity among members of this species; nearly one-third of Bsu168-specific genes exhibited variability, as measured by the microarray hybridization intensities. The variable loci include those encoding proteins involved in antibiotic production, cell wall synthesis, sporulation, and germination. The diversity in these genes may reflect this organism's ability to survive in diverse natural settings.

  20. Genomic taxonomy of vibrios

    Directory of Open Access Journals (Sweden)

    Iida Tetsuya

    2009-10-01

    Full Text Available Abstract Background Vibrio taxonomy has been based on a polyphasic approach. In this study, we retrieve useful taxonomic information (i.e. data that can be used to distinguish different taxonomic levels, such as species and genera from 32 genome sequences of different vibrio species. We use a variety of tools to explore the taxonomic relationship between the sequenced genomes, including Multilocus Sequence Analysis (MLSA, supertrees, Average Amino Acid Identity (AAI, genomic signatures, and Genome BLAST atlases. Our aim is to analyse the usefulness of these tools for species identification in vibrios. Results We have generated four new genome sequences of three Vibrio species, i.e., V. alginolyticus 40B, V. harveyi-like 1DA3, and V. mimicus strains VM573 and VM603, and present a broad analyses of these genomes along with other sequenced Vibrio species. The genome atlas and pangenome plots provide a tantalizing image of the genomic differences that occur between closely related sister species, e.g. V. cholerae and V. mimicus. The vibrio pangenome contains around 26504 genes. The V. cholerae core genome and pangenome consist of 1520 and 6923 genes, respectively. Pangenomes might allow different strains of V. cholerae to occupy different niches. MLSA and supertree analyses resulted in a similar phylogenetic picture, with a clear distinction of four groups (Vibrio core group, V. cholerae-V. mimicus, Aliivibrio spp., and Photobacterium spp.. A Vibrio species is defined as a group of strains that share > 95% DNA identity in MLSA and supertree analysis, > 96% AAI, ≤ 10 genome signature dissimilarity, and > 61% proteome identity. Strains of the same species and species of the same genus will form monophyletic groups on the basis of MLSA and supertree. Conclusion The combination of different analytical and bioinformatics tools will enable the most accurate species identification through genomic computational analysis. This endeavour will culminate in

  1. Human Genome Project

    Energy Technology Data Exchange (ETDEWEB)

    Block, S. [The MITRE Corporation, McLean, VA (US). JASON Program Office; Cornwall, J. [The MITRE Corporation, McLean, VA (US). JASON Program Office; Dally, W. [The MITRE Corporation, McLean, VA (US). JASON Program Office; Dyson, F. [The MITRE Corporation, McLean, VA (US). JASON Program Office; Fortson, N. [The MITRE Corporation, McLean, VA (US). JASON Program Office; Joyce, G. [The MITRE Corporation, McLean, VA (US). JASON Program Office; Kimble, H. J. [The MITRE Corporation, McLean, VA (US). JASON Program Office; Lewis, N. [The MITRE Corporation, McLean, VA (US). JASON Program Office; Max, C. [The MITRE Corporation, McLean, VA (US). JASON Program Office; Prince, T. [The MITRE Corporation, McLean, VA (US). JASON Program Office; Schwitters, R. [The MITRE Corporation, McLean, VA (US). JASON Program Office; Weinberger, P. [The MITRE Corporation, McLean, VA (US). JASON Program Office; Woodin, W. H. [The MITRE Corporation, McLean, VA (US). JASON Program Office

    1998-01-04

    The study reviews Department of Energy supported aspects of the United States Human Genome Project, the joint National Institutes of Health/Department of Energy program to characterize all human genetic material, to discover the set of human genes, and to render them accessible for further biological study. The study concentrates on issues of technology, quality assurance/control, and informatics relevant to current effort on the genome project and needs beyond it. Recommendations are presented on areas of the genome program that are of particular interest to and supported by the Department of Energy.

  2. Human Genome Program

    Energy Technology Data Exchange (ETDEWEB)

    1993-01-01

    The DOE Human Genome program has grown tremendously, as shown by the marked increase in the number of genome-funded projects since the last workshop held in 1991. The abstracts in this book describe the genome research of DOE-funded grantees and contractors and invited guests, and all projects are represented at the workshop by posters. The 3-day meeting includes plenary sessions on ethical, legal, and social issues pertaining to the availability of genetic data; sequencing techniques, informatics support; and chromosome and cDNA mapping and sequencing.

  3. Genomic signal processing

    CERN Document Server

    Shmulevich, Ilya

    2007-01-01

    Genomic signal processing (GSP) can be defined as the analysis, processing, and use of genomic signals to gain biological knowledge, and the translation of that knowledge into systems-based applications that can be used to diagnose and treat genetic diseases. Situated at the crossroads of engineering, biology, mathematics, statistics, and computer science, GSP requires the development of both nonlinear dynamical models that adequately represent genomic regulation, and diagnostic and therapeutic tools based on these models. This book facilitates these developments by providing rigorous mathema

  4. FY 1999 Industrial science and technology research and development project. Report on the results of research and development of the technologies for genome informatics (Acceleration of analysis of green mold transcription control information); 1999 nendo genome infomatics gijutsu kenkyu kaihatsu seika hokokusho. Koji kabi no tensha seigyo joho no kaiseki kasokuka nado

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    2001-03-01

    A total of 49 budding yeast transcription factor disruptants and one conditional transcription over expression strain are produced, to elucidate the gene regulation networks using the gene expression profile data, and to measure the systematic and high-quality gene expression profiles using the Affymetrix's GeneChip system. The program is also developed for accurately predicting the base sequences which regulate expression of given gene groups, based on the uniqueness of the upstream sequences. The analysis with the aid of the program predicts 8 gene expression regulation sequences, which are considered to be novel, from the gene groups of retarded expression by the transcription factor disruptants. The time course gene expression data are produced from the transcription factor SW14 conditional over expression strain. The analysis of the data indicates that the analysis of the subtracted genes using the gene expression profiles from the wild type strain is useful for clarifying the effects of the derived transcription factor over expression. (NEDO)

  5. Genomics and fish adaptation

    Directory of Open Access Journals (Sweden)

    Agostinho Antunes

    2015-12-01

    Full Text Available The completion of the human genome sequencing in 2003 opened a new perspective into the importance of whole genome sequencing projects, and currently multiple species are having their genomes completed sequenced, from simple organisms, such as bacteria, to more complex taxa, such as mammals. This voluminous sequencing data generated across multiple organisms provides also the framework to better understand the genetic makeup of such species and related ones, allowing to explore the genetic changes underlining the evolution of diverse phenotypic traits. Here, recent results from our group retrieved from comparative evolutionary genomic analyses of varied fish species will be considered to exemplify how gene novelty and gene enhancement by positive selection might have been determinant in the success of adaptive radiations into diverse habitats and lifestyles.

  6. Lophotrochozoan mitochondrial genomes

    Energy Technology Data Exchange (ETDEWEB)

    Valles, Yvonne; Boore, Jeffrey L.

    2005-10-01

    Progress in both molecular techniques and phylogeneticmethods has challenged many of the interpretations of traditionaltaxonomy. One example is in the recognition of the animal superphylumLophotrochozoa (annelids, mollusks, echiurans, platyhelminthes,brachiopods, and other phyla), although the relationships within thisgroup and the inclusion of some phyla remain uncertain. While much ofthis progress in phylogenetic reconstruction has been based on comparingsingle gene sequences, we are beginning to see the potential of comparinglarge-scale features of genomes, such as the relative order of genes.Even though tremendous progress is being made on the sequencedetermination of whole nuclear genomes, the dataset of choice forgenome-level characters for many animals across a broad taxonomic rangeremains mitochondrial genomes. We review here what is known aboutmitochondrial genomes of the lophotrochozoans and discuss the promisethat this dataset will enable insight into theirrelationships.

  7. Mouse Genome Informatics (MGI)

    Data.gov (United States)

    U.S. Department of Health & Human Services — MGI is the international database resource for the laboratory mouse, providing integrated genetic, genomic, and biological data to facilitate the study of human...

  8. Genomic definition of species

    Energy Technology Data Exchange (ETDEWEB)

    Crkvenjakov, R.; Drmanac, R.

    1991-07-01

    The subject of this paper is the definition of species based on the assumption that genome is the fundamental level for the origin and maintenance of biological diversity. For this view to be logically consistent it is necessary to assume the existence and operation of the new law which we call genome law. For this reason the genome law is included in the explanation of species phenomenon presented here even if its precise formulation and elaboration are left for the future. The intellectual underpinnings of this definition can be traced to Goldschmidt. We wish to explore some philosophical aspects of the definition of species in terms of the genome. The point of proposing the definition on these grounds is that any real advance in evolutionary theory has to be correct in both its philosophy and its science.

  9. Structural genomics in endocrinology

    NARCIS (Netherlands)

    Smit, J. W.; Romijn, J. A.

    2001-01-01

    Traditionally, endocrine research evolved from the phenotypical characterisation of endocrine disorders to the identification of underlying molecular pathophysiology. This approach has been, and still is, extremely successful. The introduction of genomics and proteomics has resulted in a reversal of

  10. Epidemiology & Genomics Research Program

    Science.gov (United States)

    The Epidemiology and Genomics Research Program, in the National Cancer Institute's Division of Cancer Control and Population Sciences, funds research in human populations to understand the determinants of cancer occurrence and outcomes.

  11. Annotating individual human genomes.

    Science.gov (United States)

    Torkamani, Ali; Scott-Van Zeeland, Ashley A; Topol, Eric J; Schork, Nicholas J

    2011-10-01

    Advances in DNA sequencing technologies have made it possible to rapidly, accurately and affordably sequence entire individual human genomes. As impressive as this ability seems, however, it will not likely amount to much if one cannot extract meaningful information from individual sequence data. Annotating variations within individual genomes and providing information about their biological or phenotypic impact will thus be crucially important in moving individual sequencing projects forward, especially in the context of the clinical use of sequence information. In this paper we consider the various ways in which one might annotate individual sequence variations and point out limitations in the available methods for doing so. It is arguable that, in the foreseeable future, DNA sequencing of individual genomes will become routine for clinical, research, forensic, and personal purposes. We therefore also consider directions and areas for further research in annotating genomic variants. Copyright © 2011 Elsevier Inc. All rights reserved.

  12. ANNOTATING INDIVIDUAL HUMAN GENOMES*

    Science.gov (United States)

    Torkamani, Ali; Scott-Van Zeeland, Ashley A.; Topol, Eric J.; Schork, Nicholas J.

    2014-01-01

    Advances in DNA sequencing technologies have made it possible to rapidly, accurately and affordably sequence entire individual human genomes. As impressive as this ability seems, however, it will not likely to amount to much if one cannot extract meaningful information from individual sequence data. Annotating variations within individual genomes and providing information about their biological or phenotypic impact will thus be crucially important in moving individual sequencing projects forward, especially in the context of the clinical use of sequence information. In this paper we consider the various ways in which one might annotate individual sequence variations and point out limitations in the available methods for doing so. It is arguable that, in the foreseeable future, DNA sequencing of individual genomes will become routine for clinical, research, forensic, and personal purposes. We therefore also consider directions and areas for further research in annotating genomic variants. PMID:21839162

  13. Yeast genome sequencing:

    DEFF Research Database (Denmark)

    Piskur, Jure; Langkjær, Rikke Breinhold

    2004-01-01

    For decades, unicellular yeasts have been general models to help understand the eukaryotic cell and also our own biology. Recently, over a dozen yeast genomes have been sequenced, providing the basis to resolve several complex biological questions. Analysis of the novel sequence data has shown...... of closely related species helps in gene annotation and to answer how many genes there really are within the genomes. Analysis of non-coding regions among closely related species has provided an example of how to determine novel gene regulatory sequences, which were previously difficult to analyse because...... they are short and degenerate and occupy different positions. Comparative genomics helps to understand the origin of yeasts and points out crucial molecular events in yeast evolutionary history, such as whole-genome duplication and horizontal gene transfer(s). In addition, the accumulating sequence data provide...

  14. Genetical Genomics for Evolutionary Studies

    NARCIS (Netherlands)

    Prins, J.C.P.; Smant, G.; Jansen, R.C.

    2012-01-01

    Genetical genomics combines acquired high-throughput genomic data with genetic analysis. In this chapter, we discuss the application of genetical genomics for evolutionary studies, where new high-throughput molecular technologies are combined with mapping quantitative trait loci (QTL) on the genome

  15. The human genome project

    International Nuclear Information System (INIS)

    Worton, R.

    1996-01-01

    The Human Genome Project is a massive international research project, costing 3 to 5 billion dollars and expected to take 15 years, which will identify the all the genes in the human genome - i.e. the complete sequence of bases in human DNA. The prize will be the ability to identify genes causing or predisposing to disease, and in some cases the development of gene therapy, but this new knowledge will raise important ethical issues

  16. Decoding the human genome

    CERN Multimedia

    CERN. Geneva. Audiovisual Unit; Antonerakis, S E

    2002-01-01

    Decoding the Human genome is a very up-to-date topic, raising several questions besides purely scientific, in view of the two competing teams (public and private), the ethics of using the results, and the fact that the project went apparently faster and easier than expected. The lecture series will address the following chapters: Scientific basis and challenges. Ethical and social aspects of genomics.

  17. Molluscan Evolutionary Genomics

    Energy Technology Data Exchange (ETDEWEB)

    Simison, W. Brian; Boore, Jeffrey L.

    2005-12-01

    In the last 20 years there have been dramatic advances in techniques of high-throughput DNA sequencing, most recently accelerated by the Human Genome Project, a program that has determined the three billion base pair code on which we are based. Now this tremendous capability is being directed at other genome targets that are being sampled across the broad range of life. This opens up opportunities as never before for evolutionary and organismal biologists to address questions of both processes and patterns of organismal change. We stand at the dawn of a new 'modern synthesis' period, paralleling that of the early 20th century when the fledgling field of genetics first identified the underlying basis for Darwin's theory. We must now unite the efforts of systematists, paleontologists, mathematicians, computer programmers, molecular biologists, developmental biologists, and others in the pursuit of discovering what genomics can teach us about the diversity of life. Genome-level sampling for mollusks to date has mostly been limited to mitochondrial genomes and it is likely that these will continue to provide the best targets for broad phylogenetic sampling in the near future. However, we are just beginning to see an inroad into complete nuclear genome sequencing, with several mollusks and other eutrochozoans having been selected for work about to begin. Here, we provide an overview of the state of molluscan mitochondrial genomics, highlight a few of the discoveries from this research, outline the promise of broadening this dataset, describe upcoming projects to sequence whole mollusk nuclear genomes, and challenge the community to prepare for making the best use of these data.

  18. Human Germline Genome Editing

    OpenAIRE

    Ormond, Kelly E.; Mortlock, Douglas P.; Scholes, Derek T.; Bombard, Yvonne; Brody, Lawrence C.; Faucett, W. Andrew; Garrison, Nanibaa’ A.; Hercher, Laura; Isasi, Rosario; Middleton, Anna; Musunuru, Kiran; Shriner, Daniel; Virani, Alice; Young, Caroline E.

    2017-01-01

    With CRISPR/Cas9 and other genome-editing technologies, successful somatic and germline genome editing are becoming feasible. To respond, an American Society of Human Genetics (ASHG) workgroup developed this position statement, which was approved by the ASHG Board in March 2017. The workgroup included representatives from the UK Association of Genetic Nurses and Counsellors, Canadian Association of Genetic Counsellors, International Genetic Epidemiology Society, and US National Society of Gen...

  19. RadGenomics project

    Energy Technology Data Exchange (ETDEWEB)

    Iwakawa, Mayumi; Imai, Takashi; Harada, Yoshinobu [National Inst. of Radiological Sciences, Chiba (Japan). Frontier Research Center] [and others

    2002-06-01

    Human health is determined by a complex interplay of factors, predominantly between genetic susceptibility, environmental conditions and aging. The ultimate aim of the RadGenomics (Radiation Genomics) project is to understand the implications of heterogeneity in responses to ionizing radiation arising from genetic variation between individuals in the human population. The rapid progression of the human genome sequencing and the recent development of new technologies in molecular genetics are providing us with new opportunities to understand the genetic basis of individual differences in susceptibility to natural and/or artificial environmental factors, including radiation exposure. The RadGenomics project will inevitably lead to improved protocols for personalized radiotherapy and reductions in the potential side effects of such treatment. The project will contribute to future research into the molecular mechanisms of radiation sensitivity in humans and will stimulate the development of new high-throughput technologies for a broader application of biological and medical sciences. The staff members are specialists in a variety of fields, including genome science, radiation biology, medical science, molecular biology, and informatics, and have joined the RadGenomics project from various universities, companies, and research institutes. The project started in April 2001. (author)

  20. Comparative Genome Viewer

    International Nuclear Information System (INIS)

    Molineris, I.; Sales, G.

    2009-01-01

    The amount of information about genomes, both in the form of complete sequences and annotations, has been exponentially increasing in the last few years. As a result there is the need for tools providing a graphical representation of such information that should be comprehensive and intuitive. Visual representation is especially important in the comparative genomics field since it should provide a combined view of data belonging to different genomes. We believe that existing tools are limited in this respect as they focus on a single genome at a time (conservation histograms) or compress alignment representation to a single dimension. We have therefore developed a web-based tool called Comparative Genome Viewer (Cgv): it integrates a bidimensional representation of alignments between two regions, both at small and big scales, with the richness of annotations present in other genome browsers. We give access to our system through a web-based interface that provides the user with an interactive representation that can be updated in real time using the mouse to move from region to region and to zoom in on interesting details.

  1. Human social genomics.

    Directory of Open Access Journals (Sweden)

    Steven W Cole

    2014-08-01

    Full Text Available A growing literature in human social genomics has begun to analyze how everyday life circumstances influence human gene expression. Social-environmental conditions such as urbanity, low socioeconomic status, social isolation, social threat, and low or unstable social status have been found to associate with differential expression of hundreds of gene transcripts in leukocytes and diseased tissues such as metastatic cancers. In leukocytes, diverse types of social adversity evoke a common conserved transcriptional response to adversity (CTRA characterized by increased expression of proinflammatory genes and decreased expression of genes involved in innate antiviral responses and antibody synthesis. Mechanistic analyses have mapped the neural "social signal transduction" pathways that stimulate CTRA gene expression in response to social threat and may contribute to social gradients in health. Research has also begun to analyze the functional genomics of optimal health and thriving. Two emerging opportunities now stand to revolutionize our understanding of the everyday life of the human genome: network genomics analyses examining how systems-level capabilities emerge from groups of individual socially sensitive genomes and near-real-time transcriptional biofeedback to empirically optimize individual well-being in the context of the unique genetic, geographic, historical, developmental, and social contexts that jointly shape the transcriptional realization of our innate human genomic potential for thriving.

  2. RadGenomics project

    International Nuclear Information System (INIS)

    Iwakawa, Mayumi; Imai, Takashi; Harada, Yoshinobu

    2002-01-01

    Human health is determined by a complex interplay of factors, predominantly between genetic susceptibility, environmental conditions and aging. The ultimate aim of the RadGenomics (Radiation Genomics) project is to understand the implications of heterogeneity in responses to ionizing radiation arising from genetic variation between individuals in the human population. The rapid progression of the human genome sequencing and the recent development of new technologies in molecular genetics are providing us with new opportunities to understand the genetic basis of individual differences in susceptibility to natural and/or artificial environmental factors, including radiation exposure. The RadGenomics project will inevitably lead to improved protocols for personalized radiotherapy and reductions in the potential side effects of such treatment. The project will contribute to future research into the molecular mechanisms of radiation sensitivity in humans and will stimulate the development of new high-throughput technologies for a broader application of biological and medical sciences. The staff members are specialists in a variety of fields, including genome science, radiation biology, medical science, molecular biology, and informatics, and have joined the RadGenomics project from various universities, companies, and research institutes. The project started in April 2001. (author)

  3. Ultrafast comparison of personal genomes

    OpenAIRE

    Mauldin, Denise; Hood, Leroy; Robinson, Max; Glusman, Gustavo

    2017-01-01

    We present an ultra-fast method for comparing personal genomes. We transform the standard genome representation (lists of variants relative to a reference) into 'genome fingerprints' that can be readily compared across sequencing technologies and reference versions. Because of their reduced size, computation on the genome fingerprints is fast and requires little memory. This enables scaling up a variety of important genome analyses, including quantifying relatedness, recognizing duplicative s...

  4. Genomics using the Assembly of the Mink Genome

    DEFF Research Database (Denmark)

    Guldbrandtsen, Bernt; Cai, Zexi; Sahana, Goutam

    2018-01-01

    The American Mink’s (Neovison vison) genome has recently been sequenced. This opens numerous avenues of research both for studying the basic genetics and physiology of the mink as well as genetic improvement in mink. Using genotyping-by-sequencing (GBS) generated marker data for 2,352 Danish farm...... mink runs of homozygosity (ROH) were detect in mink genomes. Detectable ROH made up on average 1.7% of the genome indicating the presence of at most a moderate level of genomic inbreeding. The fraction of genome regions found in ROH varied. Ten percent of the included regions were never found in ROH....... The ability to detect ROH in the mink genome also demonstrates the general reliability of the new mink genome assembly. Keywords: american mink, run of homozygosity, genome, selection, genomic inbreeding...

  5. Genome size analyses of Pucciniales reveal the largest fungal genomes.

    Science.gov (United States)

    Tavares, Sílvia; Ramos, Ana Paula; Pires, Ana Sofia; Azinheira, Helena G; Caldeirinha, Patrícia; Link, Tobias; Abranches, Rita; Silva, Maria do Céu; Voegele, Ralf T; Loureiro, João; Talhinhas, Pedro

    2014-01-01

    Rust fungi (Basidiomycota, Pucciniales) are biotrophic plant pathogens which exhibit diverse complexities in their life cycles and host ranges. The completion of genome sequencing of a few rust fungi has revealed the occurrence of large genomes. Sequencing efforts for other rust fungi have been hampered by uncertainty concerning their genome sizes. Flow cytometry was recently applied to estimate the genome size of a few rust fungi, and confirmed the occurrence of large genomes in this order (averaging 225.3 Mbp, while the average for Basidiomycota was 49.9 Mbp and was 37.7 Mbp for all fungi). In this work, we have used an innovative and simple approach to simultaneously isolate nuclei from the rust and its host plant in order to estimate the genome size of 30 rust species by flow cytometry. Genome sizes varied over 10-fold, from 70 to 893 Mbp, with an average genome size value of 380.2 Mbp. Compared to the genome sizes of over 1800 fungi, Gymnosporangium confusum possesses the largest fungal genome ever reported (893.2 Mbp). Moreover, even the smallest rust genome determined in this study is larger than the vast majority of fungal genomes (94%). The average genome size of the Pucciniales is now of 305.5 Mbp, while the average Basidiomycota genome size has shifted to 70.4 Mbp and the average for all fungi reached 44.2 Mbp. Despite the fact that no correlation could be drawn between the genome sizes, the phylogenomics or the life cycle of rust fungi, it is interesting to note that rusts with Fabaceae hosts present genomes clearly larger than those with Poaceae hosts. Although this study comprises only a small fraction of the more than 7000 rust species described, it seems already evident that the Pucciniales represent a group where genome size expansion could be a common characteristic. This is in sharp contrast to sister taxa, placing this order in a relevant position in fungal genomics research.

  6. Genomes to Proteomes

    Energy Technology Data Exchange (ETDEWEB)

    Panisko, Ellen A. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Grigoriev, Igor [USDOE Joint Genome Inst., Walnut Creek, CA (United States); Daly, Don S. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Webb-Robertson, Bobbie-Jo [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Baker, Scott E. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States)

    2009-03-01

    Biologists are awash with genomic sequence data. In large part, this is due to the rapid acceleration in the generation of DNA sequence that occurred as public and private research institutes raced to sequence the human genome. In parallel with the large human genome effort, mostly smaller genomes of other important model organisms were sequenced. Projects following on these initial efforts have made use of technological advances and the DNA sequencing infrastructure that was built for the human and other organism genome projects. As a result, the genome sequences of many organisms are available in high quality draft form. While in many ways this is good news, there are limitations to the biological insights that can be gleaned from DNA sequences alone; genome sequences offer only a bird's eye view of the biological processes endemic to an organism or community. Fortunately, the genome sequences now being produced at such a high rate can serve as the foundation for other global experimental platforms such as proteomics. Proteomic methods offer a snapshot of the proteins present at a point in time for a given biological sample. Current global proteomics methods combine enzymatic digestion, separations, mass spectrometry and database searching for peptide identification. One key aspect of proteomics is the prediction of peptide sequences from mass spectrometry data. Global proteomic analysis uses computational matching of experimental mass spectra with predicted spectra based on databases of gene models that are often generated computationally. Thus, the quality of gene models predicted from a genome sequence is crucial in the generation of high quality peptide identifications. Once peptides are identified they can be assigned to their parent protein. Proteins identified as expressed in a given experiment are most useful when compared to other expressed proteins in a larger biological context or biochemical pathway. In this chapter we will discuss the automatic

  7. Experimental Induction of Genome Chaos.

    Science.gov (United States)

    Ye, Christine J; Liu, Guo; Heng, Henry H

    2018-01-01

    Genome chaos, or karyotype chaos, represents a powerful survival strategy for somatic cells under high levels of stress/selection. Since the genome context, not the gene content, encodes the genomic blueprint of the cell, stress-induced rapid and massive reorganization of genome topology functions as a very important mechanism for genome (karyotype) evolution. In recent years, the phenomenon of genome chaos has been confirmed by various sequencing efforts, and many different terms have been coined to describe different subtypes of the chaotic genome including "chromothripsis," "chromoplexy," and "structural mutations." To advance this exciting field, we need an effective experimental system to induce and characterize the karyotype reorganization process. In this chapter, an experimental protocol to induce chaotic genomes is described, following a brief discussion of the mechanism and implication of genome chaos in cancer evolution.

  8. Genome Sequences of Oryza Species

    KAUST Repository

    Kumagai, Masahiko; Tanaka, Tsuyoshi; Ohyanagi, Hajime; Hsing, Yue-Ie C.; Itoh, Takeshi

    2018-01-01

    This chapter summarizes recent data obtained from genome sequencing, annotation projects, and studies on the genome diversity of Oryza sativa and related Oryza species. O. sativa, commonly known as Asian rice, is the first monocot species whose complete genome sequence was deciphered based on physical mapping by an international collaborative effort. This genome, along with its accurate and comprehensive annotation, has become an indispensable foundation for crop genomics and breeding. With the development of innovative sequencing technologies, genomic studies of O. sativa have dramatically increased; in particular, a large number of cultivars and wild accessions have been sequenced and compared with the reference rice genome. Since de novo genome sequencing has become cost-effective, the genome of African cultivated rice, O. glaberrima, has also been determined. Comparative genomic studies have highlighted the independent domestication processes of different rice species, but it also turned out that Asian and African rice share a common gene set that has experienced similar artificial selection. An international project aimed at constructing reference genomes and examining the genome diversity of wild Oryza species is currently underway, and the genomes of some species are publicly available. This project provides a platform for investigations such as the evolution, development, polyploidization, and improvement of crops. Studies on the genomic diversity of Oryza species, including wild species, should provide new insights to solve the problem of growing food demands in the face of rapid climatic changes.

  9. Genome Sequences of Oryza Species

    KAUST Repository

    Kumagai, Masahiko

    2018-02-14

    This chapter summarizes recent data obtained from genome sequencing, annotation projects, and studies on the genome diversity of Oryza sativa and related Oryza species. O. sativa, commonly known as Asian rice, is the first monocot species whose complete genome sequence was deciphered based on physical mapping by an international collaborative effort. This genome, along with its accurate and comprehensive annotation, has become an indispensable foundation for crop genomics and breeding. With the development of innovative sequencing technologies, genomic studies of O. sativa have dramatically increased; in particular, a large number of cultivars and wild accessions have been sequenced and compared with the reference rice genome. Since de novo genome sequencing has become cost-effective, the genome of African cultivated rice, O. glaberrima, has also been determined. Comparative genomic studies have highlighted the independent domestication processes of different rice species, but it also turned out that Asian and African rice share a common gene set that has experienced similar artificial selection. An international project aimed at constructing reference genomes and examining the genome diversity of wild Oryza species is currently underway, and the genomes of some species are publicly available. This project provides a platform for investigations such as the evolution, development, polyploidization, and improvement of crops. Studies on the genomic diversity of Oryza species, including wild species, should provide new insights to solve the problem of growing food demands in the face of rapid climatic changes.

  10. Genome position specific priors for genomic prediction

    DEFF Research Database (Denmark)

    Brøndum, Rasmus Froberg; Su, Guosheng; Lund, Mogens Sandø

    2012-01-01

    casual mutation is different between the populations but affects the same gene. Proportions of a four-distribution mixture for SNP effects in segments of fixed size along the genome are derived from one population and set as location specific prior proportions of distributions of SNP effects...... for the target population. The model was tested using dairy cattle populations of different breeds: 540 Australian Jersey bulls, 2297 Australian Holstein bulls and 5214 Nordic Holstein bulls. The traits studied were protein-, fat- and milk yield. Genotypic data was Illumina 777K SNPs, real or imputed Results...

  11. Genomics of Volvocine Algae

    Science.gov (United States)

    Umen, James G.; Olson, Bradley J.S.C.

    2015-01-01

    Volvocine algae are a group of chlorophytes that together comprise a unique model for evolutionary and developmental biology. The species Chlamydomonas reinhardtii and Volvox carteri represent extremes in morphological diversity within the Volvocine clade. Chlamydomonas is unicellular and reflects the ancestral state of the group, while Volvox is multicellular and has evolved numerous innovations including germ-soma differentiation, sexual dimorphism, and complex morphogenetic patterning. The Chlamydomonas genome sequence has shed light on several areas of eukaryotic cell biology, metabolism and evolution, while the Volvox genome sequence has enabled a comparison with Chlamydomonas that reveals some of the underlying changes that enabled its transition to multicellularity, but also underscores the subtlety of this transition. Many of the tools and resources are in place to further develop Volvocine algae as a model for evolutionary genomics. PMID:25883411

  12. Genomics of Preterm Birth

    Science.gov (United States)

    Swaggart, Kayleigh A.; Pavlicev, Mihaela; Muglia, Louis J.

    2015-01-01

    The molecular mechanisms controlling human birth timing at term, or resulting in preterm birth, have been the focus of considerable investigation, but limited insights have been gained over the past 50 years. In part, these processes have remained elusive because of divergence in reproductive strategies and physiology shown by model organisms, making extrapolation to humans uncertain. Here, we summarize the evolution of progesterone signaling and variation in pregnancy maintenance and termination. We use this comparative physiology to support the hypothesis that selective pressure on genomic loci involved in the timing of parturition have shaped human birth timing, and that these loci can be identified with comparative genomic strategies. Previous limitations imposed by divergence of mechanisms provide an important new opportunity to elucidate fundamental pathways of parturition control through increasing availability of sequenced genomes and associated reproductive physiology characteristics across diverse organisms. PMID:25646385

  13. Genomics of Salmonella Species

    Science.gov (United States)

    Canals, Rocio; McClelland, Michael; Santiviago, Carlos A.; Andrews-Polymenis, Helene

    Progress in the study of Salmonella survival, colonization, and virulence has increased rapidly with the advent of complete genome sequencing and higher capacity assays for transcriptomic and proteomic analysis. Although many of these techniques have yet to be used to directly assay Salmonella growth on foods, these assays are currently in use to determine Salmonella factors necessary for growth in animal models including livestock animals and in in vitro conditions that mimic many different environments. As sequencing of the Salmonella genome and microarray analysis have revolutionized genomics and transcriptomics of salmonellae over the last decade, so are new high-throughput sequencing technologies currently accelerating the pace of our studies and allowing us to approach complex problems that were not previously experimentally tractable.

  14. Ebolavirus comparative genomics

    Science.gov (United States)

    Jun, Se-Ran; Leuze, Michael R.; Nookaew, Intawat; Uberbacher, Edward C.; Land, Miriam; Zhang, Qian; Wanchai, Visanu; Chai, Juanjuan; Nielsen, Morten; Trolle, Thomas; Lund, Ole; Buzard, Gregory S.; Pedersen, Thomas D.; Wassenaar, Trudy M.; Ussery, David W.

    2015-01-01

    The 2014 Ebola outbreak in West Africa is the largest documented for this virus. To examine the dynamics of this genome, we compare more than 100 currently available ebolavirus genomes to each other and to other viral genomes. Based on oligomer frequency analysis, the family Filoviridae forms a distinct group from all other sequenced viral genomes. All filovirus genomes sequenced to date encode proteins with similar functions and gene order, although there is considerable divergence in sequences between the three genera Ebolavirus, Cuevavirus and Marburgvirus within the family Filoviridae. Whereas all ebolavirus genomes are quite similar (multiple sequences of the same strain are often identical), variation is most common in the intergenic regions and within specific areas of the genes encoding the glycoprotein (GP), nucleoprotein (NP) and polymerase (L). We predict regions that could contain epitope-binding sites, which might be good vaccine targets. This information, combined with glycosylation sites and experimentally determined epitopes, can identify the most promising regions for the development of therapeutic strategies. This manuscript has been authored by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725 with the U.S. Department of Energy. The United States Government retains and the publisher, by accepting the article for publication, acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes. The Department of Energy will provide public access to these results of federally sponsored research in accordance with the DOE Public Access Plan (http://energy.gov/downloads/doe-public-access-plan). PMID:26175035

  15. Brief Guide to Genomics: DNA, Genes and Genomes

    Science.gov (United States)

    ... clinic. Most new drugs based on genome-based research are estimated to be at least 10 to 15 years away, though recent genome-driven efforts in lipid-lowering therapy have considerably shortened that interval. According ...

  16. Genomic Prediction in Barley

    DEFF Research Database (Denmark)

    Edriss, Vahid; Cericola, Fabio; Jensen, Jens D

    2015-01-01

    to next generation. The main goal of this study was to see the potential of using genomic prediction in a commercial Barley breeding program. The data used in this study was from Nordic Seed company which is located in Denmark. Around 350 advanced lines were genotyped with 9K Barely chip from Illumina....... Traits used in this study were grain yield, plant height and heading date. Heading date is number days it takes after 1st June for plant to head. Heritabilities were 0.33, 0.44 and 0.48 for yield, height and heading, respectively for the average of nine plots. The GBLUP model was used for genomic...

  17. Comparative Genomics Reveals High Genomic Diversity in the Genus Photobacterium.

    Science.gov (United States)

    Machado, Henrique; Gram, Lone

    2017-01-01

    Vibrionaceae is a large marine bacterial family, which can constitute up to 50% of the prokaryotic population in marine waters. Photobacterium is the second largest genus in the family and we used comparative genomics on 35 strains representing 16 of the 28 species described so far, to understand the genomic diversity present in the Photobacterium genus. Such understanding is important for ecophysiology studies of the genus. We used whole genome sequences to evaluate phylogenetic relationships using several analyses (16S rRNA, MLSA, fur , amino-acid usage, ANI), which allowed us to identify two misidentified strains. Genome analyses also revealed occurrence of higher and lower GC content clades, correlating with phylogenetic clusters. Pan- and core-genome analysis revealed the conservation of 25% of the genome throughout the genus, with a large and open pan-genome. The major source of genomic diversity could be traced to the smaller chromosome and plasmids. Several of the physiological traits studied in the genus did not correlate with phylogenetic data. Since horizontal gene transfer (HGT) is often suggested as a source of genetic diversity and a potential driver of genomic evolution in bacterial species, we looked into evidence of such in Photobacterium genomes. Genomic islands were the source of genomic differences between strains of the same species. Also, we found transposase genes and CRISPR arrays that suggest multiple encounters with foreign DNA. Presence of genomic exchange traits was widespread and abundant in the genus, suggesting a role in genomic evolution. The high genetic variability and indications of genetic exchange make it difficult to elucidate genome evolutionary paths and raise the awareness of the roles of foreign DNA in the genomic evolution of environmental organisms.

  18. phiGENOME: an integrative navigation throughout bacteriophage genomes.

    Science.gov (United States)

    Stano, Matej; Klucar, Lubos

    2011-11-01

    phiGENOME is a web-based genome browser generating dynamic and interactive graphical representation of phage genomes stored in the phiSITE, database of gene regulation in bacteriophages. phiGENOME is an integral part of the phiSITE web portal (http://www.phisite.org/phigenome) and it was optimised for visualisation of phage genomes with the emphasis on the gene regulatory elements. phiGENOME consists of three components: (i) genome map viewer built using Adobe Flash technology, providing dynamic and interactive graphical display of phage genomes; (ii) sequence browser based on precisely formatted HTML tags, providing detailed exploration of genome features on the sequence level and (iii) regulation illustrator, based on Scalable Vector Graphics (SVG) and designed for graphical representation of gene regulations. Bringing 542 complete genome sequences accompanied with their rich annotations and references, makes phiGENOME a unique information resource in the field of phage genomics. Copyright © 2011 Elsevier Inc. All rights reserved.

  19. Illuminating the Druggable Genome (IDG)

    Data.gov (United States)

    Federal Laboratory Consortium — Results from the Human Genome Project revealed that the human genome contains 20,000 to 25,000 genes. A gene contains (encodes) the information that each cell uses...

  20. National Human Genome Research Institute

    Science.gov (United States)

    ... Care Genomic Medicine Working Group New Horizons and Research Patient Management Policy and Ethics Issues Quick Links for Patient Care Education All About the Human Genome Project Fact Sheets Genetic Education Resources for ...

  1. Genomic prediction using subsampling.

    Science.gov (United States)

    Xavier, Alencar; Xu, Shizhong; Muir, William; Rainey, Katy Martin

    2017-03-24

    Genome-wide assisted selection is a critical tool for the genetic improvement of plants and animals. Whole-genome regression models in Bayesian framework represent the main family of prediction methods. Fitting such models with a large number of observations involves a prohibitive computational burden. We propose the use of subsampling bootstrap Markov chain in genomic prediction. Such method consists of fitting whole-genome regression models by subsampling observations in each round of a Markov Chain Monte Carlo. We evaluated the effect of subsampling bootstrap on prediction and computational parameters. Across datasets, we observed an optimal subsampling proportion of observations around 50% with replacement, and around 33% without replacement. Subsampling provided a substantial decrease in computation time, reducing the time to fit the model by half. On average, losses on predictive properties imposed by subsampling were negligible, usually below 1%. For each dataset, an optimal subsampling point that improves prediction properties was observed, but the improvements were also negligible. Combining subsampling with Gibbs sampling is an interesting ensemble algorithm. The investigation indicates that the subsampling bootstrap Markov chain algorithm substantially reduces computational burden associated with model fitting, and it may slightly enhance prediction properties.

  2. The Lotus japonicus genome

    DEFF Research Database (Denmark)

    Fabaceae, groundbreaking genetic and genomic research has established a significant body of knowledge on Lotus japonicus, which was adopted as a model species more than 20 years ago. The diverse nature of legumes means that such research has a wide potential and agricultural impact, for example...

  3. Genomic taxonomy of vibrios

    DEFF Research Database (Denmark)

    Thompson, Cristiane C.; Vicente, Ana Carolina P.; Souza, Rangel C.

    2009-01-01

    BACKGROUND: Vibrio taxonomy has been based on a polyphasic approach. In this study, we retrieve useful taxonomic information (i.e. data that can be used to distinguish different taxonomic levels, such as species and genera) from 32 genome sequences of different vibrio species. We use a variety of...

  4. The Genome Atlas Resource

    DEFF Research Database (Denmark)

    Azam Qureshi, Matloob; Rotenberg, Eva; Stærfeldt, Hans Henrik

    2010-01-01

    with scripts and algorithms developed in a variety of programming languages at the Centre for Biological Sequence Analysis in order to create a three-tier software application for genome analysis. The results are made available via a web interface developed in Java, PHP and Perl CGI. User...

  5. Genomic Signatures of Reinforcement

    Directory of Open Access Journals (Sweden)

    Austin G. Garner

    2018-04-01

    Full Text Available Reinforcement is the process by which selection against hybridization increases reproductive isolation between taxa. Much research has focused on demonstrating the existence of reinforcement, yet relatively little is known about the genetic basis of reinforcement or the evolutionary conditions under which reinforcement can occur. Inspired by reinforcement’s characteristic phenotypic pattern of reproductive trait divergence in sympatry but not in allopatry, we discuss whether reinforcement also leaves a distinct genomic pattern. First, we describe three patterns of genetic variation we expect as a consequence of reinforcement. Then, we discuss a set of alternative processes and complicating factors that may make the identification of reinforcement at the genomic level difficult. Finally, we consider how genomic analyses can be leveraged to inform if and to what extent reinforcement evolved in the face of gene flow between sympatric lineages and between allopatric and sympatric populations of the same lineage. Our major goals are to understand if genome scans for particular patterns of genetic variation could identify reinforcement, isolate the genetic basis of reinforcement, or infer the conditions under which reinforcement evolved.

  6. Better chocolate through genomics

    Science.gov (United States)

    Theobroma cacao, the cacao or chocolate tree, is a tropical understory tree whose seeds are used to make chocolate. And like any important crop, cacao is the subject of much research. On September 15, 2010, scientists publicly released a preliminary sequence of the cacao genome--which contains all o...

  7. Functional genomics of tomato

    Indian Academy of Sciences (India)

    2014-10-20

    Oct 20, 2014 ... 1Repository of Tomato Genomics Resources, Department of Plant Sciences, School .... Due to its position at the crossroads of Sanger's sequencing .... replacement for the microarray-based expression profiling. .... during RNA fragmentation step prior to library construction, ...... tomato pollen as a test case.

  8. Genomic Signatures of Reinforcement

    Science.gov (United States)

    Goulet, Benjamin E.

    2018-01-01

    Reinforcement is the process by which selection against hybridization increases reproductive isolation between taxa. Much research has focused on demonstrating the existence of reinforcement, yet relatively little is known about the genetic basis of reinforcement or the evolutionary conditions under which reinforcement can occur. Inspired by reinforcement’s characteristic phenotypic pattern of reproductive trait divergence in sympatry but not in allopatry, we discuss whether reinforcement also leaves a distinct genomic pattern. First, we describe three patterns of genetic variation we expect as a consequence of reinforcement. Then, we discuss a set of alternative processes and complicating factors that may make the identification of reinforcement at the genomic level difficult. Finally, we consider how genomic analyses can be leveraged to inform if and to what extent reinforcement evolved in the face of gene flow between sympatric lineages and between allopatric and sympatric populations of the same lineage. Our major goals are to understand if genome scans for particular patterns of genetic variation could identify reinforcement, isolate the genetic basis of reinforcement, or infer the conditions under which reinforcement evolved. PMID:29614048

  9. The Nostoc punctiforme Genome

    Energy Technology Data Exchange (ETDEWEB)

    John C. Meeks

    2001-12-31

    Nostoc punctiforme is a filamentous cyanobacterium with extensive phenotypic characteristics and a relatively large genome, approaching 10 Mb. The phenotypic characteristics include a photoautotrophic, diazotrophic mode of growth, but N. punctiforme is also facultatively heterotrophic; its vegetative cells have multiple development alternatives, including terminal differentiation into nitrogen-fixing heterocysts and transient differentiation into spore-like akinetes or motile filaments called hormogonia; and N. punctiforme has broad symbiotic competence with fungi and terrestrial plants, including bryophytes, gymnosperms and an angiosperm. The shotgun-sequencing phase of the N. punctiforme strain ATCC 29133 genome has been completed by the Joint Genome Institute. Annotation of an 8.9 Mb database yielded 7432 open reading frames, 45% of which encode proteins with known or probable known function and 29% of which are unique to N. punctiforme. Comparative analysis of the sequence indicates a genome that is highly plastic and in a state of flux, with numerous insertion sequences and multilocus repeats, as well as genes encoding transposases and DNA modification enzymes. The sequence also reveals the presence of genes encoding putative proteins that collectively define almost all characteristics of cyanobacteria as a group. N. punctiforme has an extensive potential to sense and respond to environmental signals as reflected by the presence of more than 400 genes encoding sensor protein kinases, response regulators and other transcriptional factors. The signal transduction systems and any of the large number of unique genes may play essential roles in the cell differentiation and symbiotic interaction properties of N. punctiforme.

  10. Comparative Genomics of Eukaryotes.

    NARCIS (Netherlands)

    Noort, V. van

    2007-01-01

    This thesis focuses on developing comparative genomics methods in eukaryotes, with an emphasis on applications for gene function prediction and regulatory element detection. In the past, methods have been developed to predict functional associations between gene pairs in prokaryotes. The challenge

  11. Searching for genomic constraints

    Energy Technology Data Exchange (ETDEWEB)

    Lio` , P [Cambridge, Univ. (United Kingdom). Genetics Dept.; Ruffo, S [Florence, Univ. (Italy). Fac. di Ingegneria. Dipt. di Energetica ` S. Stecco`

    1998-01-01

    The authors have analyzed general properties of very long DNA sequences belonging to simple and complex organisms, by using different correlation methods. They have distinguished those base compositional rules that concern the entire genome which they call `genomic constraints` from the rules that depend on the `external natural selection` acting on single genes, i. e. protein-centered constraints. They show that G + C content, purine / pyrimidine distributions and biological complexity of the organism are the most important factors which determine base compositional rules and genome complexity. Three main facts are here reported: bacteria with high G + C content have more restrictions on base composition than those with low G + C content; at constant G + C content more complex organisms, ranging from prokaryotes to higher eukaryotes (e.g. human) display an increase of repeats 10-20 nucleotides long, which are also partly responsible for long-range correlations; work selection of length 3 to 10 is stronger in human and in bacteria for two distinct reasons. With respect to previous studies, they have also compared the genomic sequence of the archeon Methanococcus jannaschii with those of bacteria and eukaryotes: it shows sometimes an intermediate statistical behaviour.

  12. Searching for genomic constraints

    International Nuclear Information System (INIS)

    Lio', P.; Ruffo, S.

    1998-01-01

    The authors have analyzed general properties of very long DNA sequences belonging to simple and complex organisms, by using different correlation methods. They have distinguished those base compositional rules that concern the entire genome which they call 'genomic constraints' from the rules that depend on the 'external natural selection' acting on single genes, i. e. protein-centered constraints. They show that G + C content, purine / pyrimidine distributions and biological complexity of the organism are the most important factors which determine base compositional rules and genome complexity. Three main facts are here reported: bacteria with high G + C content have more restrictions on base composition than those with low G + C content; at constant G + C content more complex organisms, ranging from prokaryotes to higher eukaryotes (e.g. human) display an increase of repeats 10-20 nucleotides long, which are also partly responsible for long-range correlations; work selection of length 3 to 10 is stronger in human and in bacteria for two distinct reasons. With respect to previous studies, they have also compared the genomic sequence of the archeon Methanococcus jannaschii with those of bacteria and eukaryotes: it shows sometimes an intermediate statistical behaviour

  13. Genomic sequencing in clinical trials

    OpenAIRE

    Mestan, Karen K; Ilkhanoff, Leonard; Mouli, Samdeep; Lin, Simon

    2011-01-01

    Abstract Human genome sequencing is the process by which the exact order of nucleic acid base pairs in the 24 human chromosomes is determined. Since the completion of the Human Genome Project in 2003, genomic sequencing is rapidly becoming a major part of our translational research efforts to understand and improve human health and disease. This article reviews the current and future directions of clinical research with respect to genomic sequencing, a technology that is just beginning to fin...

  14. Statistical Methods in Integrative Genomics

    Science.gov (United States)

    Richardson, Sylvia; Tseng, George C.; Sun, Wei

    2016-01-01

    Statistical methods in integrative genomics aim to answer important biology questions by jointly analyzing multiple types of genomic data (vertical integration) or aggregating the same type of data across multiple studies (horizontal integration). In this article, we introduce different types of genomic data and data resources, and then review statistical methods of integrative genomics, with emphasis on the motivation and rationale of these methods. We conclude with some summary points and future research directions. PMID:27482531

  15. From plant genomes to phenotypes

    OpenAIRE

    Bolger, Marie; Gundlach, Heidrun; Scholz, Uwe; Mayer, Klaus; Usadel, Björn; Schwacke, Rainer; Schmutzer, Thomas; Chen, Jinbo; Arend, Daniel; Oppermann, Markus; Weise, Stephan; Lange, Matthias; Fiorani, Fabio; Spannagl, Manuel

    2017-01-01

    Recent advances in sequencing technologies have greatly accelerated the rate of plant genome and applied breeding research. Despite this advancing trend, plant genomes continue to present numerous difficulties to the standard tools and pipelines not only for genome assembly but also gene annotation and downstream analysis.Here we give a perspective on tools, resources and services necessary to assemble and analyze plant genomes and link them to plant phenotypes.

  16. A Thousand Fly Genomes: An Expanded Drosophila Genome Nexus.

    Science.gov (United States)

    Lack, Justin B; Lange, Jeremy D; Tang, Alison D; Corbett-Detig, Russell B; Pool, John E

    2016-12-01

    The Drosophila Genome Nexus is a population genomic resource that provides D. melanogaster genomes from multiple sources. To facilitate comparisons across data sets, genomes are aligned using a common reference alignment pipeline which involves two rounds of mapping. Regions of residual heterozygosity, identity-by-descent, and recent population admixture are annotated to enable data filtering based on the user's needs. Here, we present a significant expansion of the Drosophila Genome Nexus, which brings the current data object to a total of 1,121 wild-derived genomes. New additions include 305 previously unpublished genomes from inbred lines representing six population samples in Egypt, Ethiopia, France, and South Africa, along with another 193 genomes added from recently-published data sets. We also provide an aligned D. simulans genome to facilitate divergence comparisons. This improved resource will broaden the range of population genomic questions that can addressed from multi-population allele frequencies and haplotypes in this model species. The larger set of genomes will also enhance the discovery of functionally relevant natural variation that exists within and between populations. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  17. The perennial ryegrass GenomeZipper: targeted use of genome resources for comparative grass genomics.

    Science.gov (United States)

    Pfeifer, Matthias; Martis, Mihaela; Asp, Torben; Mayer, Klaus F X; Lübberstedt, Thomas; Byrne, Stephen; Frei, Ursula; Studer, Bruno

    2013-02-01

    Whole-genome sequences established for model and major crop species constitute a key resource for advanced genomic research. For outbreeding forage and turf grass species like ryegrasses (Lolium spp.), such resources have yet to be developed. Here, we present a model of the perennial ryegrass (Lolium perenne) genome on the basis of conserved synteny to barley (Hordeum vulgare) and the model grass genome Brachypodium (Brachypodium distachyon) as well as rice (Oryza sativa) and sorghum (Sorghum bicolor). A transcriptome-based genetic linkage map of perennial ryegrass served as a scaffold to establish the chromosomal arrangement of syntenic genes from model grass species. This scaffold revealed a high degree of synteny and macrocollinearity and was then utilized to anchor a collection of perennial ryegrass genes in silico to their predicted genome positions. This resulted in the unambiguous assignment of 3,315 out of 8,876 previously unmapped genes to the respective chromosomes. In total, the GenomeZipper incorporates 4,035 conserved grass gene loci, which were used for the first genome-wide sequence divergence analysis between perennial ryegrass, barley, Brachypodium, rice, and sorghum. The perennial ryegrass GenomeZipper is an ordered, information-rich genome scaffold, facilitating map-based cloning and genome assembly in perennial ryegrass and closely related Poaceae species. It also represents a milestone in describing synteny between perennial ryegrass and fully sequenced model grass genomes, thereby increasing our understanding of genome organization and evolution in the most important temperate forage and turf grass species.

  18. Implementing genomics and pharmacogenomics in the clinic: The National Human Genome Research Institute's genomic medicine portfolio.

    Science.gov (United States)

    Manolio, Teri A

    2016-10-01

    Increasing knowledge about the influence of genetic variation on human health and growing availability of reliable, cost-effective genetic testing have spurred the implementation of genomic medicine in the clinic. As defined by the National Human Genome Research Institute (NHGRI), genomic medicine uses an individual's genetic information in his or her clinical care, and has begun to be applied effectively in areas such as cancer genomics, pharmacogenomics, and rare and undiagnosed diseases. In 2011 NHGRI published its strategic vision for the future of genomic research, including an ambitious research agenda to facilitate and promote the implementation of genomic medicine. To realize this agenda, NHGRI is consulting and facilitating collaborations with the external research community through a series of "Genomic Medicine Meetings," under the guidance and leadership of the National Advisory Council on Human Genome Research. These meetings have identified and begun to address significant obstacles to implementation, such as lack of evidence of efficacy, limited availability of genomics expertise and testing, lack of standards, and difficulties in integrating genomic results into electronic medical records. The six research and dissemination initiatives comprising NHGRI's genomic research portfolio are designed to speed the evaluation and incorporation, where appropriate, of genomic technologies and findings into routine clinical care. Actual adoption of successful approaches in clinical care will depend upon the willingness, interest, and energy of professional societies, practitioners, patients, and payers to promote their responsible use and share their experiences in doing so. Published by Elsevier Ireland Ltd.

  19. Applied Genomics of Foodborne Pathogens

    DEFF Research Database (Denmark)

    and customized source of information designed for and accessible to microbiologists interested in applying cutting-edge genomics in food safety and public health research. This book fills this void with a well-selected collection of topics, case studies, and bioinformatics tools contributed by experts......This book provides a timely and thorough snapshot into the emerging and fast evolving area of applied genomics of foodborne pathogens. Driven by the drastic advance of whole genome shot gun sequencing (WGS) technologies, genomics applications are becoming increasingly valuable and even essential...... at the forefront of foodborne pathogen genomics research....

  20. Chromatin dynamics in genome stability

    DEFF Research Database (Denmark)

    Nair, Nidhi; Shoaib, Muhammad; Sørensen, Claus Storgaard

    2017-01-01

    Genomic DNA is compacted into chromatin through packaging with histone and non-histone proteins. Importantly, DNA accessibility is dynamically regulated to ensure genome stability. This is exemplified in the response to DNA damage where chromatin relaxation near genomic lesions serves to promote...... access of relevant enzymes to specific DNA regions for signaling and repair. Furthermore, recent data highlight genome maintenance roles of chromatin through the regulation of endogenous DNA-templated processes including transcription and replication. Here, we review research that shows the importance...... of chromatin structure regulation in maintaining genome integrity by multiple mechanisms including facilitating DNA repair and directly suppressing endogenous DNA damage....

  1. Evolution of small prokaryotic genomes

    Directory of Open Access Journals (Sweden)

    David José Martínez-Cano

    2015-01-01

    Full Text Available As revealed by genome sequencing, the biology of prokaryotes with reduced genomes is strikingly diverse. These include free-living prokaryotes with ~800 genes as well as endosymbiotic bacteria with as few as ~140 genes. Comparative genomics is revealing the evolutionary mechanisms that led to these small genomes. In the case of free-living prokaryotes, natural selection directly favored genome reduction, while in the case of endosymbiotic prokaryotes neutral processes played a more prominent role. However, new experimental data suggest that selective processes may be at operation as well for endosymbiotic prokaryotes at least during the first stages of genome reduction. Endosymbiotic prokaryotes have evolved diverse strategies for living with reduced gene sets inside a host-defined medium. These include utilization of host-encoded functions (some of them coded by genes acquired by gene transfer from the endosymbiont and/or other bacteria; metabolic complementation between co-symbionts; and forming consortiums with other bacteria within the host. Recent genome sequencing projects of intracellular mutualistic bacteria showed that previously believed universal evolutionary trends like reduced G+C content and conservation of genome synteny are not always present in highly reduced genomes. Finally, the simplified molecular machinery of some of these organisms with small genomes may be used to aid in the design of artificial minimal cells. Here we review recent genomic discoveries of the biology of prokaryotes endowed with small gene sets and discuss the evolutionary mechanisms that have been proposed to explain their peculiar nature.

  2. Informational laws of genome structures

    Science.gov (United States)

    Bonnici, Vincenzo; Manca, Vincenzo

    2016-06-01

    In recent years, the analysis of genomes by means of strings of length k occurring in the genomes, called k-mers, has provided important insights into the basic mechanisms and design principles of genome structures. In the present study, we focus on the proper choice of the value of k for applying information theoretic concepts that express intrinsic aspects of genomes. The value k = lg2(n), where n is the genome length, is determined to be the best choice in the definition of some genomic informational indexes that are studied and computed for seventy genomes. These indexes, which are based on information entropies and on suitable comparisons with random genomes, suggest five informational laws, to which all of the considered genomes obey. Moreover, an informational genome complexity measure is proposed, which is a generalized logistic map that balances entropic and anti-entropic components of genomes and is related to their evolutionary dynamics. Finally, applications to computational synthetic biology are briefly outlined.

  3. Toward genome-enabled mycology.

    Science.gov (United States)

    Hibbett, David S; Stajich, Jason E; Spatafora, Joseph W

    2013-01-01

    Genome-enabled mycology is a rapidly expanding field that is characterized by the pervasive use of genome-scale data and associated computational tools in all aspects of fungal biology. Genome-enabled mycology is integrative and often requires teams of researchers with diverse skills in organismal mycology, bioinformatics and molecular biology. This issue of Mycologia presents the first complete fungal genomes in the history of the journal, reflecting the ongoing transformation of mycology into a genome-enabled science. Here, we consider the prospects for genome-enabled mycology and the technical and social challenges that will need to be overcome to grow the database of complete fungal genomes and enable all fungal biologists to make use of the new data.

  4. Genomic research perspectives in Kazakhstan

    Directory of Open Access Journals (Sweden)

    Ainur Akilzhanova

    2014-01-01

    Full Text Available Introduction: Technological advancements rapidly propel the field of genome research. Advances in genetics and genomics such as the sequence of the human genome, the human haplotype map, open access databases, cheaper genotyping and chemical genomics, have transformed basic and translational biomedical research. Several projects in the field of genomic and personalized medicine have been conducted at the Center for Life Sciences in Nazarbayev University. The prioritized areas of research include: genomics of multifactorial diseases, cancer genomics, bioinformatics, genetics of infectious diseases and population genomics. At present, DNA-based risk assessment for common complex diseases, application of molecular signatures for cancer diagnosis and prognosis, genome-guided therapy, and dose selection of therapeutic drugs are the important issues in personalized medicine. Results: To further develop genomic and biomedical projects at Center for Life Sciences, the development of bioinformatics research and infrastructure and the establishment of new collaborations in the field are essential. Widespread use of genetic tools will allow the identification of diseases before the onset of clinical symptoms, the individualization of drug treatment, and could induce individual behavioral changes on the basis of calculated disease risk. However, many challenges remain for the successful translation of genomic knowledge and technologies into health advances, such as medicines and diagnostics. It is important to integrate research and education in the fields of genomics, personalized medicine, and bioinformatics, which will be possible with opening of the new Medical Faculty at Nazarbayev University. People in practice and training need to be educated about the key concepts of genomics and engaged so they can effectively apply their knowledge in a matter that will bring the era of genomic medicine to patient care. This requires the development of well

  5. Mycobacteriophage genome database.

    Science.gov (United States)

    Joseph, Jerrine; Rajendran, Vasanthi; Hassan, Sameer; Kumar, Vanaja

    2011-01-01

    Mycobacteriophage genome database (MGDB) is an exclusive repository of the 64 completely sequenced mycobacteriophages with annotated information. It is a comprehensive compilation of the various gene parameters captured from several databases pooled together to empower mycobacteriophage researchers. The MGDB (Version No.1.0) comprises of 6086 genes from 64 mycobacteriophages classified into 72 families based on ACLAME database. Manual curation was aided by information available from public databases which was enriched further by analysis. Its web interface allows browsing as well as querying the classification. The main objective is to collect and organize the complexity inherent to mycobacteriophage protein classification in a rational way. The other objective is to browse the existing and new genomes and describe their functional annotation. The database is available for free at http://mpgdb.ibioinformatics.org/mpgdb.php.

  6. Precision genome editing

    DEFF Research Database (Denmark)

    Steentoft, Catharina; Bennett, Eric P; Schjoldager, Katrine Ter-Borch Gram

    2014-01-01

    Precise and stable gene editing in mammalian cell lines has until recently been hampered by the lack of efficient targeting methods. While different gene silencing strategies have had tremendous impact on many biological fields, they have generally not been applied with wide success in the field...... of glycobiology, primarily due to their low efficiencies, with resultant failure to impose substantial phenotypic consequences upon the final glycosylation products. Here, we review novel nuclease-based precision genome editing techniques enabling efficient and stable gene editing, including gene disruption...... by introducing single or double-stranded breaks at a defined genomic sequence. We here compare and contrast the different techniques and summarize their current applications, highlighting cases from the field of glycobiology as well as pointing to future opportunities. The emerging potential of precision gene...

  7. Alignment of whole genomes.

    Science.gov (United States)

    Delcher, A L; Kasif, S; Fleischmann, R D; Peterson, J; White, O; Salzberg, S L

    1999-01-01

    A new system for aligning whole genome sequences is described. Using an efficient data structure called a suffix tree, the system is able to rapidly align sequences containing millions of nucleotides. Its use is demonstrated on two strains of Mycoplasma tuberculosis, on two less similar species of Mycoplasma bacteria and on two syntenic sequences from human chromosome 12 and mouse chromosome 6. In each case it found an alignment of the input sequences, using between 30 s and 2 min of computation time. From the system output, information on single nucleotide changes, translocations and homologous genes can easily be extracted. Use of the algorithm should facilitate analysis of syntenic chromosomal regions, strain-to-strain comparisons, evolutionary comparisons and genomic duplications. PMID:10325427

  8. eGenomics: Cataloguing Our Complete Genome Collection III

    Directory of Open Access Journals (Sweden)

    Dawn Field

    2007-01-01

    Full Text Available This meeting report summarizes the proceedings of the “eGenomics: Cataloguing our Complete Genome Collection III” workshop held September 11–13, 2006, at the National Institute for Environmental eScience (NIEeS, Cambridge, United Kingdom. This 3rd workshop of the Genomic Standards Consortium was divided into two parts. The first half of the three-day workshop was dedicated to reviewing the genomic diversity of our current and future genome and metagenome collection, and exploring linkages to a series of existing projects through formal presentations. The second half was dedicated to strategic discussions. Outcomes of the workshop include a revised “Minimum Information about a Genome Sequence” (MIGS specification (v1.1, consensus on a variety of features to be added to the Genome Catalogue (GCat, agreement by several researchers to adopt MIGS for imminent genome publications, and an agreement by the EBI and NCBI to input their genome collections into GCat for the purpose of quantifying the amount of optional data already available (e.g., for geographic location coordinates and working towards a single, global list of all public genomes and metagenomes.

  9. Genomics Portals: integrative web-platform for mining genomics data.

    Science.gov (United States)

    Shinde, Kaustubh; Phatak, Mukta; Johannes, Freudenberg M; Chen, Jing; Li, Qian; Vineet, Joshi K; Hu, Zhen; Ghosh, Krishnendu; Meller, Jaroslaw; Medvedovic, Mario

    2010-01-13

    A large amount of experimental data generated by modern high-throughput technologies is available through various public repositories. Our knowledge about molecular interaction networks, functional biological pathways and transcriptional regulatory modules is rapidly expanding, and is being organized in lists of functionally related genes. Jointly, these two sources of information hold a tremendous potential for gaining new insights into functioning of living systems. Genomics Portals platform integrates access to an extensive knowledge base and a large database of human, mouse, and rat genomics data with basic analytical visualization tools. It provides the context for analyzing and interpreting new experimental data and the tool for effective mining of a large number of publicly available genomics datasets stored in the back-end databases. The uniqueness of this platform lies in the volume and the diversity of genomics data that can be accessed and analyzed (gene expression, ChIP-chip, ChIP-seq, epigenomics, computationally predicted binding sites, etc), and the integration with an extensive knowledge base that can be used in such analysis. The integrated access to primary genomics data, functional knowledge and analytical tools makes Genomics Portals platform a unique tool for interpreting results of new genomics experiments and for mining the vast amount of data stored in the Genomics Portals backend databases. Genomics Portals can be accessed and used freely at http://GenomicsPortals.org.

  10. Genomics Portals: integrative web-platform for mining genomics data

    Directory of Open Access Journals (Sweden)

    Ghosh Krishnendu

    2010-01-01

    Full Text Available Abstract Background A large amount of experimental data generated by modern high-throughput technologies is available through various public repositories. Our knowledge about molecular interaction networks, functional biological pathways and transcriptional regulatory modules is rapidly expanding, and is being organized in lists of functionally related genes. Jointly, these two sources of information hold a tremendous potential for gaining new insights into functioning of living systems. Results Genomics Portals platform integrates access to an extensive knowledge base and a large database of human, mouse, and rat genomics data with basic analytical visualization tools. It provides the context for analyzing and interpreting new experimental data and the tool for effective mining of a large number of publicly available genomics datasets stored in the back-end databases. The uniqueness of this platform lies in the volume and the diversity of genomics data that can be accessed and analyzed (gene expression, ChIP-chip, ChIP-seq, epigenomics, computationally predicted binding sites, etc, and the integration with an extensive knowledge base that can be used in such analysis. Conclusion The integrated access to primary genomics data, functional knowledge and analytical tools makes Genomics Portals platform a unique tool for interpreting results of new genomics experiments and for mining the vast amount of data stored in the Genomics Portals backend databases. Genomics Portals can be accessed and used freely at http://GenomicsPortals.org.

  11. Family genome browser: visualizing genomes with pedigree information.

    Science.gov (United States)

    Juan, Liran; Liu, Yongzhuang; Wang, Yongtian; Teng, Mingxiang; Zang, Tianyi; Wang, Yadong

    2015-07-15

    Families with inherited diseases are widely used in Mendelian/complex disease studies. Owing to the advances in high-throughput sequencing technologies, family genome sequencing becomes more and more prevalent. Visualizing family genomes can greatly facilitate human genetics studies and personalized medicine. However, due to the complex genetic relationships and high similarities among genomes of consanguineous family members, family genomes are difficult to be visualized in traditional genome visualization framework. How to visualize the family genome variants and their functions with integrated pedigree information remains a critical challenge. We developed the Family Genome Browser (FGB) to provide comprehensive analysis and visualization for family genomes. The FGB can visualize family genomes in both individual level and variant level effectively, through integrating genome data with pedigree information. Family genome analysis, including determination of parental origin of the variants, detection of de novo mutations, identification of potential recombination events and identical-by-decent segments, etc., can be performed flexibly. Diverse annotations for the family genome variants, such as dbSNP memberships, linkage disequilibriums, genes, variant effects, potential phenotypes, etc., are illustrated as well. Moreover, the FGB can automatically search de novo mutations and compound heterozygous variants for a selected individual, and guide investigators to find high-risk genes with flexible navigation options. These features enable users to investigate and understand family genomes intuitively and systematically. The FGB is available at http://mlg.hit.edu.cn/FGB/. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  12. Human Germline Genome Editing.

    Science.gov (United States)

    Ormond, Kelly E; Mortlock, Douglas P; Scholes, Derek T; Bombard, Yvonne; Brody, Lawrence C; Faucett, W Andrew; Garrison, Nanibaa' A; Hercher, Laura; Isasi, Rosario; Middleton, Anna; Musunuru, Kiran; Shriner, Daniel; Virani, Alice; Young, Caroline E

    2017-08-03

    With CRISPR/Cas9 and other genome-editing technologies, successful somatic and germline genome editing are becoming feasible. To respond, an American Society of Human Genetics (ASHG) workgroup developed this position statement, which was approved by the ASHG Board in March 2017. The workgroup included representatives from the UK Association of Genetic Nurses and Counsellors, Canadian Association of Genetic Counsellors, International Genetic Epidemiology Society, and US National Society of Genetic Counselors. These groups, as well as the American Society for Reproductive Medicine, Asia Pacific Society of Human Genetics, British Society for Genetic Medicine, Human Genetics Society of Australasia, Professional Society of Genetic Counselors in Asia, and Southern African Society for Human Genetics, endorsed the final statement. The statement includes the following positions. (1) At this time, given the nature and number of unanswered scientific, ethical, and policy questions, it is inappropriate to perform germline gene editing that culminates in human pregnancy. (2) Currently, there is no reason to prohibit in vitro germline genome editing on human embryos and gametes, with appropriate oversight and consent from donors, to facilitate research on the possible future clinical applications of gene editing. There should be no prohibition on making public funds available to support this research. (3) Future clinical application of human germline genome editing should not proceed unless, at a minimum, there is (a) a compelling medical rationale, (b) an evidence base that supports its clinical use, (c) an ethical justification, and (d) a transparent public process to solicit and incorporate stakeholder input. Copyright © 2017 American Society of Human Genetics. All rights reserved.

  13. Genomic Prediction from Whole Genome Sequence in Livestock: The 1000 Bull Genomes Project

    DEFF Research Database (Denmark)

    Hayes, Benjamin J; MacLeod, Iona M; Daetwyler, Hans D

    Advantages of using whole genome sequence data to predict genomic estimated breeding values (GEBV) include better persistence of accuracy of GEBV across generations and more accurate GEBV across breeds. The 1000 Bull Genomes Project provides a database of whole genome sequenced key ancestor bulls....... In a dairy data set, predictions using BayesRC and imputed sequence data from 1000 Bull Genomes were 2% more accurate than with 800k data. We could demonstrate the method identified causal mutations in some cases. Further improvements will come from more accurate imputation of sequence variant genotypes...

  14. Genomic technologies in neonatology

    Directory of Open Access Journals (Sweden)

    L. N. Chernova

    2017-01-01

    Full Text Available In recent years, there has been a tremendous trend toward personalized medicine. Advances in the field forced clinicians, including neonatologists, to take a fresh look at prevention, tactics of management and therapy of various diseases. In the center of attention of foreign, and increasingly Russian, researchers and doctors, there are individual genomic data that allow not only to assess the risks of some form of pathology, but also to successfully apply personalized strategies of prediction, prevention and targeted treatment. This article provides a brief review of the latest achievements of genomic technologies in newborns, examines the problems and potential applications of genomics in promoting the concept of personalized medicine in neonatology. The increasing amount of personalized data simply impossible to analyze only by the human mind. In this connection, the need of computers and bioinformatics is obvious. The article reveals the role of translational bioinformatics in the analysis and integration of the results of the accumulated fundamental research into complete clinical decisions. The latest advances in neonatal translational bioinformatics such as clinical decision support systems are considered. It helps to monitor vital parameters of newborns influencing the course of a particular disease, to calculate the increased risks of the development of various pathologies and to select the drugs.

  15. Value-based genomics.

    Science.gov (United States)

    Gong, Jun; Pan, Kathy; Fakih, Marwan; Pal, Sumanta; Salgia, Ravi

    2018-03-20

    Advancements in next-generation sequencing have greatly enhanced the development of biomarker-driven cancer therapies. The affordability and availability of next-generation sequencers have allowed for the commercialization of next-generation sequencing platforms that have found widespread use for clinical-decision making and research purposes. Despite the greater availability of tumor molecular profiling by next-generation sequencing at our doorsteps, the achievement of value-based care, or improving patient outcomes while reducing overall costs or risks, in the era of precision oncology remains a looming challenge. In this review, we highlight available data through a pre-established and conceptualized framework for evaluating value-based medicine to assess the cost (efficiency), clinical benefit (effectiveness), and toxicity (safety) of genomic profiling in cancer care. We also provide perspectives on future directions of next-generation sequencing from targeted panels to whole-exome or whole-genome sequencing and describe potential strategies needed to attain value-based genomics.

  16. Comparative Genomics Reveals High Genomic Diversity in the Genus Photobacterium

    DEFF Research Database (Denmark)

    Machado, Henrique; Gram, Lone

    2017-01-01

    was widespread and abundant in the genus, suggesting a role in genomic evolution. The high genetic variability and indications of genetic exchange make it difficult to elucidate genome evolutionary paths and raise the awareness of the roles of foreign DNA in the genomic evolution of environmental organisms.......Vibrionaceae is a large marine bacterial family, which can constitute up to 50% of the prokaryotic population in marine waters. Photobacterium is the second largest genus in the family and we used comparative genomics on 35 strains representing 16 of the 28 species described so far, to understand...... the genomic diversity present in the Photobacterium genus. Such understanding is important for ecophysiology studies of the genus. We used whole genome sequences to evaluate phylogenetic relationships using several analyses (16S rRNA, MLSA, fur, amino-acid usage, ANI), which allowed us to identify two...

  17. Genome update: the 1000th genome - a cautionary tale

    DEFF Research Database (Denmark)

    Lagesen, Karin; Ussery, David; Wassenaar, Gertrude Maria

    2010-01-01

    conclusions for example about the largest bacterial genome sequenced. Biological diversity is far greater than many have thought. For example, analysis of multiple Escherichia coli genomes has led to an estimate of around 45 000 gene families more genes than are recognized in the human genome. Moreover......There are now more than 1000 sequenced prokaryotic genomes deposited in public databases and available for analysis. Currently, although the sequence databases GenBank, DNA Database of Japan and EMBL are synchronized continually, there are slight differences in content at the genomes level...... for a variety of logistical reasons, including differences in format and loading errors, such as those caused by file transfer protocol interruptions. This means that the 1000th genome will be different in the various databases. Some of the data on the highly accessed web pages are inaccurate, leading to false...

  18. Efficient Breeding by Genomic Mating.

    Science.gov (United States)

    Akdemir, Deniz; Sánchez, Julio I

    2016-01-01

    Selection in breeding programs can be done by using phenotypes (phenotypic selection), pedigree relationship (breeding value selection) or molecular markers (marker assisted selection or genomic selection). All these methods are based on truncation selection, focusing on the best performance of parents before mating. In this article we proposed an approach to breeding, named genomic mating, which focuses on mating instead of truncation selection. Genomic mating uses information in a similar fashion to genomic selection but includes information on complementation of parents to be mated. Following the efficiency frontier surface, genomic mating uses concepts of estimated breeding values, risk (usefulness) and coefficient of ancestry to optimize mating between parents. We used a genetic algorithm to find solutions to this optimization problem and the results from our simulations comparing genomic selection, phenotypic selection and the mating approach indicate that current approach for breeding complex traits is more favorable than phenotypic and genomic selection. Genomic mating is similar to genomic selection in terms of estimating marker effects, but in genomic mating the genetic information and the estimated marker effects are used to decide which genotypes should be crossed to obtain the next breeding population.

  19. Comparative Genomics Reveals High Genomic Diversity in the Genus Photobacterium

    OpenAIRE

    Henrique Machado; Henrique Machado; Lone Gram

    2017-01-01

    Vibrionaceae is a large marine bacterial family, which can constitute up to 50% of the prokaryotic population in marine waters. Photobacterium is the second largest genus in the family and we used comparative genomics on 35 strains representing 16 of the 28 species described so far, to understand the genomic diversity present in the Photobacterium genus. Such understanding is important for ecophysiology studies of the genus. We used whole genome sequences to evaluate phylogenetic relationship...

  20. Genome Surfing As Driver of Microbial Genomic Diversity.

    Science.gov (United States)

    Choudoir, Mallory J; Panke-Buisse, Kevin; Andam, Cheryl P; Buckley, Daniel H

    2017-08-01

    Historical changes in population size, such as those caused by demographic range expansions, can produce nonadaptive changes in genomic diversity through mechanisms such as gene surfing. We propose that demographic range expansion of a microbial population capable of horizontal gene exchange can result in genome surfing, a mechanism that can cause widespread increase in the pan-genome frequency of genes acquired by horizontal gene exchange. We explain that patterns of genetic diversity within Streptomyces are consistent with genome surfing, and we describe several predictions for testing this hypothesis both in Streptomyces and in other microorganisms. Copyright © 2017 Elsevier Ltd. All rights reserved.

  1. Genome U-Plot: a whole genome visualization.

    Science.gov (United States)

    Gaitatzes, Athanasios; Johnson, Sarah H; Smadbeck, James B; Vasmatzis, George

    2018-05-15

    The ability to produce and analyze whole genome sequencing (WGS) data from samples with structural variations (SV) generated the need to visualize such abnormalities in simplified plots. Conventional two-dimensional representations of WGS data frequently use either circular or linear layouts. There are several diverse advantages regarding both these representations, but their major disadvantage is that they do not use the two-dimensional space very efficiently. We propose a layout, termed the Genome U-Plot, which spreads the chromosomes on a two-dimensional surface and essentially quadruples the spatial resolution. We present the Genome U-Plot for producing clear and intuitive graphs that allows researchers to generate novel insights and hypotheses by visualizing SVs such as deletions, amplifications, and chromoanagenesis events. The main features of the Genome U-Plot are its layered layout, its high spatial resolution and its improved aesthetic qualities. We compare conventional visualization schemas with the Genome U-Plot using visualization metrics such as number of line crossings and crossing angle resolution measures. Based on our metrics, we improve the readability of the resulting graph by at least 2-fold, making apparent important features and making it easy to identify important genomic changes. A whole genome visualization tool with high spatial resolution and improved aesthetic qualities. An implementation and documentation of the Genome U-Plot is publicly available at https://github.com/gaitat/GenomeUPlot. vasmatzis.george@mayo.edu. Supplementary data are available at Bioinformatics online.

  2. Genomic Data Commons and Genomic Cloud Pilots - Google Hangout

    Science.gov (United States)

    Join us for a live, moderated discussion about two NCI efforts to expand access to cancer genomics data: the Genomic Data Commons and Genomic Cloud Pilots. NCI subject matters experts will include Louis M. Staudt, M.D., Ph.D., Director Center for Cancer Genomics, Warren Kibbe, Ph.D., Director, NCI Center for Biomedical Informatics and Information Technology, and moderated by Anthony Kerlavage, Ph.D., Chief, Cancer Informatics Branch, Center for Biomedical Informatics and Information Technology. We welcome your questions before and during the Hangout on Twitter using the hashtag #AskNCI.

  3. Ensembl 2002: accommodating comparative genomics.

    Science.gov (United States)

    Clamp, M; Andrews, D; Barker, D; Bevan, P; Cameron, G; Chen, Y; Clark, L; Cox, T; Cuff, J; Curwen, V; Down, T; Durbin, R; Eyras, E; Gilbert, J; Hammond, M; Hubbard, T; Kasprzyk, A; Keefe, D; Lehvaslaiho, H; Iyer, V; Melsopp, C; Mongin, E; Pettett, R; Potter, S; Rust, A; Schmidt, E; Searle, S; Slater, G; Smith, J; Spooner, W; Stabenau, A; Stalker, J; Stupka, E; Ureta-Vidal, A; Vastrik, I; Birney, E

    2003-01-01

    The Ensembl (http://www.ensembl.org/) database project provides a bioinformatics framework to organise biology around the sequences of large genomes. It is a comprehensive source of stable automatic annotation of human, mouse and other genome sequences, available as either an interactive web site or as flat files. Ensembl also integrates manually annotated gene structures from external sources where available. As well as being one of the leading sources of genome annotation, Ensembl is an open source software engineering project to develop a portable system able to handle very large genomes and associated requirements. These range from sequence analysis to data storage and visualisation and installations exist around the world in both companies and at academic sites. With both human and mouse genome sequences available and more vertebrate sequences to follow, many of the recent developments in Ensembl have focusing on developing automatic comparative genome analysis and visualisation.

  4. The Ensembl genome database project.

    Science.gov (United States)

    Hubbard, T; Barker, D; Birney, E; Cameron, G; Chen, Y; Clark, L; Cox, T; Cuff, J; Curwen, V; Down, T; Durbin, R; Eyras, E; Gilbert, J; Hammond, M; Huminiecki, L; Kasprzyk, A; Lehvaslaiho, H; Lijnzaad, P; Melsopp, C; Mongin, E; Pettett, R; Pocock, M; Potter, S; Rust, A; Schmidt, E; Searle, S; Slater, G; Smith, J; Spooner, W; Stabenau, A; Stalker, J; Stupka, E; Ureta-Vidal, A; Vastrik, I; Clamp, M

    2002-01-01

    The Ensembl (http://www.ensembl.org/) database project provides a bioinformatics framework to organise biology around the sequences of large genomes. It is a comprehensive source of stable automatic annotation of the human genome sequence, with confirmed gene predictions that have been integrated with external data sources, and is available as either an interactive web site or as flat files. It is also an open source software engineering project to develop a portable system able to handle very large genomes and associated requirements from sequence analysis to data storage and visualisation. The Ensembl site is one of the leading sources of human genome sequence annotation and provided much of the analysis for publication by the international human genome project of the draft genome. The Ensembl system is being installed around the world in both companies and academic sites on machines ranging from supercomputers to laptops.

  5. Comparative Genomics in Homo sapiens.

    Science.gov (United States)

    Oti, Martin; Sammeth, Michael

    2018-01-01

    Genomes can be compared at different levels of divergence, either between species or within species. Within species genomes can be compared between different subpopulations, such as human subpopulations from different continents. Investigating the genomic differences between different human subpopulations is important when studying complex diseases that are affected by many genetic variants, as the variants involved can differ between populations. The 1000 Genomes Project collected genome-scale variation data for 2504 human individuals from 26 different populations, enabling a systematic comparison of variation between human subpopulations. In this chapter, we present step-by-step a basic protocol for the identification of population-specific variants employing the 1000 Genomes data. These variants are subsequently further investigated for those that affect the proteome or RNA splice sites, to investigate potentially biologically relevant differences between the populations.

  6. The genome of Arabidopsis thaliana.

    OpenAIRE

    Goodman, H M; Ecker, J R; Dean, C

    1995-01-01

    Arabidopsis thaliana is a small flowering plant that is a member of the family cruciferae. It has many characteristics--diploid genetics, rapid growth cycle, relatively low repetitive DNA content, and small genome size--that recommend it as the model for a plant genome project. The current status of the genetic and physical maps, as well as efforts to sequence the genome, are presented. Examples are given of genes isolated by using map-based cloning. The importance of the Arabidopsis project ...

  7. Advances in editing microalgae genomes

    OpenAIRE

    Daboussi, Fayza

    2017-01-01

    There have been significant advances in microalgal genomics over the last decade. Nevertheless, there are still insufficient tools for the manipulation of microalgae genomes and the development of microalgae as industrial biofactories. Several research groups have recently contributed to progress by demonstrating that particular nucleases can be used for targeted and stable modifications of the genomes of some microalgae species. The nucleases include Meganucleases, Zinc Finger nucleases, TAL...

  8. Genomic selection in plant breeding.

    Science.gov (United States)

    Newell, Mark A; Jannink, Jean-Luc

    2014-01-01

    Genomic selection (GS) is a method to predict the genetic value of selection candidates based on the genomic estimated breeding value (GEBV) predicted from high-density markers positioned throughout the genome. Unlike marker-assisted selection, the GEBV is based on all markers including both minor and major marker effects. Thus, the GEBV may capture more of the genetic variation for the particular trait under selection.

  9. Genomic Feature Models

    DEFF Research Database (Denmark)

    Sørensen, Peter; Edwards, Stefan McKinnon; Rohde, Palle Duun

    -additive genetic mechanisms. These modeling approaches have proven to be highly useful to determine population genetic parameters as well as prediction of genetic risk or value. We present a series of statistical modelling approaches that use prior biological information for evaluating the collective action......Whole-genome sequences and multiple trait phenotypes from large numbers of individuals will soon be available in many populations. Well established statistical modeling approaches enable the genetic analyses of complex trait phenotypes while accounting for a variety of additive and non...... regions and gene ontologies) that provide better model fit and increase predictive ability of the statistical model for this trait....

  10. Genomic dairy cattle breeding

    DEFF Research Database (Denmark)

    Mark, Thomas; Sandøe, Peter

    2010-01-01

    the thoughts of breeders and other stakeholders on how to best make use of genomic breeding in the future. Intensive breeding has played a major role in securing dramatic increases in milk yield since the Second World War. Until recently, the main focus in dairy cattle breeding was on production traits...... it less accountable to the concern of private farmers for the welfare of their animals. It is argued that there is a need to mobilise a wide range of stakeholders to monitor developments and maintain pressure on breeding companies so that they are aware of the need to take precautionary measures to avoid...

  11. Organizational heterogeneity of vertebrate genomes.

    Science.gov (United States)

    Frenkel, Svetlana; Kirzhner, Valery; Korol, Abraham

    2012-01-01

    Genomes of higher eukaryotes are mosaics of segments with various structural, functional, and evolutionary properties. The availability of whole-genome sequences allows the investigation of their structure as "texts" using different statistical and computational methods. One such method, referred to as Compositional Spectra (CS) analysis, is based on scoring the occurrences of fixed-length oligonucleotides (k-mers) in the target DNA sequence. CS analysis allows generating species- or region-specific characteristics of the genome, regardless of their length and the presence of coding DNA. In this study, we consider the heterogeneity of vertebrate genomes as a joint effect of regional variation in sequence organization superimposed on the differences in nucleotide composition. We estimated compositional and organizational heterogeneity of genome and chromosome sequences separately and found that both heterogeneity types vary widely among genomes as well as among chromosomes in all investigated taxonomic groups. The high correspondence of heterogeneity scores obtained on three genome fractions, coding, repetitive, and the remaining part of the noncoding DNA (the genome dark matter--GDM) allows the assumption that CS-heterogeneity may have functional relevance to genome regulation. Of special interest for such interpretation is the fact that natural GDM sequences display the highest deviation from the corresponding reshuffled sequences.

  12. Organizational heterogeneity of vertebrate genomes.

    Directory of Open Access Journals (Sweden)

    Svetlana Frenkel

    Full Text Available Genomes of higher eukaryotes are mosaics of segments with various structural, functional, and evolutionary properties. The availability of whole-genome sequences allows the investigation of their structure as "texts" using different statistical and computational methods. One such method, referred to as Compositional Spectra (CS analysis, is based on scoring the occurrences of fixed-length oligonucleotides (k-mers in the target DNA sequence. CS analysis allows generating species- or region-specific characteristics of the genome, regardless of their length and the presence of coding DNA. In this study, we consider the heterogeneity of vertebrate genomes as a joint effect of regional variation in sequence organization superimposed on the differences in nucleotide composition. We estimated compositional and organizational heterogeneity of genome and chromosome sequences separately and found that both heterogeneity types vary widely among genomes as well as among chromosomes in all investigated taxonomic groups. The high correspondence of heterogeneity scores obtained on three genome fractions, coding, repetitive, and the remaining part of the noncoding DNA (the genome dark matter--GDM allows the assumption that CS-heterogeneity may have functional relevance to genome regulation. Of special interest for such interpretation is the fact that natural GDM sequences display the highest deviation from the corresponding reshuffled sequences.

  13. Genome engineering in Vibrio cholerae

    DEFF Research Database (Denmark)

    Val, Marie-Eve; Skovgaard, Ole; Ducos-Galand, Magaly

    2012-01-01

    Although bacteria with multipartite genomes are prevalent, our knowledge of the mechanisms maintaining their genome is very limited, and much remains to be learned about the structural and functional interrelationships of multiple chromosomes. Owing to its bi-chromosomal genome architecture and its....... This difficulty was surmounted using a unique and powerful strategy based on massive rearrangement of prokaryotic genomes. We developed a site-specific recombination-based engineering tool, which allows targeted, oriented, and reciprocal DNA exchanges. Using this genetic tool, we obtained a panel of V. cholerae...

  14. Genome Writing: Current Progress and Related Applications

    Directory of Open Access Journals (Sweden)

    Yueqiang Wang

    2018-02-01

    Full Text Available The ultimate goal of synthetic biology is to build customized cells or organisms to meet specific industrial or medical needs. The most important part of the customized cell is a synthetic genome. Advanced genomic writing technologies are required to build such an artificial genome. Recently, the partially-completed synthetic yeast genome project represents a milestone in this field. In this mini review, we briefly introduce the techniques for de novo genome synthesis and genome editing. Furthermore, we summarize recent research progresses and highlight several applications in the synthetic genome field. Finally, we discuss current challenges and future prospects. Keywords: Synthetic biology, Genome writing, Genome editing, Bioethics, Biosafety

  15. Comparative genomics of Lactobacillus and other LAB

    DEFF Research Database (Denmark)

    Wassenaar, Trudy M.; Lukjancenko, Oksana

    2014-01-01

    that of the others, with the two Streptococcus species having the shortest genomes. The widest distribution in genome content was observed for Lactobacillus. The number of tRNA and rRNA gene copies varied considerably, with exceptional high numbers observed for Lb. delbrueckii, while these numbers were relatively......The genomes of 66 LABs, belonging to five different genera, were compared for genome size and gene content. The analyzed genomes included 37 Lactobacillus genomes of 17 species, six Lactococcus lactis genomes, four Leuconostoc genomes of three species, six Streptococcus genomes of two species...

  16. Genome Update: alignment of bacterial chromosomes

    DEFF Research Database (Denmark)

    Ussery, David; Jensen, Mette; Poulsen, Tine Rugh

    2004-01-01

    There are four new microbial genomes listed in this month's Genome Update, three belonging to Gram-positive bacteria and one belonging to an archaeon that lives at pH 0; all of these genomes are listed in Table 1⇓. The method of genome comparison this month is that of genome alignment and, as an ...

  17. Insights into structural variations and genome rearrangements in prokaryotic genomes.

    Science.gov (United States)

    Periwal, Vinita; Scaria, Vinod

    2015-01-01

    Structural variations (SVs) are genomic rearrangements that affect fairly large fragments of DNA. Most of the SVs such as inversions, deletions and translocations have been largely studied in context of genetic diseases in eukaryotes. However, recent studies demonstrate that genome rearrangements can also have profound impact on prokaryotic genomes, leading to altered cell phenotype. In contrast to single-nucleotide variations, SVs provide a much deeper insight into organization of bacterial genomes at a much better resolution. SVs can confer change in gene copy number, creation of new genes, altered gene expression and many other functional consequences. High-throughput technologies have now made it possible to explore SVs at a much refined resolution in bacterial genomes. Through this review, we aim to highlight the importance of the less explored field of SVs in prokaryotic genomes and their impact. We also discuss its potential applicability in the emerging fields of synthetic biology and genome engineering where targeted SVs could serve to create sophisticated and accurate genome editing. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  18. Comparative genomics reveals insights into avian genome evolution and adaptation

    DEFF Research Database (Denmark)

    Zhang, Guojie; Li, Cai; Li, Qiye

    2014-01-01

    Birds are the most species-rich class of tetrapod vertebrates and have wide relevance across many research fields. We explored bird macroevolution using full genomes from 48 avian species representing all major extant clades. The avian genome is principally characterized by its constrained size, ...

  19. Pre-genomic, genomic and post-genomic study of microbial communities involved in bioenergy.

    Science.gov (United States)

    Rittmann, Bruce E; Krajmalnik-Brown, Rosa; Halden, Rolf U

    2008-08-01

    Microorganisms can produce renewable energy in large quantities and without damaging the environment or disrupting food supply. The microbial communities must be robust and self-stabilizing, and their essential syntrophies must be managed. Pre-genomic, genomic and post-genomic tools can provide crucial information about the structure and function of these microbial communities. Applying these tools will help accelerate the rate at which microbial bioenergy processes move from intriguing science to real-world practice.

  20. Genomic instability and radiation

    Energy Technology Data Exchange (ETDEWEB)

    Little, John B [Harvard School of Public Health, Boston, MA 02115 (United States)

    2003-06-01

    Genomic instability is a hallmark of cancer cells, and is thought to be involved in the process of carcinogenesis. Indeed, a number of rare genetic disorders associated with a predisposition to cancer are characterised by genomic instability occurring in somatic cells. Of particular interest is the observation that transmissible instability can be induced in somatic cells from normal individuals by exposure to ionising radiation, leading to a persistent enhancement in the rate at which mutations and chromosomal aberrations arise in the progeny of the irradiated cells after many generations of replication. If such induced instability is involved in radiation carcinogenesis, it would imply that the initial carcinogenic event may not be a rare mutation occurring in a specific gene or set of genes. Rather, radiation may induce a process of instability in many cells in a population, enhancing the rate at which the multiple gene mutations necessary for the development of cancer may arise in a given cell lineage. Furthermore, radiation could act at any stage in the development of cancer by facilitating the accumulation of the remaining genetic events required to produce a fully malignant tumour. The experimental evidence for such induced instability is reviewed. (review)

  1. Genomic instability and radiation

    International Nuclear Information System (INIS)

    Little, John B

    2003-01-01

    Genomic instability is a hallmark of cancer cells, and is thought to be involved in the process of carcinogenesis. Indeed, a number of rare genetic disorders associated with a predisposition to cancer are characterised by genomic instability occurring in somatic cells. Of particular interest is the observation that transmissible instability can be induced in somatic cells from normal individuals by exposure to ionising radiation, leading to a persistent enhancement in the rate at which mutations and chromosomal aberrations arise in the progeny of the irradiated cells after many generations of replication. If such induced instability is involved in radiation carcinogenesis, it would imply that the initial carcinogenic event may not be a rare mutation occurring in a specific gene or set of genes. Rather, radiation may induce a process of instability in many cells in a population, enhancing the rate at which the multiple gene mutations necessary for the development of cancer may arise in a given cell lineage. Furthermore, radiation could act at any stage in the development of cancer by facilitating the accumulation of the remaining genetic events required to produce a fully malignant tumour. The experimental evidence for such induced instability is reviewed. (review)

  2. Theory of microbial genome evolution

    Science.gov (United States)

    Koonin, Eugene

    Bacteria and archaea have small genomes tightly packed with protein-coding genes. This compactness is commonly perceived as evidence of adaptive genome streamlining caused by strong purifying selection in large microbial populations. In such populations, even the small cost incurred by nonfunctional DNA because of extra energy and time expenditure is thought to be sufficient for this extra genetic material to be eliminated by selection. However, contrary to the predictions of this model, there exists a consistent, positive correlation between the strength of selection at the protein sequence level, measured as the ratio of nonsynonymous to synonymous substitution rates, and microbial genome size. By fitting the genome size distributions in multiple groups of prokaryotes to predictions of mathematical models of population evolution, we show that only models in which acquisition of additional genes is, on average, slightly beneficial yield a good fit to genomic data. Thus, the number of genes in prokaryotic genomes seems to reflect the equilibrium between the benefit of additional genes that diminishes as the genome grows and deletion bias. New genes acquired by microbial genomes, on average, appear to be adaptive. Evolution of bacterial and archaeal genomes involves extensive horizontal gene transfer and gene loss. Many microbes have open pangenomes, where each newly sequenced genome contains more than 10% `ORFans', genes without detectable homologues in other species. A simple, steady-state evolutionary model reveals two sharply distinct classes of microbial genes, one of which (ORFans) is characterized by effectively instantaneous gene replacement, whereas the other consists of genes with finite, distributed replacement rates. These findings imply a conservative estimate of at least a billion distinct genes in the prokaryotic genomic universe.

  3. Genomic selection: genome-wide prediction in plant improvement.

    Science.gov (United States)

    Desta, Zeratsion Abera; Ortiz, Rodomiro

    2014-09-01

    Association analysis is used to measure relations between markers and quantitative trait loci (QTL). Their estimation ignores genes with small effects that trigger underpinning quantitative traits. By contrast, genome-wide selection estimates marker effects across the whole genome on the target population based on a prediction model developed in the training population (TP). Whole-genome prediction models estimate all marker effects in all loci and capture small QTL effects. Here, we review several genomic selection (GS) models with respect to both the prediction accuracy and genetic gain from selection. Phenotypic selection or marker-assisted breeding protocols can be replaced by selection, based on whole-genome predictions in which phenotyping updates the model to build up the prediction accuracy. Copyright © 2014 Elsevier Ltd. All rights reserved.

  4. Parasite Genome Projects and the Trypanosoma cruzi Genome Initiative

    Directory of Open Access Journals (Sweden)

    Wim Degrave

    1997-11-01

    Full Text Available Since the start of the human genome project, a great number of genome projects on other "model" organism have been initiated, some of them already completed. Several initiatives have also been started on parasite genomes, mainly through support from WHO/TDR, involving North-South and South-South collaborations, and great hopes are vested in that these initiatives will lead to new tools for disease control and prevention, as well as to the establishment of genomic research technology in developing countries. The Trypanosoma cruzi genome project, using the clone CL-Brener as starting point, has made considerable progress through the concerted action of more than 20 laboratories, most of them in the South. A brief overview of the current state of the project is given

  5. A Taste of Algal Genomes from the Joint Genome Institute

    Energy Technology Data Exchange (ETDEWEB)

    Kuo, Alan; Grigoriev, Igor

    2012-06-17

    Algae play profound roles in aquatic food chains and the carbon cycle, can impose health and economic costs through toxic blooms, provide models for the study of symbiosis, photosynthesis, and eukaryotic evolution, and are candidate sources for bio-fuels; all of these research areas are part of the mission of DOE's Joint Genome Institute (JGI). To date JGI has sequenced, assembled, annotated, and released to the public the genomes of 18 species and strains of algae, sampling almost all of the major clades of photosynthetic eukaryotes. With more algal genomes currently undergoing analysis, JGI continues its commitment to driving forward basic and applied algal science. Among these ongoing projects are the pan-genome of the dominant coccolithophore Emiliania huxleyi, the interrelationships between the 4 genomes in the nucleomorph-containing Bigelowiella natans and Guillardia theta, and the search for symbiosis genes of lichens.

  6. OryzaGenome: Genome Diversity Database of Wild Oryza Species

    KAUST Repository

    Ohyanagi, Hajime

    2015-11-18

    The species in the genus Oryza, encompassing nine genome types and 23 species, are a rich genetic resource and may have applications in deeper genomic analyses aiming to understand the evolution of plant genomes. With the advancement of next-generation sequencing (NGS) technology, a flood of Oryza species reference genomes and genomic variation information has become available in recent years. This genomic information, combined with the comprehensive phenotypic information that we are accumulating in our Oryzabase, can serve as an excellent genotype-phenotype association resource for analyzing rice functional and structural evolution, and the associated diversity of the Oryza genus. Here we integrate our previous and future phenotypic/habitat information and newly determined genotype information into a united repository, named OryzaGenome, providing the variant information with hyperlinks to Oryzabase. The current version of OryzaGenome includes genotype information of 446 O. rufipogon accessions derived by imputation and of 17 accessions derived by imputation-free deep sequencing. Two variant viewers are implemented: SNP Viewer as a conventional genome browser interface and Variant Table as a textbased browser for precise inspection of each variant one by one. Portable VCF (variant call format) file or tabdelimited file download is also available. Following these SNP (single nucleotide polymorphism) data, reference pseudomolecules/ scaffolds/contigs and genome-wide variation information for almost all of the closely and distantly related wild Oryza species from the NIG Wild Rice Collection will be available in future releases. All of the resources can be accessed through http://viewer.shigen.info/oryzagenome/.

  7. Cocoa/Cotton Comparative Genomics

    Science.gov (United States)

    With genome sequence from two members of the Malvaceae family recently made available, we are exploring syntenic relationships, gene content, and evolutionary trajectories between the cacao and cotton genomes. An assembly of cacao (Theobroma cacao) using Illumina and 454 sequence technology yielded ...

  8. Genomic selection in dairy cattle

    NARCIS (Netherlands)

    Roos, de A.P.W.

    2011-01-01

    The objectives of this Ph.D. thesis were (1) to optimise genomic selection in dairy cattle with respect to the accuracy of predicting total genetic merit and (2) to optimise a dairy cattle breeding program using genomic selection. The study was performed using a combination of real data sets and

  9. Cloud computing for comparative genomics

    Directory of Open Access Journals (Sweden)

    Pivovarov Rimma

    2010-05-01

    Full Text Available Abstract Background Large comparative genomics studies and tools are becoming increasingly more compute-expensive as the number of available genome sequences continues to rise. The capacity and cost of local computing infrastructures are likely to become prohibitive with the increase, especially as the breadth of questions continues to rise. Alternative computing architectures, in particular cloud computing environments, may help alleviate this increasing pressure and enable fast, large-scale, and cost-effective comparative genomics strategies going forward. To test this, we redesigned a typical comparative genomics algorithm, the reciprocal smallest distance algorithm (RSD, to run within Amazon's Elastic Computing Cloud (EC2. We then employed the RSD-cloud for ortholog calculations across a wide selection of fully sequenced genomes. Results We ran more than 300,000 RSD-cloud processes within the EC2. These jobs were farmed simultaneously to 100 high capacity compute nodes using the Amazon Web Service Elastic Map Reduce and included a wide mix of large and small genomes. The total computation time took just under 70 hours and cost a total of $6,302 USD. Conclusions The effort to transform existing comparative genomics algorithms from local compute infrastructures is not trivial. However, the speed and flexibility of cloud computing environments provides a substantial boost with manageable cost. The procedure designed to transform the RSD algorithm into a cloud-ready application is readily adaptable to similar comparative genomics problems.

  10. Cloud computing for comparative genomics.

    Science.gov (United States)

    Wall, Dennis P; Kudtarkar, Parul; Fusaro, Vincent A; Pivovarov, Rimma; Patil, Prasad; Tonellato, Peter J

    2010-05-18

    Large comparative genomics studies and tools are becoming increasingly more compute-expensive as the number of available genome sequences continues to rise. The capacity and cost of local computing infrastructures are likely to become prohibitive with the increase, especially as the breadth of questions continues to rise. Alternative computing architectures, in particular cloud computing environments, may help alleviate this increasing pressure and enable fast, large-scale, and cost-effective comparative genomics strategies going forward. To test this, we redesigned a typical comparative genomics algorithm, the reciprocal smallest distance algorithm (RSD), to run within Amazon's Elastic Computing Cloud (EC2). We then employed the RSD-cloud for ortholog calculations across a wide selection of fully sequenced genomes. We ran more than 300,000 RSD-cloud processes within the EC2. These jobs were farmed simultaneously to 100 high capacity compute nodes using the Amazon Web Service Elastic Map Reduce and included a wide mix of large and small genomes. The total computation time took just under 70 hours and cost a total of $6,302 USD. The effort to transform existing comparative genomics algorithms from local compute infrastructures is not trivial. However, the speed and flexibility of cloud computing environments provides a substantial boost with manageable cost. The procedure designed to transform the RSD algorithm into a cloud-ready application is readily adaptable to similar comparative genomics problems.

  11. The promise of insect genomics

    DEFF Research Database (Denmark)

    Grimmelikhuijzen, Cornelis J P; Cazzamali, Giuseppe; Williamson, Michael

    2007-01-01

    Insects are the largest animal group in the world and are ecologically and economically extremely important. This importance of insects is reflected by the existence of currently 24 insect genome projects. Our perspective discusses the state-of-the-art of these genome projects and the impacts...

  12. Bioinformatics of genomic association mapping

    NARCIS (Netherlands)

    Vaez Barzani, Ahmad

    2015-01-01

    In this thesis we present an overview of bioinformatics-based approaches for genomic association mapping, with emphasis on human quantitative traits and their contribution to complex diseases. We aim to provide a comprehensive walk-through of the classic steps of genomic association mapping

  13. Allele coding in genomic evaluation

    DEFF Research Database (Denmark)

    Standen, Ismo; Christensen, Ole Fredslund

    2011-01-01

    Genomic data are used in animal breeding to assist genetic evaluation. Several models to estimate genomic breeding values have been studied. In general, two approaches have been used. One approach estimates the marker effects first and then, genomic breeding values are obtained by summing marker...... effects. In the second approach, genomic breeding values are estimated directly using an equivalent model with a genomic relationship matrix. Allele coding is the method chosen to assign values to the regression coefficients in the statistical model. A common allele coding is zero for the homozygous...... genotype of the first allele, one for the heterozygote, and two for the homozygous genotype for the other allele. Another common allele coding changes these regression coefficients by subtracting a value from each marker such that the mean of regression coefficients is zero within each marker. We call...

  14. Pathophysiology of MDS: genomic aberrations.

    Science.gov (United States)

    Ichikawa, Motoshi

    2016-01-01

    Myelodysplastic syndromes (MDS) are characterized by clonal proliferation of hematopoietic stem/progenitor cells and their apoptosis, and show a propensity to progress to acute myelogenous leukemia (AML). Although MDS are recognized as neoplastic diseases caused by genomic aberrations of hematopoietic cells, the details of the genetic abnormalities underlying disease development have not as yet been fully elucidated due to difficulties in analyzing chromosomal abnormalities. Recent advances in comprehensive analyses of disease genomes including whole-genome sequencing technologies have revealed the genomic abnormalities in MDS. Surprisingly, gene mutations were found in approximately 80-90% of cases with MDS, and the novel mutations discovered with these technologies included previously unknown, MDS-specific, mutations such as those of the genes in the RNA-splicing machinery. It is anticipated that these recent studies will shed new light on the pathophysiology of MDS due to genomic aberrations.

  15. Chemical biology on the genome.

    Science.gov (United States)

    Balasubramanian, Shankar

    2014-08-15

    In this article I discuss studies towards understanding the structure and function of DNA in the context of genomes from the perspective of a chemist. The first area I describe concerns the studies that led to the invention and subsequent development of a method for sequencing DNA on a genome scale at high speed and low cost, now known as Solexa/Illumina sequencing. The second theme will feature the four-stranded DNA structure known as a G-quadruplex with a focus on its fundamental properties, its presence in cellular genomic DNA and the prospects for targeting such a structure in cels with small molecules. The final topic for discussion is naturally occurring chemically modified DNA bases with an emphasis on chemistry for decoding (or sequencing) such modifications in genomic DNA. The genome is a fruitful topic to be further elucidated by the creation and application of chemical approaches. Copyright © 2014 Elsevier Ltd. All rights reserved.

  16. IMA Genome-F 5G

    OpenAIRE

    Wingfield, Brenda D.; Barnes, Irene; Wilhelm de Beer, Z.; De Vos, Lieschen; Duong, Tuan A.; Kanzi, Aquillah M.; Naidoo, Kershney; Nguyen, Hai D.T.; Santana, Quentin C.; Sayari, Mohammad; Seifert, Keith A.; Steenkamp, Emma T.; Trollip, Conrad; van der Merwe, Nicolaas A.; van der Nest, Magriet A.

    2015-01-01

    The genomes of Ceratocystis eucalypticola, Chrysoporthe cubensis, Chrysoporthe deuterocubensis, Davidsoniella virescens, Fusarium temperatum, Graphilbum fragrans, Penicillium nordicum and Thielaviopsis musarum are presented in this genome announcement. These seven genomes are from plant pathogens and otherwise economically important fungal species. The genome sizes range from 28 Mb in the case of T. musarum to 45 Mb for Fusarium temperatum. These genomes include the first reports of genomes f...

  17. [Preface for genome editing special issue].

    Science.gov (United States)

    Gu, Feng; Gao, Caixia

    2017-10-25

    Genome editing technology, as an innovative biotechnology, has been widely used for editing the genome from model organisms, animals, plants and microbes. CRISPR/Cas9-based genome editing technology shows its great value and potential in the dissection of functional genomics, improved breeding and genetic disease treatment. In the present special issue, the principle and application of genome editing techniques has been summarized. The advantages and disadvantages of the current genome editing technology and future prospects would also be highlighted.

  18. Privacy in the Genomic Era.

    Science.gov (United States)

    Naveed, Muhammad; Ayday, Erman; Clayton, Ellen W; Fellay, Jacques; Gunter, Carl A; Hubaux, Jean-Pierre; Malin, Bradley A; Wang, Xiaofeng

    2015-09-01

    Genome sequencing technology has advanced at a rapid pace and it is now possible to generate highly-detailed genotypes inexpensively. The collection and analysis of such data has the potential to support various applications, including personalized medical services. While the benefits of the genomics revolution are trumpeted by the biomedical community, the increased availability of such data has major implications for personal privacy; notably because the genome has certain essential features, which include (but are not limited to) (i) an association with traits and certain diseases, (ii) identification capability (e.g., forensics), and (iii) revelation of family relationships. Moreover, direct-to-consumer DNA testing increases the likelihood that genome data will be made available in less regulated environments, such as the Internet and for-profit companies. The problem of genome data privacy thus resides at the crossroads of computer science, medicine, and public policy. While the computer scientists have addressed data privacy for various data types, there has been less attention dedicated to genomic data. Thus, the goal of this paper is to provide a systematization of knowledge for the computer science community. In doing so, we address some of the (sometimes erroneous) beliefs of this field and we report on a survey we conducted about genome data privacy with biomedical specialists. Then, after characterizing the genome privacy problem, we review the state-of-the-art regarding privacy attacks on genomic data and strategies for mitigating such attacks, as well as contextualizing these attacks from the perspective of medicine and public policy. This paper concludes with an enumeration of the challenges for genome data privacy and presents a framework to systematize the analysis of threats and the design of countermeasures as the field moves forward.

  19. Privacy in the Genomic Era

    Science.gov (United States)

    NAVEED, MUHAMMAD; AYDAY, ERMAN; CLAYTON, ELLEN W.; FELLAY, JACQUES; GUNTER, CARL A.; HUBAUX, JEAN-PIERRE; MALIN, BRADLEY A.; WANG, XIAOFENG

    2015-01-01

    Genome sequencing technology has advanced at a rapid pace and it is now possible to generate highly-detailed genotypes inexpensively. The collection and analysis of such data has the potential to support various applications, including personalized medical services. While the benefits of the genomics revolution are trumpeted by the biomedical community, the increased availability of such data has major implications for personal privacy; notably because the genome has certain essential features, which include (but are not limited to) (i) an association with traits and certain diseases, (ii) identification capability (e.g., forensics), and (iii) revelation of family relationships. Moreover, direct-to-consumer DNA testing increases the likelihood that genome data will be made available in less regulated environments, such as the Internet and for-profit companies. The problem of genome data privacy thus resides at the crossroads of computer science, medicine, and public policy. While the computer scientists have addressed data privacy for various data types, there has been less attention dedicated to genomic data. Thus, the goal of this paper is to provide a systematization of knowledge for the computer science community. In doing so, we address some of the (sometimes erroneous) beliefs of this field and we report on a survey we conducted about genome data privacy with biomedical specialists. Then, after characterizing the genome privacy problem, we review the state-of-the-art regarding privacy attacks on genomic data and strategies for mitigating such attacks, as well as contextualizing these attacks from the perspective of medicine and public policy. This paper concludes with an enumeration of the challenges for genome data privacy and presents a framework to systematize the analysis of threats and the design of countermeasures as the field moves forward. PMID:26640318

  20. The genomes and comparative genomics of Lactobacillus delbrueckii phages.

    Science.gov (United States)

    Riipinen, Katja-Anneli; Forsman, Päivi; Alatossava, Tapani

    2011-07-01

    Lactobacillus delbrueckii phages are a great source of genetic diversity. Here, the genome sequences of Lb. delbrueckii phages LL-Ku, c5 and JCL1032 were analyzed in detail, and the genetic diversity of Lb. delbrueckii phages belonging to different taxonomic groups was explored. The lytic isometric group b phages LL-Ku (31,080 bp) and c5 (31,841 bp) showed a minimum nucleotide sequence identity of 90% over about three-fourths of their genomes. The genomic locations of their lysis modules were unique, and the genomes featured several putative overlapping transcription units of genes. LL-Ku and c5 virions displayed peptidoglycan hydrolytic activity associated with a ~36-kDa protein similar in size to the endolysin. Unexpectedly, the 49,433-bp genome of the prolate phage JCL1032 (temperate, group c) revealed a conserved gene order within its structural genes. Lb. delbrueckii phages representing groups a (a phage LL-H), b and c possessed only limited protein sequence homology. Genomic comparison of LL-Ku and c5 suggested that diversification of Lb. delbrueckii phages is mainly due to insertions, deletions and recombination. For the first time, the complete genome sequences of group b and c Lb. delbrueckii phages are reported.

  1. Genomics technologies to study structural variations in the grapevine genome

    Directory of Open Access Journals (Sweden)

    Cardone Maria Francesca

    2016-01-01

    Full Text Available Grapevine is one of the most important crop plants in the world. Recently there was great expansion of genomics resources about grapevine genome, thus providing increasing efforts for molecular breeding. Current cultivars display a great level of inter-specific differentiation that needs to be investigated to reach a comprehensive understanding of the genetic basis of phenotypic differences, and to find responsible genes selected by cross breeding programs. While there have been significant advances in resolving the pattern and nature of single nucleotide polymorphisms (SNPs on plant genomes, few data are available on copy number variation (CNV. Furthermore association between structural variations and phenotypes has been described in only a few cases. We combined high throughput biotechnologies and bioinformatics tools, to reveal the first inter-varietal atlas of structural variation (SV for the grapevine genome. We sequenced and compared four table grape cultivars with the Pinot noir inbred line PN40024 genome as the reference. We detected roughly 8% of the grapevine genome affected by genomic variations. Taken into account phenotypic differences existing among the studied varieties we performed comparison of SVs among them and the reference and next we performed an in-depth analysis of gene content of polymorphic regions. This allowed us to identify genes showing differences in copy number as putative functional candidates for important traits in grapevine cultivation.

  2. Marine Bacterial Genomics

    DEFF Research Database (Denmark)

    Machado, Henrique

    For decades, terrestrial microorganisms have been used as sources of countless enzymes and chemical compounds that have been produced by pharmaceutical and biotech companies and used by mankind. There is a need for new chemical compounds, including antibiotics,new enzymatic activities and new...... microorganisms to be used as cell factories for production. Therefore exploitation of new microbial niches and use of different strategies is an opportunity to boost discoveries. Even though scientists have started to explore several habitats other than the terrestrial ones, the marine environment stands out...... as a hitherto under-explored niche. This thesis work uses high-throughput sequencing technologies on a collection of marine bacteria established during the Galathea 3 expedition, with the purpose of unraveling new biodiversity and new bioactivities. Several tools were used for genomic analysis in order...

  3. The South Asian genome.

    Directory of Open Access Journals (Sweden)

    John C Chambers

    Full Text Available The genetic sequence variation of people from the Indian subcontinent who comprise one-quarter of the world's population, is not well described. We carried out whole genome sequencing of 168 South Asians, along with whole-exome sequencing of 147 South Asians to provide deeper characterisation of coding regions. We identify 12,962,155 autosomal sequence variants, including 2,946,861 new SNPs and 312,738 novel indels. This catalogue of SNPs and indels amongst South Asians provides the first comprehensive map of genetic variation in this major human population, and reveals evidence for selective pressures on genes involved in skin biology, metabolism, infection and immunity. Our results will accelerate the search for the genetic variants underlying susceptibility to disorders such as type-2 diabetes and cardiovascular disease which are highly prevalent amongst South Asians.

  4. Comparative RNA genomics

    DEFF Research Database (Denmark)

    Backofen, Rolf; Gorodkin, Jan; Hofacker, Ivo L.

    2018-01-01

    Over the last two decades it has become clear that RNA is much more than just a boring intermediate in protein expression. Ancient RNAs still appear in the core information metabolism and comprise a surprisingly large component in bacterial gene regulation. A common theme with these types of mostly...... small RNAs is their reliance of conserved secondary structures. Large scale sequencing projects, on the other hand, have profoundly changed our understanding of eukaryotic genomes. Pervasively transcribed, they give rise to a plethora of large and evolutionarily extremely flexible noncoding RNAs...... that exert a vastly diverse array of molecule functions. In this chapter we provide a—necessarily incomplete—overview of the current state of comparative analysis of noncoding RNAs, emphasizing computational approaches as a means to gain a global picture of the modern RNA world....

  5. Materials Genome Initiative

    Science.gov (United States)

    Vickers, John

    2015-01-01

    The Materials Genome Initiative (MGI) project element is a cross-Center effort that is focused on the integration of computational tools to simulate manufacturing processes and materials behavior. These computational simulations will be utilized to gain understanding of processes and materials behavior to accelerate process development and certification to more efficiently integrate new materials in existing NASA projects and to lead to the design of new materials for improved performance. This NASA effort looks to collaborate with efforts at other government agencies and universities working under the national MGI. MGI plans to develop integrated computational/experimental/ processing methodologies for accelerating discovery and insertion of materials to satisfy NASA's unique mission demands. The challenges include validated design tools that incorporate materials properties, processes, and design requirements; and materials process control to rapidly mature emerging manufacturing methods and develop certified manufacturing processes

  6. Inheritance of the yeast mitochondrial genome

    DEFF Research Database (Denmark)

    Piskur, Jure

    1994-01-01

    Mitochondrion, extrachromosomal genetics, intergenic sequences, genome size, mitochondrial DNA, petite mutation, yeast......Mitochondrion, extrachromosomal genetics, intergenic sequences, genome size, mitochondrial DNA, petite mutation, yeast...

  7. Plantagora: modeling whole genome sequencing and assembly of plant genomes.

    Directory of Open Access Journals (Sweden)

    Roger Barthelson

    Full Text Available BACKGROUND: Genomics studies are being revolutionized by the next generation sequencing technologies, which have made whole genome sequencing much more accessible to the average researcher. Whole genome sequencing with the new technologies is a developing art that, despite the large volumes of data that can be produced, may still fail to provide a clear and thorough map of a genome. The Plantagora project was conceived to address specifically the gap between having the technical tools for genome sequencing and knowing precisely the best way to use them. METHODOLOGY/PRINCIPAL FINDINGS: For Plantagora, a platform was created for generating simulated reads from several different plant genomes of different sizes. The resulting read files mimicked either 454 or Illumina reads, with varying paired end spacing. Thousands of datasets of reads were created, most derived from our primary model genome, rice chromosome one. All reads were assembled with different software assemblers, including Newbler, Abyss, and SOAPdenovo, and the resulting assemblies were evaluated by an extensive battery of metrics chosen for these studies. The metrics included both statistics of the assembly sequences and fidelity-related measures derived by alignment of the assemblies to the original genome source for the reads. The results were presented in a website, which includes a data graphing tool, all created to help the user compare rapidly the feasibility and effectiveness of different sequencing and assembly strategies prior to testing an approach in the lab. Some of our own conclusions regarding the different strategies were also recorded on the website. CONCLUSIONS/SIGNIFICANCE: Plantagora provides a substantial body of information for comparing different approaches to sequencing a plant genome, and some conclusions regarding some of the specific approaches. Plantagora also provides a platform of metrics and tools for studying the process of sequencing and assembly

  8. Genomes in turmoil: quantification of genome dynamics in prokaryote supergenomes.

    Science.gov (United States)

    Puigbò, Pere; Lobkovsky, Alexander E; Kristensen, David M; Wolf, Yuri I; Koonin, Eugene V

    2014-08-21

    Genomes of bacteria and archaea (collectively, prokaryotes) appear to exist in incessant flux, expanding via horizontal gene transfer and gene duplication, and contracting via gene loss. However, the actual rates of genome dynamics and relative contributions of different types of event across the diversity of prokaryotes are largely unknown, as are the sizes of microbial supergenomes, i.e. pools of genes that are accessible to the given microbial species. We performed a comprehensive analysis of the genome dynamics in 35 groups (34 bacterial and one archaeal) of closely related microbial genomes using a phylogenetic birth-and-death maximum likelihood model to quantify the rates of gene family gain and loss, as well as expansion and reduction. The results show that loss of gene families dominates the evolution of prokaryotes, occurring at approximately three times the rate of gain. The rates of gene family expansion and reduction are typically seven and twenty times less than the gain and loss rates, respectively. Thus, the prevailing mode of evolution in bacteria and archaea is genome contraction, which is partially compensated by the gain of new gene families via horizontal gene transfer. However, the rates of gene family gain, loss, expansion and reduction vary within wide ranges, with the most stable genomes showing rates about 25 times lower than the most dynamic genomes. For many groups, the supergenome estimated from the fraction of repetitive gene family gains includes about tenfold more gene families than the typical genome in the group although some groups appear to have vast, 'open' supergenomes. Reconstruction of evolution for groups of closely related bacteria and archaea reveals an extremely rapid and highly variable flux of genes in evolving microbial genomes, demonstrates that extensive gene loss and horizontal gene transfer leading to innovation are the two dominant evolutionary processes, and yields robust estimates of the supergenome size.

  9. 1000 Bull Genomes - Toward genomic Selectionf from whole genome sequence Data in Dairy and Beef Cattle

    NARCIS (Netherlands)

    Hayes, B.; Daetwyler, H.D.; Fries, R.; Guldbrandtsen, B.; Mogens Sando Lund, M.; Didier A. Boichard, D.A.; Stothard, P.; Veerkamp, R.F.; Hulsegge, B.; Rocha, D.; Tassell, C.; Mullaart, E.; Gredler, B.; Druet, T.; Bagnato, A.; Goddard, M.E.; Chamberlain, H.L.

    2013-01-01

    Genomic prediction of breeding values is now used as the basis for selection of dairy cattle, and in some cases beef cattle, in a number of countries. When genomic prediction was introduced most of the information was to thought to be derived from linkage disequilibrium between markers and causative

  10. Comparing Mycobacterium tuberculosis genomes using genome topology networks.

    Science.gov (United States)

    Jiang, Jianping; Gu, Jianlei; Zhang, Liang; Zhang, Chenyi; Deng, Xiao; Dou, Tonghai; Zhao, Guoping; Zhou, Yan

    2015-02-14

    Over the last decade, emerging research methods, such as comparative genomic analysis and phylogenetic study, have yielded new insights into genotypes and phenotypes of closely related bacterial strains. Several findings have revealed that genomic structural variations (SVs), including gene gain/loss, gene duplication and genome rearrangement, can lead to different phenotypes among strains, and an investigation of genes affected by SVs may extend our knowledge of the relationships between SVs and phenotypes in microbes, especially in pathogenic bacteria. In this work, we introduce a 'Genome Topology Network' (GTN) method based on gene homology and gene locations to analyze genomic SVs and perform phylogenetic analysis. Furthermore, the concept of 'unfixed ortholog' has been proposed, whose members are affected by SVs in genome topology among close species. To improve the precision of 'unfixed ortholog' recognition, a strategy to detect annotation differences and complete gene annotation was applied. To assess the GTN method, a set of thirteen complete M. tuberculosis genomes was analyzed as a case study. GTNs with two different gene homology-assigning methods were built, the Clusters of Orthologous Groups (COG) method and the orthoMCL clustering method, and two phylogenetic trees were constructed accordingly, which may provide additional insights into whole genome-based phylogenetic analysis. We obtained 24 unfixable COG groups, of which most members were related to immunogenicity and drug resistance, such as PPE-repeat proteins (COG5651) and transcriptional regulator TetR gene family members (COG1309). The GTN method has been implemented in PERL and released on our website. The tool can be downloaded from http://homepage.fudan.edu.cn/zhouyan/gtn/ , and allows re-annotating the 'lost' genes among closely related genomes, analyzing genes affected by SVs, and performing phylogenetic analysis. With this tool, many immunogenic-related and drug resistance-related genes

  11. A universal genomic coordinate translator for comparative genomics.

    Science.gov (United States)

    Zamani, Neda; Sundström, Görel; Meadows, Jennifer R S; Höppner, Marc P; Dainat, Jacques; Lantz, Henrik; Haas, Brian J; Grabherr, Manfred G

    2014-06-30

    Genomic duplications constitute major events in the evolution of species, allowing paralogous copies of genes to take on fine-tuned biological roles. Unambiguously identifying the orthology relationship between copies across multiple genomes can be resolved by synteny, i.e. the conserved order of genomic sequences. However, a comprehensive analysis of duplication events and their contributions to evolution would require all-to-all genome alignments, which increases at N2 with the number of available genomes, N. Here, we introduce Kraken, software that omits the all-to-all requirement by recursively traversing a graph of pairwise alignments and dynamically re-computing orthology. Kraken scales linearly with the number of targeted genomes, N, which allows for including large numbers of genomes in analyses. We first evaluated the method on the set of 12 Drosophila genomes, finding that orthologous correspondence computed indirectly through a graph of multiple synteny maps comes at minimal cost in terms of sensitivity, but reduces overall computational runtime by an order of magnitude. We then used the method on three well-annotated mammalian genomes, human, mouse, and rat, and show that up to 93% of protein coding transcripts have unambiguous pairwise orthologous relationships across the genomes. On a nucleotide level, 70 to 83% of exons match exactly at both splice junctions, and up to 97% on at least one junction. We last applied Kraken to an RNA-sequencing dataset from multiple vertebrates and diverse tissues, where we confirmed that brain-specific gene family members, i.e. one-to-many or many-to-many homologs, are more highly correlated across species than single-copy (i.e. one-to-one homologous) genes. Not limited to protein coding genes, Kraken also identifies thousands of newly identified transcribed loci, likely non-coding RNAs that are consistently transcribed in human, chimpanzee and gorilla, and maintain significant correlation of expression levels across

  12. The Perennial Ryegrass GenomeZipper – Targeted Use of Genome Resources for Comparative Grass Genomics

    DEFF Research Database (Denmark)

    Pfeiffer, Matthias; Martis, Mihaela; Asp, Torben

    2013-01-01

    (Lolium perenne) genome on the basis of conserved synteny to barley (Hordeum vulgare) and the model grass genome Brachypodium (Brachypodium distachyon) as well as rice (Oryza sativa) and sorghum (Sorghum bicolor). A transcriptome-based genetic linkage map of perennial ryegrass served as a scaffold......Whole-genome sequences established for model and major crop species constitute a key resource for advanced genomic research. For outbreeding forage and turf grass species like ryegrasses (Lolium spp.), such resources have yet to be developed. Here, we present a model of the perennial ryegrass...... to establish the chromosomal arrangement of syntenic genes from model grass species. This scaffold revealed a high degree of synteny and macrocollinearity and was then utilized to anchor a collection of perennial ryegrass genes in silico to their predicted genome positions. This resulted in the unambiguous...

  13. Components of Adenovirus Genome Packaging

    Science.gov (United States)

    Ahi, Yadvinder S.; Mittal, Suresh K.

    2016-01-01

    Adenoviruses (AdVs) are icosahedral viruses with double-stranded DNA (dsDNA) genomes. Genome packaging in AdV is thought to be similar to that seen in dsDNA containing icosahedral bacteriophages and herpesviruses. Specific recognition of the AdV genome is mediated by a packaging domain located close to the left end of the viral genome and is mediated by the viral packaging machinery. Our understanding of the role of various components of the viral packaging machinery in AdV genome packaging has greatly advanced in recent years. Characterization of empty capsids assembled in the absence of one or more components involved in packaging, identification of the unique vertex, and demonstration of the role of IVa2, the putative packaging ATPase, in genome packaging have provided compelling evidence that AdVs follow a sequential assembly pathway. This review provides a detailed discussion on the functions of the various viral and cellular factors involved in AdV genome packaging. We conclude by briefly discussing the roles of the empty capsids, assembly intermediates, scaffolding proteins, portal vertex and DNA encapsidating enzymes in AdV assembly and packaging. PMID:27721809

  14. The genome of Eucalyptus grandis

    Energy Technology Data Exchange (ETDEWEB)

    Myburg, Alexander A.; Grattapaglia, Dario; Tuskan, Gerald A.; Hellsten, Uffe; Hayes, Richard D.; Grimwood, Jane; Jenkins, Jerry; Lindquist, Erika; Tice, Hope; Bauer, Diane; Goodstein, David M.; Dubchak, Inna; Poliakov, Alexandre; Mizrachi, Eshchar; Kullan, Anand R. K.; Hussey, Steven G.; Pinard, Desre; van der Merwe, Karen; Singh, Pooja; van Jaarsveld, Ida; Silva-Junior, Orzenil B.; Togawa, Roberto C.; Pappas, Marilia R.; Faria, Danielle A.; Sansaloni, Carolina P.; Petroli, Cesar D.; Yang, Xiaohan; Ranjan, Priya; Tschaplinski, Timothy J.; Ye, Chu-Yu; Li, Ting; Sterck, Lieven; Vanneste, Kevin; Murat, Florent; Soler, Marçal; Clemente, Hélène San; Saidi, Naijib; Cassan-Wang, Hua; Dunand, Christophe; Hefer, Charles A.; Bornberg-Bauer, Erich; Kersting, Anna R.; Vining, Kelly; Amarasinghe, Vindhya; Ranik, Martin; Naithani, Sushma; Elser, Justin; Boyd, Alexander E.; Liston, Aaron; Spatafora, Joseph W.; Dharmwardhana, Palitha; Raja, Rajani; Sullivan, Christopher; Romanel, Elisson; Alves-Ferreira, Marcio; Külheim, Carsten; Foley, William; Carocha, Victor; Paiva, Jorge; Kudrna, David; Brommonschenkel, Sergio H.; Pasquali, Giancarlo; Byrne, Margaret; Rigault, Philippe; Tibbits, Josquin; Spokevicius, Antanas; Jones, Rebecca C.; Steane, Dorothy A.; Vaillancourt, René E.; Potts, Brad M.; Joubert, Fourie; Barry, Kerrie; Pappas, Georgios J.; Strauss, Steven H.; Jaiswal, Pankaj; Grima-Pettenati, Jacqueline; Salse, Jérôme; Van de Peer, Yves; Rokhsar, Daniel S.; Schmutz, Jeremy

    2014-06-11

    Eucalypts are the world s most widely planted hardwood trees. Their broad adaptability, rich species diversity, fast growth and superior multipurpose wood, have made them a global renewable resource of fiber and energy that mitigates human pressures on natural forests. We sequenced and assembled >94% of the 640 Mbp genome of Eucalyptus grandis into its 11 chromosomes. A set of 36,376 protein coding genes were predicted revealing that 34% occur in tandem duplications, the largest proportion found thus far in any plant genome. Eucalypts also show the highest diversity of genes for plant specialized metabolism that act as chemical defence against biotic agents and provide unique pharmaceutical oils. Resequencing of a set of inbred tree genomes revealed regions of strongly conserved heterozygosity, likely hotspots of inbreeding depression. The resequenced genome of the sister species E. globulus underscored the high inter-specific genome colinearity despite substantial genome size variation in the genus. The genome of E. grandis is the first reference for the early diverging Rosid order Myrtales and is placed here basal to the Eurosids. This resource expands knowledge on the unique biology of large woody perennials and provides a powerful tool to accelerate comparative biology, breeding and biotechnology.

  15. [Genome editing of industrial microorganism].

    Science.gov (United States)

    Zhu, Linjiang; Li, Qi

    2015-03-01

    Genome editing is defined as highly-effective and precise modification of cellular genome in a large scale. In recent years, such genome-editing methods have been rapidly developed in the field of industrial strain improvement. The quickly-updating methods thoroughly change the old mode of inefficient genetic modification, which is "one modification, one selection marker, and one target site". Highly-effective modification mode in genome editing have been developed including simultaneous modification of multiplex genes, highly-effective insertion, replacement, and deletion of target genes in the genome scale, cut-paste of a large DNA fragment. These new tools for microbial genome editing will certainly be applied widely, and increase the efficiency of industrial strain improvement, and promote the revolution of traditional fermentation industry and rapid development of novel industrial biotechnology like production of biofuel and biomaterial. The technological principle of these genome-editing methods and their applications were summarized in this review, which can benefit engineering and construction of industrial microorganism.

  16. Genome size variation in the genus Avena.

    Science.gov (United States)

    Yan, Honghai; Martin, Sara L; Bekele, Wubishet A; Latta, Robert G; Diederichsen, Axel; Peng, Yuanying; Tinker, Nicholas A

    2016-03-01

    Genome size is an indicator of evolutionary distance and a metric for genome characterization. Here, we report accurate estimates of genome size in 99 accessions from 26 species of Avena. We demonstrate that the average genome size of C genome diploid species (2C = 10.26 pg) is 15% larger than that of A genome species (2C = 8.95 pg), and that this difference likely accounts for a progression of size among tetraploid species, where AB genome configuration had similar genome sizes (average 2C = 25.74 pg). Genome size was mostly consistent within species and in general agreement with current information about evolutionary distance among species. Results also suggest that most of the polyploid species in Avena have experienced genome downsizing in relation to their diploid progenitors. Genome size measurements could provide additional quality control for species identification in germplasm collections, especially in cases where diploid and polyploid species have similar morphology.

  17. OryzaGenome: Genome Diversity Database of Wild Oryza Species

    KAUST Repository

    Ohyanagi, Hajime; Ebata, Toshinobu; Huang, Xuehui; Gong, Hao; Fujita, Masahiro; Mochizuki, Takako; Toyoda, Atsushi; Fujiyama, Asao; Kaminuma, Eli; Nakamura, Yasukazu; Feng, Qi; Wang, Zi Xuan; Han, Bin; Kurata, Nori

    2015-01-01

    . Portable VCF (variant call format) file or tabdelimited file download is also available. Following these SNP (single nucleotide polymorphism) data, reference pseudomolecules/ scaffolds/contigs and genome-wide variation information for almost all

  18. Genome Modeling System: A Knowledge Management Platform for Genomics.

    Directory of Open Access Journals (Sweden)

    Malachi Griffith

    2015-07-01

    Full Text Available In this work, we present the Genome Modeling System (GMS, an analysis information management system capable of executing automated genome analysis pipelines at a massive scale. The GMS framework provides detailed tracking of samples and data coupled with reliable and repeatable analysis pipelines. The GMS also serves as a platform for bioinformatics development, allowing a large team to collaborate on data analysis, or an individual researcher to leverage the work of others effectively within its data management system. Rather than separating ad-hoc analysis from rigorous, reproducible pipelines, the GMS promotes systematic integration between the two. As a demonstration of the GMS, we performed an integrated analysis of whole genome, exome and transcriptome sequencing data from a breast cancer cell line (HCC1395 and matched lymphoblastoid line (HCC1395BL. These data are available for users to test the software, complete tutorials and develop novel GMS pipeline configurations. The GMS is available at https://github.com/genome/gms.

  19. Comparative genomics reveals insights into avian genome evolution and adaptation

    Science.gov (United States)

    Zhang, Guojie; Li, Cai; Li, Qiye; Li, Bo; Larkin, Denis M.; Lee, Chul; Storz, Jay F.; Antunes, Agostinho; Greenwold, Matthew J.; Meredith, Robert W.; Ödeen, Anders; Cui, Jie; Zhou, Qi; Xu, Luohao; Pan, Hailin; Wang, Zongji; Jin, Lijun; Zhang, Pei; Hu, Haofu; Yang, Wei; Hu, Jiang; Xiao, Jin; Yang, Zhikai; Liu, Yang; Xie, Qiaolin; Yu, Hao; Lian, Jinmin; Wen, Ping; Zhang, Fang; Li, Hui; Zeng, Yongli; Xiong, Zijun; Liu, Shiping; Zhou, Long; Huang, Zhiyong; An, Na; Wang, Jie; Zheng, Qiumei; Xiong, Yingqi; Wang, Guangbiao; Wang, Bo; Wang, Jingjing; Fan, Yu; da Fonseca, Rute R.; Alfaro-Núñez, Alonzo; Schubert, Mikkel; Orlando, Ludovic; Mourier, Tobias; Howard, Jason T.; Ganapathy, Ganeshkumar; Pfenning, Andreas; Whitney, Osceola; Rivas, Miriam V.; Hara, Erina; Smith, Julia; Farré, Marta; Narayan, Jitendra; Slavov, Gancho; Romanov, Michael N; Borges, Rui; Machado, João Paulo; Khan, Imran; Springer, Mark S.; Gatesy, John; Hoffmann, Federico G.; Opazo, Juan C.; Håstad, Olle; Sawyer, Roger H.; Kim, Heebal; Kim, Kyu-Won; Kim, Hyeon Jeong; Cho, Seoae; Li, Ning; Huang, Yinhua; Bruford, Michael W.; Zhan, Xiangjiang; Dixon, Andrew; Bertelsen, Mads F.; Derryberry, Elizabeth; Warren, Wesley; Wilson, Richard K; Li, Shengbin; Ray, David A.; Green, Richard E.; O’Brien, Stephen J.; Griffin, Darren; Johnson, Warren E.; Haussler, David; Ryder, Oliver A.; Willerslev, Eske; Graves, Gary R.; Alström, Per; Fjeldså, Jon; Mindell, David P.; Edwards, Scott V.; Braun, Edward L.; Rahbek, Carsten; Burt, David W.; Houde, Peter; Zhang, Yong; Yang, Huanming; Wang, Jian; Jarvis, Erich D.; Gilbert, M. Thomas P.; Wang, Jun

    2015-01-01

    Birds are the most species-rich class of tetrapod vertebrates and have wide relevance across many research fields. We explored bird macroevolution using full genomes from 48 avian species representing all major extant clades. The avian genome is principally characterized by its constrained size, which predominantly arose because of lineage-specific erosion of repetitive elements, large segmental deletions, and gene loss. Avian genomes furthermore show a remarkably high degree of evolutionary stasis at the levels of nucleotide sequence, gene synteny, and chromosomal structure. Despite this pattern of conservation, we detected many non-neutral evolutionary changes in protein-coding genes and noncoding regions. These analyses reveal that pan-avian genomic diversity covaries with adaptations to different lifestyles and convergent evolution of traits. PMID:25504712

  20. The bonobo genome compared with the chimpanzee and human genomes

    Science.gov (United States)

    Prüfer, Kay; Munch, Kasper; Hellmann, Ines; Akagi, Keiko; Miller, Jason R.; Walenz, Brian; Koren, Sergey; Sutton, Granger; Kodira, Chinnappa; Winer, Roger; Knight, James R.; Mullikin, James C.; Meader, Stephen J.; Ponting, Chris P.; Lunter, Gerton; Higashino, Saneyuki; Hobolth, Asger; Dutheil, Julien; Karakoç, Emre; Alkan, Can; Sajjadian, Saba; Catacchio, Claudia Rita; Ventura, Mario; Marques-Bonet, Tomas; Eichler, Evan E.; André, Claudine; Atencia, Rebeca; Mugisha, Lawrence; Junhold, Jörg; Patterson, Nick; Siebauer, Michael; Good, Jeffrey M.; Fischer, Anne; Ptak, Susan E.; Lachmann, Michael; Symer, David E.; Mailund, Thomas; Schierup, Mikkel H.; Andrés, Aida M.; Kelso, Janet; Pääbo, Svante

    2012-01-01

    Two African apes are the closest living relatives of humans: the chimpanzee (Pan troglodytes) and the bonobo (Pan paniscus). Although they are similar in many respects, bonobos and chimpanzees differ strikingly in key social and sexual behaviours1–4, and for some of these traits they show more similarity with humans than with each other. Here we report the sequencing and assembly of the bonobo genome to study its evolutionary relationship with the chimpanzee and human genomes. We find that more than three per cent of the human genome is more closely related to either the bonobo or the chimpanzee genome than these are to each other. These regions allow various aspects of the ancestry of the two ape species to be reconstructed. In addition, many of the regions that overlap genes may eventually help us understand the genetic basis of phenotypes that humans share with one of the two apes to the exclusion of the other. PMID:22722832

  1. Capturing prokaryotic dark matter genomes.

    Science.gov (United States)

    Gasc, Cyrielle; Ribière, Céline; Parisot, Nicolas; Beugnot, Réjane; Defois, Clémence; Petit-Biderre, Corinne; Boucher, Delphine; Peyretaillade, Eric; Peyret, Pierre

    2015-12-01

    Prokaryotes are the most diverse and abundant cellular life forms on Earth. Most of them, identified by indirect molecular approaches, belong to microbial dark matter. The advent of metagenomic and single-cell genomic approaches has highlighted the metabolic capabilities of numerous members of this dark matter through genome reconstruction. Thus, linking functions back to the species has revolutionized our understanding of how ecosystem function is sustained by the microbial world. This review will present discoveries acquired through the illumination of prokaryotic dark matter genomes by these innovative approaches. Copyright © 2015 Institut Pasteur. Published by Elsevier Masson SAS. All rights reserved.

  2. Human genome. 1993 Program report

    Energy Technology Data Exchange (ETDEWEB)

    1994-03-01

    The purpose of this report is to update the Human Genome 1991-92 Program Report and provide new information on the DOE genome program to researchers, program managers, other government agencies, and the interested public. This FY 1993 supplement includes abstracts of 60 new or renewed projects and listings of 112 continuing and 28 completed projects. These two reports, taken together, present the most complete published view of the DOE Human Genome Program through FY 1993. Research is progressing rapidly toward 15-year goals of mapping and sequencing the DNA of each of the 24 different human chromosomes.

  3. Implementing Genome-Driven Oncology

    Science.gov (United States)

    Hyman, David M.; Taylor, Barry S.; Baselga, José

    2017-01-01

    Early successes in identifying and targeting individual oncogenic drivers, together with the increasing feasibility of sequencing tumor genomes, have brought forth the promise of genome-driven oncology care. As we expand the breadth and depth of genomic analyses, the biological and clinical complexity of its implementation will be unparalleled. Challenges include target credentialing and validation, implementing drug combinations, clinical trial designs, targeting tumor heterogeneity, and deploying technologies beyond DNA sequencing, among others. We review how contemporary approaches are tackling these challenges and will ultimately serve as an engine for biological discovery and increase our insight into cancer and its treatment. PMID:28187282

  4. Deep whole-genome sequencing of 90 Han Chinese genomes.

    Science.gov (United States)

    Lan, Tianming; Lin, Haoxiang; Zhu, Wenjuan; Laurent, Tellier Christian Asker Melchior; Yang, Mengcheng; Liu, Xin; Wang, Jun; Wang, Jian; Yang, Huanming; Xu, Xun; Guo, Xiaosen

    2017-09-01

    Next-generation sequencing provides a high-resolution insight into human genetic information. However, the focus of previous studies has primarily been on low-coverage data due to the high cost of sequencing. Although the 1000 Genomes Project and the Haplotype Reference Consortium have both provided powerful reference panels for imputation, low-frequency and novel variants remain difficult to discover and call with accuracy on the basis of low-coverage data. Deep sequencing provides an optimal solution for the problem of these low-frequency and novel variants. Although whole-exome sequencing is also a viable choice for exome regions, it cannot account for noncoding regions, sometimes resulting in the absence of important, causal variants. For Han Chinese populations, the majority of variants have been discovered based upon low-coverage data from the 1000 Genomes Project. However, high-coverage, whole-genome sequencing data are limited for any population, and a large amount of low-frequency, population-specific variants remain uncharacterized. We have performed whole-genome sequencing at a high depth (∼×80) of 90 unrelated individuals of Chinese ancestry, collected from the 1000 Genomes Project samples, including 45 Northern Han Chinese and 45 Southern Han Chinese samples. Eighty-three of these 90 have been sequenced by the 1000 Genomes Project. We have identified 12 568 804 single nucleotide polymorphisms, 2 074 210 short InDels, and 26 142 structural variations from these 90 samples. Compared to the Han Chinese data from the 1000 Genomes Project, we have found 7 000 629 novel variants with low frequency (defined as minor allele frequency genome. Compared to the 1000 Genomes Project, these Han Chinese deep sequencing data enhance the characterization of a large number of low-frequency, novel variants. This will be a valuable resource for promoting Chinese genetics research and medical development. Additionally, it will provide a valuable supplement to the 1000

  5. Identification of genomic sites for CRISPR/Cas9-based genome editing in the Vitis vinifera genome

    Science.gov (United States)

    CRISPR/Cas9 has been recently demonstrated as an effective and popular genome editing tool for modifying genomes of human, animals, microorganisms, and plants. Success of such genome editing is highly dependent on the availability of suitable target sites in the genomes to be edited. Many specific t...

  6. Genomics and the human genome project: implications for psychiatry

    OpenAIRE

    Kelsoe, J R

    2004-01-01

    In the past decade the Human Genome Project has made extraordinary strides in understanding of fundamental human genetics. The complete human genetic sequence has been determined, and the chromosomal location of almost all human genes identified. Presently, a large international consortium, the HapMap Project, is working to identify a large portion of genetic variation in different human populations and the structure and relationship of these variants to each other. The Human Genome Project h...

  7. Challenges in Whole-Genome Annotation of Pyrosequenced Eukaryotic Genomes

    Energy Technology Data Exchange (ETDEWEB)

    Kuo, Alan; Grigoriev, Igor

    2009-04-17

    Pyrosequencing technologies such as 454/Roche and Solexa/Illumina vastly lower the cost of nucleotide sequencing compared to the traditional Sanger method, and thus promise to greatly expand the number of sequenced eukaryotic genomes. However, the new technologies also bring new challenges such as shorter reads and new kinds and higher rates of sequencing errors, which complicate genome assembly and gene prediction. At JGI we are deploying 454 technology for the sequencing and assembly of ever-larger eukaryotic genomes. Here we describe our first whole-genome annotation of a purely 454-sequenced fungal genome that is larger than a yeast (>30 Mbp). The pezizomycotine (filamentous ascomycote) Aspergillus carbonarius belongs to the Aspergillus section Nigri species complex, members of which are significant as platforms for bioenergy and bioindustrial technology, as members of soil microbial communities and players in the global carbon cycle, and as agricultural toxigens. Application of a modified version of the standard JGI Annotation Pipeline has so far predicted ~;;10k genes. ~;;12percent of these preliminary annotations suffer a potential frameshift error, which is somewhat higher than the ~;;9percent rate in the Sanger-sequenced and conventionally assembled and annotated genome of fellow Aspergillus section Nigri member A. niger. Also,>90percent of A. niger genes have potential homologs in the A. carbonarius preliminary annotation. Weconclude, and with further annotation and comparative analysis expect to confirm, that 454 sequencing strategies provide a promising substrate for annotation of modestly sized eukaryotic genomes. We will also present results of annotation of a number of other pyrosequenced fungal genomes of bioenergy interest.

  8. Genome projects and the functional-genomic era.

    Science.gov (United States)

    Sauer, Sascha; Konthur, Zoltán; Lehrach, Hans

    2005-12-01

    The problems we face today in public health as a result of the -- fortunately -- increasing age of people and the requirements of developing countries create an urgent need for new and innovative approaches in medicine and in agronomics. Genomic and functional genomic approaches have a great potential to at least partially solve these problems in the future. Important progress has been made by procedures to decode genomic information of humans, but also of other key organisms. The basic comprehension of genomic information (and its transfer) should now give us the possibility to pursue the next important step in life science eventually leading to a basic understanding of biological information flow; the elucidation of the function of all genes and correlative products encoded in the genome, as well as the discovery of their interactions in a molecular context and the response to environmental factors. As a result of the sequencing projects, we are now able to ask important questions about sequence variation and can start to comprehensively study the function of expressed genes on different levels such as RNA, protein or the cell in a systematic context including underlying networks. In this article we review and comment on current trends in large-scale systematic biological research. A particular emphasis is put on technology developments that can provide means to accomplish the tasks of future lines of functional genomics.

  9. The Amaranth Genome: Genome, Transcriptome, and Physical Map Assembly

    Directory of Open Access Journals (Sweden)

    J. W. Clouse

    2016-03-01

    Full Text Available Amaranth ( L. is an emerging pseudocereal native to the New World that has garnered increased attention in recent years because of its nutritional quality, in particular its seed protein and more specifically its high levels of the essential amino acid lysine. It belongs to the Amaranthaceae family, is an ancient paleopolyploid that shows disomic inheritance (2 = 32, and has an estimated genome size of 466 Mb. Here we present a high-quality draft genome sequence of the grain amaranth. The genome assembly consisted of 377 Mb in 3518 scaffolds with an N of 371 kb. Repetitive element analysis predicted that 48% of the genome is comprised of repeat sequences, of which -like elements were the most commonly classified retrotransposon. A de novo transcriptome consisting of 66,370 contigs was assembled from eight different amaranth tissue and abiotic stress libraries. Annotation of the genome identified 23,059 protein-coding genes. Seven grain amaranths (, , and and their putative progenitor ( were resequenced. A single nucleotide polymorphism (SNP phylogeny supported the classification of as the progenitor species of the grain amaranths. Lastly, we generated a de novo physical map for using the BioNano Genomics’ Genome Mapping platform. The physical map spanned 340 Mb and a hybrid assembly using the BioNano physical maps nearly doubled the N of the assembly to 697 kb. Moreover, we analyzed synteny between amaranth and sugar beet ( L. and estimated, using analysis, the age of the most recent polyploidization event in amaranth.

  10. Whole genome sequencing and bioinformatics analysis of two Egyptian genomes.

    Science.gov (United States)

    ElHefnawi, Mahmoud; Jeon, Sungwon; Bhak, Youngjune; ElFiky, Asmaa; Horaiz, Ahmed; Jun, JeHoon; Kim, Hyunho; Bhak, Jong

    2018-05-15

    We report two Egyptian male genomes (EGP1 and EGP2) sequenced at ~ 30× sequencing depths. EGP1 had 4.7 million variants, where 198,877 were novel variants while EGP2 had 209,109 novel variants out of 4.8 million variants. The mitochondrial haplogroup of the two individuals were identified to be H7b1 and L2a1c, respectively. We also identified the Y haplogroup of EGP1 (R1b) and EGP2 (J1a2a1a2 > P58 > FGC11). EGP1 had a mutation in the NADH gene of the mitochondrial genome ND4 (m.11778 G > A) that causes Leber's hereditary optic neuropathy. Some SNPs shared by the two genomes were associated with an increased level of cholesterol and triglycerides, probably related with Egyptians obesity. Comparison of these genomes with African and Western-Asian genomes can provide insights on Egyptian ancestry and genetic history. This resource can be used to further understand genomic diversity and functional classification of variants as well as human migration and evolution across Africa and Western-Asia. Copyright © 2017. Published by Elsevier B.V.

  11. Comparative genomic hybridization.

    Science.gov (United States)

    Pinkel, Daniel; Albertson, Donna G

    2005-01-01

    Altering DNA copy number is one of the many ways that gene expression and function may be modified. Some variations are found among normal individuals ( 14, 35, 103 ), others occur in the course of normal processes in some species ( 33 ), and still others participate in causing various disease states. For example, many defects in human development are due to gains and losses of chromosomes and chromosomal segments that occur prior to or shortly after fertilization, whereas DNA dosage alterations that occur in somatic cells are frequent contributors to cancer. Detecting these aberrations, and interpreting them within the context of broader knowledge, facilitates identification of critical genes and pathways involved in biological processes and diseases, and provides clinically relevant information. Over the past several years array comparative genomic hybridization (array CGH) has demonstrated its value for analyzing DNA copy number variations. In this review we discuss the state of the art of array CGH and its applications in medical genetics and cancer, emphasizing general concepts rather than specific results.

  12. Functional Genomics Group. Program Description

    National Research Council Canada - National Science Library

    Burian, Dennis

    2008-01-01

    .... This article reviews mechanisms of gene regulation and discusses how genomics is changing the way medicine is practiced today as a means of demonstrating that molecular medicine is here to stay...

  13. Genomic Resources for Cancer Epidemiology

    Science.gov (United States)

    This page provides links to research resources, complied by the Epidemiology and Genomics Research Program, that may be of interest to genetic epidemiologists conducting cancer research, but is not exhaustive.

  14. Fungal genomics beyond Saccharomyces cerevisiae?

    DEFF Research Database (Denmark)

    Hofmann, Gerald; Mcintyre, Mhairi; Nielsen, Jens

    2003-01-01

    Fungi are used extensively in both fundamental research and industrial applications. Saccharomyces cerevisiae has been the model organism for fungal research for many years, particularly in functional genomics. However, considering the diversity within the fungal kingdom, it is obvious...

  15. Genome engineering in human cells.

    Science.gov (United States)

    Song, Minjung; Kim, Young-Hoon; Kim, Jin-Soo; Kim, Hyongbum

    2014-01-01

    Genome editing in human cells is of great value in research, medicine, and biotechnology. Programmable nucleases including zinc-finger nucleases, transcription activator-like effector nucleases, and RNA-guided engineered nucleases recognize a specific target sequence and make a double-strand break at that site, which can result in gene disruption, gene insertion, gene correction, or chromosomal rearrangements. The target sequence complexities of these programmable nucleases are higher than 3.2 mega base pairs, the size of the haploid human genome. Here, we briefly introduce the structure of the human genome and the characteristics of each programmable nuclease, and review their applications in human cells including pluripotent stem cells. In addition, we discuss various delivery methods for nucleases, programmable nickases, and enrichment of gene-edited human cells, all of which facilitate efficient and precise genome editing in human cells.

  16. [Advances in microbial genome reduction and modification].

    Science.gov (United States)

    Wang, Jianli; Wang, Xiaoyuan

    2013-08-01

    Microbial genome reduction and modification are important strategies for constructing cellular chassis used for synthetic biology. This article summarized the essential genes and the methods to identify them in microorganisms, compared various strategies for microbial genome reduction, and analyzed the characteristics of some microorganisms with the minimized genome. This review shows the important role of genome reduction in constructing cellular chassis.

  17. 2004 Structural, Function and Evolutionary Genomics

    Energy Technology Data Exchange (ETDEWEB)

    Douglas L. Brutlag Nancy Ryan Gray

    2005-03-23

    This Gordon conference will cover the areas of structural, functional and evolutionary genomics. It will take a systematic approach to genomics, examining the evolution of proteins, protein functional sites, protein-protein interactions, regulatory networks, and metabolic networks. Emphasis will be placed on what we can learn from comparative genomics and entire genomes and proteomes.

  18. Draft Genome Sequence of Lactobacillus rhamnosus 2166.

    OpenAIRE

    Karlyshev, Andrey V.; Melnikov, Vyacheslav G.; Kosarev, Igor V.; Abramov, Vyacheslav M.

    2014-01-01

    In this report, we present a draft sequence of the genome of Lactobacillus rhamnosus strain 2166, a potential novel probiotic. Genome annotation and read mapping onto a reference genome of L. rhamnosus strain GG allowed for the identification of the differences and similarities in the genomic contents and gene arrangements of these strains.

  19. Genomic instability and radiation effects

    International Nuclear Information System (INIS)

    Christian Streffer

    2007-01-01

    Complete text of publication follows. Cancer, genetic mutations and developmental abnormalities are apparently associated with an increased genomic instability. Such phenomena have been frequently shown in human cancer cells in vitro and in situ. It is also well-known that individuals with a genetic predisposition for cancer proneness, such as ataxia telangiectesia, Fanconi anaemia etc. demonstrate a general high genomic instability e.g. in peripheral lymphocytes before a cancer has developed. Analogous data have been found in mice which develop a specific congenital malformation which has a genetic background. Under these aspects it is of high interest that ionising radiation can increase the genomic instability of mammalian cells after exposures in vitro an in vivo. This phenomenon is expressed 20 to 40 cell cycles after the exposure e.g. by de novo chromosomal aberrations. Such effects have been observed with high and low LET radiation, high LET radiation is more efficient. With low LET radiation a good dose response is observed in the dose range 0.2 to 2.0 Gy, Recently it has been reported that senescence and genomic instability was induced in human fibroblasts after 1 mGy carbon ions (1 in 18 cells are hit), apparently bystander effects also occurred under these conditions. The instability has been shown with DNA damage, chromosomal aberrations, gene mutation and cell death. It is also transferred to the next generation of mice with respect to gene mutations, chromosomal aberrations and congenital malformations. Several mechanisms have been discussed. The involvement of telomeres has gained interest. Genomic instability seems to be induced by a general lesion to the whole genome. The transmission of one chromosome from an irradiated cell to an non-irradiated cell leads to genomic instability in the untreated cells. Genomic instability increases mutation rates in the affected cells in general. As radiation late effects (cancer, gene mutations and congenital

  20. Genomic diversity of Lactobacillus salivarius

    OpenAIRE

    Raftis, Emma J.

    2015-01-01

    Lactobacillus salivarius is unusual among the lactobacilli due to its multireplicon genome architecture. The circular megaplasmids harboured by L. salivarius strains encode strain-specific traits for intestinal survival and probiotic activity. L. salivarius strains are increasingly being exploited for their probiotic properties in humans and animals. In terms of probiotic strain selection, it is important to have an understanding of the level of genomic diversity present in this species. Comp...

  1. Population Genomics of Paramecium Species.

    Science.gov (United States)

    Johri, Parul; Krenek, Sascha; Marinov, Georgi K; Doak, Thomas G; Berendonk, Thomas U; Lynch, Michael

    2017-05-01

    Population-genomic analyses are essential to understanding factors shaping genomic variation and lineage-specific sequence constraints. The dearth of such analyses for unicellular eukaryotes prompted us to assess genomic variation in Paramecium, one of the most well-studied ciliate genera. The Paramecium aurelia complex consists of ∼15 morphologically indistinguishable species that diverged subsequent to two rounds of whole-genome duplications (WGDs, as long as 320 MYA) and possess extremely streamlined genomes. We examine patterns of both nuclear and mitochondrial polymorphism, by sequencing whole genomes of 10-13 worldwide isolates of each of three species belonging to the P. aurelia complex: P. tetraurelia, P. biaurelia, P. sexaurelia, as well as two outgroup species that do not share the WGDs: P. caudatum and P. multimicronucleatum. An apparent absence of global geographic population structure suggests continuous or recent dispersal of Paramecium over long distances. Intergenic regions are highly constrained relative to coding sequences, especially in P. caudatum and P. multimicronucleatum that have shorter intergenic distances. Sequence diversity and divergence are reduced up to ∼100-150 bp both upstream and downstream of genes, suggesting strong constraints imposed by the presence of densely packed regulatory modules. In addition, comparison of sequence variation at non-synonymous and synonymous sites suggests similar recent selective pressures on paralogs within and orthologs across the deeply diverging species. This study presents the first genome-wide population-genomic analysis in ciliates and provides a valuable resource for future studies in evolutionary and functional genetics in Paramecium. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  2. Reconstructing ancient genomes and epigenomes

    DEFF Research Database (Denmark)

    Orlando, Ludovic Antoine Alexandre; Gilbert, M. Thomas P.; Willerslev, Eske

    2015-01-01

    DNA studies have now progressed to whole-genome sequencing for an increasing number of ancient individuals and extinct species, as well as to epigenomic characterization. Such advances have enabled the sequencing of specimens of up to 1 million years old, which, owing to their extensive DNA damage...... and contamination, were previously not amenable to genetic analyses. In this Review, we discuss these varied technical challenges and solutions for sequencing ancient genomes and epigenomes....

  3. GOBASE: an organelle genome database

    OpenAIRE

    O?Brien, Emmet A.; Zhang, Yue; Wang, Eric; Marie, Veronique; Badejoko, Wole; Lang, B. Franz; Burger, Gertraud

    2008-01-01

    The organelle genome database GOBASE, now in its 21st release (June 2008), contains all published mitochondrion-encoded sequences (?913 000) and chloroplast-encoded sequences (?250 000) from a wide range of eukaryotic taxa. For all sequences, information on related genes, exons, introns, gene products and taxonomy is available, as well as selected genome maps and RNA secondary structures. Recent major enhancements to database functionality include: (i) addition of an interface for RNA editing...

  4. The fishes of Genome 10K

    KAUST Repository

    Bernardi, Giacomo

    2012-09-01

    The Genome 10K project aims to sequence the genomes of 10,000 vertebrates, representing approximately one genome for each vertebrate genus. Since fishes (cartilaginous fishes, ray-finned fishes and lobe-finned fishes) represent more than 50% of extant vertebrates, it is planned to target 4,000 fish genomes. At present, nearly 60 fish genomes are being sequenced at various public funded labs, and under a Genome 10K and BGI pilot project. An additional 100 fishes have been identified for sequencing in the next phase of Genome 10K project. © 2012 Elsevier B.V.

  5. The integrated microbial genome resource of analysis.

    Science.gov (United States)

    Checcucci, Alice; Mengoni, Alessio

    2015-01-01

    Integrated Microbial Genomes and Metagenomes (IMG) is a biocomputational system that allows to provide information and support for annotation and comparative analysis of microbial genomes and metagenomes. IMG has been developed by the US Department of Energy (DOE)-Joint Genome Institute (JGI). IMG platform contains both draft and complete genomes, sequenced by Joint Genome Institute and other public and available genomes. Genomes of strains belonging to Archaea, Bacteria, and Eukarya domains are present as well as those of viruses and plasmids. Here, we provide some essential features of IMG system and case study for pangenome analysis.

  6. The fishes of Genome 10K

    KAUST Repository

    Bernardi, Giacomo; Wiley, Edward O.; Mansour, Hicham; Miller, Michael R.; Ortí , Guillermo; Haussler, David H.; O'Brien, Stephen J O; Ryder, Oliver A.; Venkatesh, Byrappa

    2012-01-01

    The Genome 10K project aims to sequence the genomes of 10,000 vertebrates, representing approximately one genome for each vertebrate genus. Since fishes (cartilaginous fishes, ray-finned fishes and lobe-finned fishes) represent more than 50% of extant vertebrates, it is planned to target 4,000 fish genomes. At present, nearly 60 fish genomes are being sequenced at various public funded labs, and under a Genome 10K and BGI pilot project. An additional 100 fishes have been identified for sequencing in the next phase of Genome 10K project. © 2012 Elsevier B.V.

  7. The dynamic genome of Hydra

    Science.gov (United States)

    Chapman, Jarrod A.; Kirkness, Ewen F.; Simakov, Oleg; Hampson, Steven E.; Mitros, Therese; Weinmaier, Therese; Rattei, Thomas; Balasubramanian, Prakash G.; Borman, Jon; Busam, Dana; Disbennett, Kathryn; Pfannkoch, Cynthia; Sumin, Nadezhda; Sutton, Granger G.; Viswanathan, Lakshmi Devi; Walenz, Brian; Goodstein, David M.; Hellsten, Uffe; Kawashima, Takeshi; Prochnik, Simon E.; Putnam, Nicholas H.; Shu, Shengquiang; Blumberg, Bruce; Dana, Catherine E.; Gee, Lydia; Kibler, Dennis F.; Law, Lee; Lindgens, Dirk; Martinez, Daniel E.; Peng, Jisong; Wigge, Philip A.; Bertulat, Bianca; Guder, Corina; Nakamura, Yukio; Ozbek, Suat; Watanabe, Hiroshi; Khalturin, Konstantin; Hemmrich, Georg; Franke, André; Augustin, René; Fraune, Sebastian; Hayakawa, Eisuke; Hayakawa, Shiho; Hirose, Mamiko; Hwang, Jung Shan; Ikeo, Kazuho; Nishimiya-Fujisawa, Chiemi; Ogura, Atshushi; Takahashi, Toshio; Steinmetz, Patrick R. H.; Zhang, Xiaoming; Aufschnaiter, Roland; Eder, Marie-Kristin; Gorny, Anne-Kathrin; Salvenmoser, Willi; Heimberg, Alysha M.; Wheeler, Benjamin M.; Peterson, Kevin J.; Böttger, Angelika; Tischler, Patrick; Wolf, Alexander; Gojobori, Takashi; Remington, Karin A.; Strausberg, Robert L.; Venter, J. Craig; Technau, Ulrich; Hobmayer, Bert; Bosch, Thomas C. G.; Holstein, Thomas W.; Fujisawa, Toshitaka; Bode, Hans R.; David, Charles N.; Rokhsar, Daniel S.; Steele, Robert E.

    2015-01-01

    The freshwater cnidarian Hydra was first described in 17021 and has been the object of study for 300 years. Experimental studies of Hydra between 1736 and 1744 culminated in the discovery of asexual reproduction of an animal by budding, the first description of regeneration in an animal, and successful transplantation of tissue between animals2. Today, Hydra is an important model for studies of axial patterning3, stem cell biology4 and regeneration5. Here we report the genome of Hydra magnipapillata and compare it to the genomes of the anthozoan Nematostella vectensis6 and other animals. The Hydra genome has been shaped by bursts of transposable element expansion, horizontal gene transfer, trans-splicing, and simplification of gene structure and gene content that parallel simplification of the Hydra life cycle. We also report the sequence of the genome of a novel bacterium stably associated with H. magnipapillata. Comparisons of the Hydra genome to the genomes of other animals shed light on the evolution of epithelia, contractile tissues, developmentally regulated transcription factors, the Spemann–Mangold organizer, pluripotency genes and the neuromuscular junction. PMID:20228792

  8. Universal pacemaker of genome evolution.

    Science.gov (United States)

    Snir, Sagi; Wolf, Yuri I; Koonin, Eugene V

    2012-01-01

    A fundamental observation of comparative genomics is that the distribution of evolution rates across the complete sets of orthologous genes in pairs of related genomes remains virtually unchanged throughout the evolution of life, from bacteria to mammals. The most straightforward explanation for the conservation of this distribution appears to be that the relative evolution rates of all genes remain nearly constant, or in other words, that evolutionary rates of different genes are strongly correlated within each evolving genome. This correlation could be explained by a model that we denoted Universal PaceMaker (UPM) of genome evolution. The UPM model posits that the rate of evolution changes synchronously across genome-wide sets of genes in all evolving lineages. Alternatively, however, the correlation between the evolutionary rates of genes could be a simple consequence of molecular clock (MC). We sought to differentiate between the MC and UPM models by fitting thousands of phylogenetic trees for bacterial and archaeal genes to supertrees that reflect the dominant trend of vertical descent in the evolution of archaea and bacteria and that were constrained according to the two models. The goodness of fit for the UPM model was better than the fit for the MC model, with overwhelming statistical significance, although similarly to the MC, the UPM is strongly overdispersed. Thus, the results of this analysis reveal a universal, genome-wide pacemaker of evolution that could have been in operation throughout the history of life.

  9. A Genome-Wide Landscape of Retrocopies in Primate Genomes.

    Science.gov (United States)

    Navarro, Fábio C P; Galante, Pedro A F

    2015-07-29

    Gene duplication is a key factor contributing to phenotype diversity across and within species. Although the availability of complete genomes has led to the extensive study of genomic duplications, the dynamics and variability of gene duplications mediated by retrotransposition are not well understood. Here, we predict mRNA retrotransposition and use comparative genomics to investigate their origin and variability across primates. Analyzing seven anthropoid primate genomes, we found a similar number of mRNA retrotranspositions (∼7,500 retrocopies) in Catarrhini (Old Word Monkeys, including humans), but a surprising large number of retrocopies (∼10,000) in Platyrrhini (New World Monkeys), which may be a by-product of higher long interspersed nuclear element 1 activity in these genomes. By inferring retrocopy orthology, we dated most of the primate retrocopy origins, and estimated a decrease in the fixation rate in recent primate history, implying a smaller number of species-specific retrocopies. Moreover, using RNA-Seq data, we identified approximately 3,600 expressed retrocopies. As expected, most of these retrocopies are located near or within known genes, present tissue-specific and even species-specific expression patterns, and no expression correlation to their parental genes. Taken together, our results provide further evidence that mRNA retrotransposition is an active mechanism in primate evolution and suggest that retrocopies may not only introduce great genetic variability between lineages but also create a large reservoir of potentially functional new genomic loci in primate genomes. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  10. HGVA: the Human Genome Variation Archive

    OpenAIRE

    Lopez, Javier; Coll, Jacobo; Haimel, Matthias; Kandasamy, Swaathi; Tarraga, Joaquin; Furio-Tari, Pedro; Bari, Wasim; Bleda, Marta; Rueda, Antonio; Gr?f, Stefan; Rendon, Augusto; Dopazo, Joaquin; Medina, Ignacio

    2017-01-01

    Abstract High-profile genomic variation projects like the 1000 Genomes project or the Exome Aggregation Consortium, are generating a wealth of human genomic variation knowledge which can be used as an essential reference for identifying disease-causing genotypes. However, accessing these data, contrasting the various studies and integrating those data in downstream analyses remains cumbersome. The Human Genome Variation Archive (HGVA) tackles these challenges and facilitates access to genomic...

  11. Facility information management system; Shisetsu joho kanri system

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    2000-01-10

    A facility management system (FMS) was developed as a tool for efficiently operating and managing building facilities and related equipment. The maintenance management data is designed to be collected through automatic formation of data base by using a work flow function and releasing the daily business from paper work. The data base thus formed can be retrieved and displayed by utilizing a network system. The plan view for construction facilities is made a minute plan comparable to the ground plan by taking in DXF type drawing data such as a completion drawing, making it a colored display for example to create an intuitive expression understandable at first sight. The plan is controlled by the level including equipment classification and is capable of superimposed display. Detailed management data is displayed by mouse clicking of registered icons, allowing required information to be readily taken out. (translated by NEDO)

  12. Developing chemical information system; Henbosuru kagaku joho -intanetto no sekai

    Energy Technology Data Exchange (ETDEWEB)

    Chihara, H.

    1999-12-01

    With the internet's popularization, the chemical information system greatly changes. In this paper, recent development of a chemical information system using the internet is summarized. To begin with, the kinds of online information systems using WWW and how to use them are described. Next, features of the electronic journals and how to use them are described. Next, CAS and STN as internet editions of the secondary information are introduced. Next, the Scifinder and the SciFinder Scholar which CAS developed as information retrieval tools for researcher are explained well. Next, ISI and DIALOG are introduced as information retrieval services of the other web editions. Finally, realization of retrieval and display of the English database by Japanese and preparation of a fact database such as density, boiling point, spectra, etc. and the offer of them by the internet are mentioned as a future image of chemical information systems. (NEDO)

  13. Leading research on brainware; Nokino joho shori no sendo kenkyu

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1996-03-01

    Leading research on brainware is conducted to realize the engineering information processing based on the learning, memorization, association, intuition, value judgment, and motivation which are activities of human brains. For the highly integrated information society at the 21st century, it will be essential to establish human-like information processing technology which is considered to be difficult with the conventional computers. The R and D theme for this technology will focus on the development of novel devices and systems by eliciting the principles and key roles of information processing functions of the brain and in living organisms from both viewpoints of the science and engineering and the brain information science. It is considered that important research targets are in elucidating brain functions and the modeling and developing novel devices and systems, such as brain information architecture, neural devices, neural networks, and man-machine interface. Technical trend surveys in the USA, the UK, and Germany were also conducted. 347 refs., 58 figs., 7 tabs.

  14. The Genomic Code: Genome Evolution and Potential Applications

    KAUST Repository

    Bernardi, Giorgio

    2016-01-25

    The genome of metazoans is organized according to a genomic code which comprises three laws: 1) Compositional correlations hold between contiguous coding and non-coding sequences, as well as among the three codon positions of protein-coding genes; these correlations are the consequence of the fact that the genomes under consideration consist of fairly homogeneous, long (≥200Kb) sequences, the isochores; 2) Although isochores are defined on the basis of purely compositional properties, GC levels of isochores are correlated with all tested structural and functional properties of the genome; 3) GC levels of isochores are correlated with chromosome architecture from interphase to metaphase; in the case of interphase the correlation concerns isochores and the three-dimensional “topological associated domains” (TADs); in the case of mitotic chromosomes, the correlation concerns isochores and chromosomal bands. Finally, the genomic code is the fourth and last pillar of molecular biology, the first three pillars being 1) the double helix structure of DNA; 2) the regulation of gene expression in prokaryotes; and 3) the genetic code.

  15. Discovery of previously unidentified genomic disorders from the duplication architecture of the human genome

    NARCIS (Netherlands)

    Sharp, Andrew J.; Hansen, Sierra; Selzer, Rebecca R.; Cheng, Ze; Regan, Regina; Hurst, Jane A.; Stewart, Helen; Price, Sue M.; Blair, Edward; Hennekam, Raoul C.; Fitzpatrick, Carrie A.; Segraves, Rick; Richmond, Todd A.; Guiver, Cheryl; Albertson, Donna G.; Pinkel, Daniel; Eis, Peggy S.; Schwartz, Stuart; Knight, Samantha J. L.; Eichler, Evan E.

    2006-01-01

    Genomic disorders are characterized by the presence of flanking segmental duplications that predispose these regions to recurrent rearrangement. Based on the duplication architecture of the genome, we investigated 130 regions that we hypothesized as candidates for previously undescribed genomic

  16. Supplementary Material for: Whole genome sequencing reveals genomic heterogeneity and antibiotic purification in Mycobacterium tuberculosis isolates

    KAUST Repository

    Black, PA; Vos, M. de; Louw, GE; Merwe, RG van der; Dippenaar, A.; Streicher, EM; Abdallah, AM; Sampson, SL; Victor, TC; Dolby, T.; Simpson, JA; Helden, PD van; Warren, RM; Pain, Arnab

    2015-01-01

    Abstract Background Whole genome sequencing has revolutionised the interrogation of mycobacterial genomes. Recent studies have reported conflicting findings on the genomic stability of Mycobacterium tuberculosis during the evolution of drug

  17. Human Genome Education Program

    Energy Technology Data Exchange (ETDEWEB)

    Richard Myers; Lane Conn

    2000-05-01

    The funds from the DOE Human Genome Program, for the project period 2/1/96 through 1/31/98, have provided major support for the curriculum development and field testing efforts for two high school level instructional units: Unit 1, ''Exploring Genetic Conditions: Genes, Culture and Choices''; and Unit 2, ''DNA Snapshots: Peaking at Your DNA''. In the original proposal, they requested DOE support for the partial salary and benefits of a Field Test Coordinator position to: (1) complete the field testing and revision of two high school curriculum units, and (2) initiate the education of teachers using these units. During the project period of this two-year DOE grant, a part-time Field-Test Coordinator was hired (Ms. Geraldine Horsma) and significant progress has been made in both of the original proposal objectives. Field testing for Unit 1 has occurred in over 12 schools (local and non-local sites with diverse student populations). Field testing for Unit 2 has occurred in over 15 schools (local and non-local sites) and will continue in 12-15 schools during the 96-97 school year. For both curricula, field-test sites and site teachers were selected for their interest in genetics education and in hands-on science education. Many of the site teachers had no previous experience with HGEP or the unit under development. Both of these first-year biology curriculum units, which contain genetics, biotechnology, societal, ethical and cultural issues related to HGP, are being implemented in many local and non-local schools (SF Bay Area, Southern California, Nebraska, Hawaii, and Texas) and in programs for teachers. These units will reach over 10,000 students in the SF Bay Area and continues to receive support from local corporate and private philanthropic organizations. Although HGEP unit development is nearing completion for both units, data is still being gathered and analyzed on unit effectiveness and student learning. The final field

  18. Allele coding in genomic evaluation

    Directory of Open Access Journals (Sweden)

    Christensen Ole F

    2011-06-01

    Full Text Available Abstract Background Genomic data are used in animal breeding to assist genetic evaluation. Several models to estimate genomic breeding values have been studied. In general, two approaches have been used. One approach estimates the marker effects first and then, genomic breeding values are obtained by summing marker effects. In the second approach, genomic breeding values are estimated directly using an equivalent model with a genomic relationship matrix. Allele coding is the method chosen to assign values to the regression coefficients in the statistical model. A common allele coding is zero for the homozygous genotype of the first allele, one for the heterozygote, and two for the homozygous genotype for the other allele. Another common allele coding changes these regression coefficients by subtracting a value from each marker such that the mean of regression coefficients is zero within each marker. We call this centered allele coding. This study considered effects of different allele coding methods on inference. Both marker-based and equivalent models were considered, and restricted maximum likelihood and Bayesian methods were used in inference. Results Theoretical derivations showed that parameter estimates and estimated marker effects in marker-based models are the same irrespective of the allele coding, provided that the model has a fixed general mean. For the equivalent models, the same results hold, even though different allele coding methods lead to different genomic relationship matrices. Calculated genomic breeding values are independent of allele coding when the estimate of the general mean is included into the values. Reliabilities of estimated genomic breeding values calculated using elements of the inverse of the coefficient matrix depend on the allele coding because different allele coding methods imply different models. Finally, allele coding affects the mixing of Markov chain Monte Carlo algorithms, with the centered coding being

  19. Genomic selection in maritime pine.

    Science.gov (United States)

    Isik, Fikret; Bartholomé, Jérôme; Farjat, Alfredo; Chancerel, Emilie; Raffin, Annie; Sanchez, Leopoldo; Plomion, Christophe; Bouffier, Laurent

    2016-01-01

    A two-generation maritime pine (Pinus pinaster Ait.) breeding population (n=661) was genotyped using 2500 SNP markers. The extent of linkage disequilibrium and utility of genomic selection for growth and stem straightness improvement were investigated. The overall intra-chromosomal linkage disequilibrium was r(2)=0.01. Linkage disequilibrium corrected for genomic relationships derived from markers was smaller (rV(2)=0.006). Genomic BLUP, Bayesian ridge regression and Bayesian LASSO regression statistical models were used to obtain genomic estimated breeding values. Two validation methods (random sampling 50% of the population and 10% of the progeny generation as validation sets) were used with 100 replications. The average predictive ability across statistical models and validation methods was about 0.49 for stem sweep, and 0.47 and 0.43 for total height and tree diameter, respectively. The sensitivity analysis suggested that prior densities (variance explained by markers) had little or no discernible effect on posterior means (residual variance) in Bayesian prediction models. Sampling from the progeny generation for model validation increased the predictive ability of markers for tree diameter and stem sweep but not for total height. The results are promising despite low linkage disequilibrium and low marker coverage of the genome (∼1.39 markers/cM). Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  20. The genome of Prunus mume.

    Science.gov (United States)

    Zhang, Qixiang; Chen, Wenbin; Sun, Lidan; Zhao, Fangying; Huang, Bangqing; Yang, Weiru; Tao, Ye; Wang, Jia; Yuan, Zhiqiong; Fan, Guangyi; Xing, Zhen; Han, Changlei; Pan, Huitang; Zhong, Xiao; Shi, Wenfang; Liang, Xinming; Du, Dongliang; Sun, Fengming; Xu, Zongda; Hao, Ruijie; Lv, Tian; Lv, Yingmin; Zheng, Zequn; Sun, Ming; Luo, Le; Cai, Ming; Gao, Yike; Wang, Junyi; Yin, Ye; Xu, Xun; Cheng, Tangren; Wang, Jun

    2012-01-01

    Prunus mume (mei), which was domesticated in China more than 3,000 years ago as ornamental plant and fruit, is one of the first genomes among Prunus subfamilies of Rosaceae been sequenced. Here, we assemble a 280M genome by combining 101-fold next-generation sequencing and optical mapping data. We further anchor 83.9% of scaffolds to eight chromosomes with genetic map constructed by restriction-site-associated DNA sequencing. Combining P. mume genome with available data, we succeed in reconstructing nine ancestral chromosomes of Rosaceae family, as well as depicting chromosome fusion, fission and duplication history in three major subfamilies. We sequence the transcriptome of various tissues and perform genome-wide analysis to reveal the characteristics of P. mume, including its regulation of early blooming in endodormancy, immune response against bacterial infection and biosynthesis of flower scent. The P. mume genome sequence adds to our understanding of Rosaceae evolution and provides important data for improvement of fruit trees.

  1. The genome of Chenopodium quinoa

    KAUST Repository

    Jarvis, David Erwin; Ho, Yung Shwen; Lightfoot, Damien; Schmö ckel, Sandra M.; Li, Bo; Borm, Theo J. A.; Ohyanagi, Hajime; Mineta, Katsuhiko; Michell, Craig; Saber, Noha; Kharbatia, Najeh M.; Rupper, Ryan R.; Sharp, Aaron R.; Dally, Nadine; Boughton, Berin A.; Woo, Yong; Gao, Ge; Schijlen, Elio G. W. M.; Guo, Xiujie; Momin, Afaque Ahmad Imtiyaz; Negrã o, Só nia; Al-Babili, Salim; Gehring, Christoph A; Roessner, Ute; Jung, Christian; Murphy, Kevin; Arold, Stefan T.; Gojobori, Takashi; Linden, C. Gerard van der; Loo, Eibertus N. van; Jellen, Eric N.; Maughan, Peter J.; Tester, Mark A.

    2017-01-01

    Chenopodium quinoa (quinoa) is a highly nutritious grain identified as an important crop to improve world food security. Unfortunately, few resources are available to facilitate its genetic improvement. Here we report the assembly of a high-quality, chromosome-scale reference genome sequence for quinoa, which was produced using single-molecule real-time sequencing in combination with optical, chromosome-contact and genetic maps. We also report the sequencing of two diploids from the ancestral gene pools of quinoa, which enables the identification of sub-genomes in quinoa, and reduced-coverage genome sequences for 22 other samples of the allotetraploid goosefoot complex. The genome sequence facilitated the identification of the transcription factor likely to control the production of anti-nutritional triterpenoid saponins found in quinoa seeds, including a mutation that appears to cause alternative splicing and a premature stop codon in sweet quinoa strains. These genomic resources are an important first step towards the genetic improvement of quinoa.

  2. The genome of Chenopodium quinoa

    KAUST Repository

    Jarvis, David Erwin

    2017-02-08

    Chenopodium quinoa (quinoa) is a highly nutritious grain identified as an important crop to improve world food security. Unfortunately, few resources are available to facilitate its genetic improvement. Here we report the assembly of a high-quality, chromosome-scale reference genome sequence for quinoa, which was produced using single-molecule real-time sequencing in combination with optical, chromosome-contact and genetic maps. We also report the sequencing of two diploids from the ancestral gene pools of quinoa, which enables the identification of sub-genomes in quinoa, and reduced-coverage genome sequences for 22 other samples of the allotetraploid goosefoot complex. The genome sequence facilitated the identification of the transcription factor likely to control the production of anti-nutritional triterpenoid saponins found in quinoa seeds, including a mutation that appears to cause alternative splicing and a premature stop codon in sweet quinoa strains. These genomic resources are an important first step towards the genetic improvement of quinoa.

  3. The genome of Chenopodium quinoa.

    Science.gov (United States)

    Jarvis, David E; Ho, Yung Shwen; Lightfoot, Damien J; Schmöckel, Sandra M; Li, Bo; Borm, Theo J A; Ohyanagi, Hajime; Mineta, Katsuhiko; Michell, Craig T; Saber, Noha; Kharbatia, Najeh M; Rupper, Ryan R; Sharp, Aaron R; Dally, Nadine; Boughton, Berin A; Woo, Yong H; Gao, Ge; Schijlen, Elio G W M; Guo, Xiujie; Momin, Afaque A; Negrão, Sónia; Al-Babili, Salim; Gehring, Christoph; Roessner, Ute; Jung, Christian; Murphy, Kevin; Arold, Stefan T; Gojobori, Takashi; Linden, C Gerard van der; van Loo, Eibertus N; Jellen, Eric N; Maughan, Peter J; Tester, Mark

    2017-02-16

    Chenopodium quinoa (quinoa) is a highly nutritious grain identified as an important crop to improve world food security. Unfortunately, few resources are available to facilitate its genetic improvement. Here we report the assembly of a high-quality, chromosome-scale reference genome sequence for quinoa, which was produced using single-molecule real-time sequencing in combination with optical, chromosome-contact and genetic maps. We also report the sequencing of two diploids from the ancestral gene pools of quinoa, which enables the identification of sub-genomes in quinoa, and reduced-coverage genome sequences for 22 other samples of the allotetraploid goosefoot complex. The genome sequence facilitated the identification of the transcription factor likely to control the production of anti-nutritional triterpenoid saponins found in quinoa seeds, including a mutation that appears to cause alternative splicing and a premature stop codon in sweet quinoa strains. These genomic resources are an important first step towards the genetic improvement of quinoa.

  4. Genomics of Arctic cod

    Science.gov (United States)

    Wilson, Robert E.; Sage, George K.; Sonsthagen, Sarah A.; Gravley, Megan C.; Menning, Damian; Talbot, Sandra L.

    2017-01-01

    The Arctic cod (Boreogadus saida) is an abundant marine fish that plays a vital role in the marine food web. To better understand the population genetic structure and the role of natural selection acting on the maternally-inherited mitochondrial genome (mitogenome), a molecule often associated with adaptations to temperature, we analyzed genetic data collected from 11 biparentally-inherited nuclear microsatellite DNA loci and nucleotide sequence data from from the mitochondrial DNA (mtDNA) cytochrome b (cytb) gene and, for a subset of individuals, the entire mitogenome. In addition, due to potential of species misidentification with morphologically similar Polar cod (Arctogadus glacialis), we used ddRAD-Seq data to determine the level of divergence between species and identify species-specific markers. Based on the findings presented here, Arctic cod across the Pacific Arctic (Bering, Chukchi, and Beaufort Seas) comprise a single panmictic population with high genetic diversity compared to other gadids. High genetic diversity was indicated across all 13 protein-coding genes in the mitogenome. In addition, we found moderate levels of genetic diversity in the nuclear microsatellite loci, with highest diversity found in the Chukchi Sea. Our analyses of markers from both marker classes (nuclear microsatellite fragment data and mtDNA cytb sequence data) failed to uncover a signal of microgeographic genetic structure within Arctic cod across the three regions, within the Alaskan Beaufort Sea, or between near-shore or offshore habitats. Further, data from a subset of mitogenomes revealed no genetic differentiation between Bering, Chukchi, and Beaufort seas populations for Arctic cod, Saffron cod (Eleginus gracilis), or Walleye pollock (Gadus chalcogrammus). However, we uncovered significant differences in the distribution of microsatellite alleles between the southern Chukchi and central and eastern Beaufort Sea samples of Arctic cod. Finally, using ddRAD-Seq data, we

  5. The Pediatric Cancer Genome Project

    Science.gov (United States)

    Downing, James R; Wilson, Richard K; Zhang, Jinghui; Mardis, Elaine R; Pui, Ching-Hon; Ding, Li; Ley, Timothy J; Evans, William E

    2013-01-01

    The St. Jude Children’s Research Hospital–Washington University Pediatric Cancer Genome Project (PCGP) is participating in the international effort to identify somatic mutations that drive cancer. These cancer genome sequencing efforts will not only yield an unparalleled view of the altered signaling pathways in cancer but should also identify new targets against which novel therapeutics can be developed. Although these projects are still deep in the phase of generating primary DNA sequence data, important results are emerging and valuable community resources are being generated that should catalyze future cancer research. We describe here the rationale for conducting the PCGP, present some of the early results of this project and discuss the major lessons learned and how these will affect the application of genomic sequencing in the clinic. PMID:22641210

  6. Integrating genomics into evolutionary medicine.

    Science.gov (United States)

    Rodríguez, Juan Antonio; Marigorta, Urko M; Navarro, Arcadi

    2014-12-01

    The application of the principles of evolutionary biology into medicine was suggested long ago and is already providing insight into the ultimate causes of disease. However, a full systematic integration of medical genomics and evolutionary medicine is still missing. Here, we briefly review some cases where the combination of the two fields has proven profitable and highlight two of the main issues hindering the development of evolutionary genomic medicine as a mature field, namely the dissociation between fitness and health and the still considerable difficulties in predicting phenotypes from genotypes. We use publicly available data to illustrate both problems and conclude that new approaches are needed for evolutionary genomic medicine to overcome these obstacles. Copyright © 2014 Elsevier Ltd. All rights reserved.

  7. Enhancer Identification through Comparative Genomics

    Energy Technology Data Exchange (ETDEWEB)

    Visel, Axel; Bristow, James; Pennacchio, Len A.

    2006-10-01

    With the availability of genomic sequence from numerousvertebrates, a paradigm shift has occurred in the identification ofdistant-acting gene regulatory elements. In contrast to traditionalgene-centric studies in which investigators randomly scanned genomicfragments that flank genes of interest in functional assays, the modernapproach begins electronically with publicly available comparativesequence datasets that provide investigators with prioritized lists ofputative functional sequences based on their evolutionary conservation.However, although a large number of tools and resources are nowavailable, application of comparative genomic approaches remains far fromtrivial. In particular, it requires users to dynamically consider thespecies and methods for comparison depending on the specific biologicalquestion under investigation. While there is currently no single generalrule to this end, it is clear that when applied appropriately,comparative genomic approaches exponentially increase our power ingenerating biological hypotheses for subsequent experimentaltesting.

  8. Genomics of Escherichia and Shigella

    Science.gov (United States)

    Perna, Nicole T.

    The laboratory workhorse Escherichia coli K-12 is among the most intensively studied living organisms on earth, and this single strain serves as the model system behind much of our understanding of prokaryotic molecular biology. Dense genome sequencing and recent insightful comparative analyses are making the species E. coli, as a whole, an emerging system for studying prokaryotic population genetics and the relationship between system-scale, or genome-scale, molecular evolution and complex traits like host range and pathogenic potential. Genomic perspective has revealed a coherent but dynamic species united by intraspecific gene flow via homologous lateral or horizontal transfer and differentiated by content flux mediated by acquisition of DNA segments from interspecies transfers.

  9. Genomic composition factors affect codon usage in porcine genome ...

    African Journals Online (AJOL)

    ... be explored for designing degenerate primers, necessitate selecting appropriate hosts expression systems to manipulate the expression of target genes in vivo or in vitro and improve the accuracy of gene prediction from genomic sequences thus maximizing the effectiveness of genetic manipulations in synthetic biology.

  10. Multiplexed precision genome editing with trackable genomic barcodes in yeast.

    Science.gov (United States)

    Roy, Kevin R; Smith, Justin D; Vonesch, Sibylle C; Lin, Gen; Tu, Chelsea Szu; Lederer, Alex R; Chu, Angela; Suresh, Sundari; Nguyen, Michelle; Horecka, Joe; Tripathi, Ashutosh; Burnett, Wallace T; Morgan, Maddison A; Schulz, Julia; Orsley, Kevin M; Wei, Wu; Aiyar, Raeka S; Davis, Ronald W; Bankaitis, Vytas A; Haber, James E; Salit, Marc L; St Onge, Robert P; Steinmetz, Lars M

    2018-07-01

    Our understanding of how genotype controls phenotype is limited by the scale at which we can precisely alter the genome and assess the phenotypic consequences of each perturbation. Here we describe a CRISPR-Cas9-based method for multiplexed accurate genome editing with short, trackable, integrated cellular barcodes (MAGESTIC) in Saccharomyces cerevisiae. MAGESTIC uses array-synthesized guide-donor oligos for plasmid-based high-throughput editing and features genomic barcode integration to prevent plasmid barcode loss and to enable robust phenotyping. We demonstrate that editing efficiency can be increased more than fivefold by recruiting donor DNA to the site of breaks using the LexA-Fkh1p fusion protein. We performed saturation editing of the essential gene SEC14 and identified amino acids critical for chemical inhibition of lipid signaling. We also constructed thousands of natural genetic variants, characterized guide mismatch tolerance at the genome scale, and ascertained that cryptic Pol III termination elements substantially reduce guide efficacy. MAGESTIC will be broadly useful to uncover the genetic basis of phenotypes in yeast.

  11. Genomic composition factors affect codon usage in porcine genome

    African Journals Online (AJOL)

    j.khobondo

    2015-01-28

    Jan 28, 2015 ... The mutational bias hypothesis predicted that genes in the GC-rich regions of the genome ... observed codon divided by its expected frequency at equilibrium. An RSCU value close to 1 indicates lack of bias, ..... study our results points to preferred usage of both C or G and A or T at the synonyms sites as ...

  12. The Phaeodactylum genome reveals the evolutionary history of diatom genomes

    Czech Academy of Sciences Publication Activity Database

    Bowler, Ch.; Allen, A. E.; Badger, J. H.; Grimwood, J.; Jabbari, K.; Kuo, A.; Maheswari, U.; Martens, C.; Maumus, F.; Otillar, R. P.; Rayko, E.; Salamov, A.; Vandepoele, K.; Beszteri, B.; Gruber, A.; Heijde, M.; Katinka, M.; Mock, T.; Valentin, K.; Verret, F.; Berges, J. A.; Brownlee, C.; Cadoret, J.-P.; Chiovitti, A.; Choi, Ch. J.; Coesel, S.; De Martino, A.; Detter, J. Ch.; Durkin, C.; Falciatore, A.; Fournet, J.; Haruta, M.; Huysman, M. J. J.; Jenkins, B. D.; Jiroutová, Kateřina; Jorgensen, R. E.; Joubert, Y.; Kaplan, A.; Kröger, N.; Kroth, P. G.; La Roche, J.; Lindquist, E.; Lommer, M.; Martin–Jézéquel, V.; Lopez, P. J.; Lucas, S.; Mangogna, M.; McGinnis, K.; Medlin, L. K.; Montsant, A.; Oudot–Le Secq, M.-P.; Napoli, C.; Oborník, Miroslav; Schnitzler Parker, M.; Petit, J.-L.; Porcel, B. M.; Poulsen, N.; Robison, M.; Rychlewski, L.; Rynearson, T. A.; Schmutz, J.; Shapiro, H.; Siaut, M.; Stanley, M.; Sussman, M. R.; Taylor, A. R.; Vardi, A.; von Dassow, P.; Vyverman, W.; Willis, A.; Wyrwicz, L. S.; Rokhsar, D. S.; Weissenbach, J.; Armbrust, E. V.; Green, B. R.; Van de Peer, Y.; Grigoriev, I. V.

    2008-01-01

    Roč. 456, 13-11-2008 (2008), s. 239-244 ISSN 0028-0836 Institutional research plan: CEZ:AV0Z60220518 Keywords : Phaeodactylum * genome * evolution * diatom Subject RIV: EB - Genetics ; Molecular Biology Impact factor: 31.434, year: 2008

  13. Genome Size Dynamics and Evolution in Monocots

    Directory of Open Access Journals (Sweden)

    Ilia J. Leitch

    2010-01-01

    Full Text Available Monocot genomic diversity includes striking variation at many levels. This paper compares various genomic characters (e.g., range of chromosome numbers and ploidy levels, occurrence of endopolyploidy, GC content, chromosome packaging and organization, genome size between monocots and the remaining angiosperms to discern just how distinctive monocot genomes are. One of the most notable features of monocots is their wide range and diversity of genome sizes, including the species with the largest genome so far reported in plants. This genomic character is analysed in greater detail, within a phylogenetic context. By surveying available genome size and chromosome data it is apparent that different monocot orders follow distinctive modes of genome size and chromosome evolution. Further insights into genome size-evolution and dynamics were obtained using statistical modelling approaches to reconstruct the ancestral genome size at key nodes across the monocot phylogenetic tree. Such approaches reveal that while the ancestral genome size of all monocots was small (1C=1.9 pg, there have been several major increases and decreases during monocot evolution. In addition, notable increases in the rates of genome size-evolution were found in Asparagales and Poales compared with other monocot lineages.

  14. The Perennial Ryegrass GenomeZipper: Targeted Use of Genome Resources for Comparative Grass Genomics1[C][W

    Science.gov (United States)

    Pfeifer, Matthias; Martis, Mihaela; Asp, Torben; Mayer, Klaus F.X.; Lübberstedt, Thomas; Byrne, Stephen; Frei, Ursula; Studer, Bruno

    2013-01-01

    Whole-genome sequences established for model and major crop species constitute a key resource for advanced genomic research. For outbreeding forage and turf grass species like ryegrasses (Lolium spp.), such resources have yet to be developed. Here, we present a model of the perennial ryegrass (Lolium perenne) genome on the basis of conserved synteny to barley (Hordeum vulgare) and the model grass genome Brachypodium (Brachypodium distachyon) as well as rice (Oryza sativa) and sorghum (Sorghum bicolor). A transcriptome-based genetic linkage map of perennial ryegrass served as a scaffold to establish the chromosomal arrangement of syntenic genes from model grass species. This scaffold revealed a high degree of synteny and macrocollinearity and was then utilized to anchor a collection of perennial ryegrass genes in silico to their predicted genome positions. This resulted in the unambiguous assignment of 3,315 out of 8,876 previously unmapped genes to the respective chromosomes. In total, the GenomeZipper incorporates 4,035 conserved grass gene loci, which were used for the first genome-wide sequence divergence analysis between perennial ryegrass, barley, Brachypodium, rice, and sorghum. The perennial ryegrass GenomeZipper is an ordered, information-rich genome scaffold, facilitating map-based cloning and genome assembly in perennial ryegrass and closely related Poaceae species. It also represents a milestone in describing synteny between perennial ryegrass and fully sequenced model grass genomes, thereby increasing our understanding of genome organization and evolution in the most important temperate forage and turf grass species. PMID:23184232

  15. Implementing genomics and pharmacogenomics in the clinic: The National Human Genome Research Institute’s genomic medicine portfolio

    Science.gov (United States)

    Manolio, Teri A.

    2016-01-01

    Increasing knowledge about the influence of genetic variation on human health and growing availability of reliable, cost-effective genetic testing have spurred the implementation of genomic medicine in the clinic. As defined by the National Human Genome Research Institute (NHGRI), genomic medicine uses an individual’s genetic information in his or her clinical care, and has begun to be applied effectively in areas such as cancer genomics, pharmacogenomics, and rare and undiagnosed diseases. In 2011 NHGRI published its strategic vision for the future of genomic research, including an ambitious research agenda to facilitate and promote the implementation of genomic medicine. To realize this agenda, NHGRI is consulting and facilitating collaborations with the external research community through a series of “Genomic Medicine Meetings,” under the guidance and leadership of the National Advisory Council on Human Genome Research. These meetings have identified and begun to address significant obstacles to implementation, such as lack of evidence of efficacy, limited availability of genomics expertise and testing, lack of standards, and diffficulties in integrating genomic results into electronic medical records. The six research and dissemination initiatives comprising NHGRI’s genomic research portfolio are designed to speed the evaluation and incorporation, where appropriate, of genomic technologies and findings into routine clinical care. Actual adoption of successful approaches in clinical care will depend upon the willingness, interest, and energy of professional societies, practitioners, patients, and payers to promote their responsible use and share their experiences in doing so. PMID:27612677

  16. The Chlamydomonas genome project: a decade on

    Science.gov (United States)

    Blaby, Ian K.; Blaby-Haas, Crysten; Tourasse, Nicolas; Hom, Erik F. Y.; Lopez, David; Aksoy, Munevver; Grossman, Arthur; Umen, James; Dutcher, Susan; Porter, Mary; King, Stephen; Witman, George; Stanke, Mario; Harris, Elizabeth H.; Goodstein, David; Grimwood, Jane; Schmutz, Jeremy; Vallon, Olivier; Merchant, Sabeeha S.; Prochnik, Simon

    2014-01-01

    The green alga Chlamydomonas reinhardtii is a popular unicellular organism for studying photosynthesis, cilia biogenesis and micronutrient homeostasis. Ten years since its genome project was initiated, an iterative process of improvements to the genome and gene predictions has propelled this organism to the forefront of the “omics” era. Housed at Phytozome, the Joint Genome Institute’s (JGI) plant genomics portal, the most up-to-date genomic data include a genome arranged on chromosomes and high-quality gene models with alternative splice forms supported by an abundance of RNA-Seq data. Here, we present the past, present and future of Chlamydomonas genomics. Specifically, we detail progress on genome assembly and gene model refinement, discuss resources for gene annotations, functional predictions and locus ID mapping between versions and, importantly, outline a standardized framework for naming genes. PMID:24950814

  17. Big Data Analysis of Human Genome Variations

    KAUST Repository

    Gojobori, Takashi

    2016-01-01

    Since the human genome draft sequence was in public for the first time in 2000, genomic analyses have been intensively extended to the population level. The following three international projects are good examples for large-scale studies of human

  18. V-GAP: Viral genome assembly pipeline

    KAUST Repository

    Nakamura, Yoji

    2015-10-22

    Next-generation sequencing technologies have allowed the rapid determination of the complete genomes of many organisms. Although shotgun sequences from large genome organisms are still difficult to reconstruct perfect contigs each of which represents a full chromosome, those from small genomes have been assembled successfully into a very small number of contigs. In this study, we show that shotgun reads from phage genomes can be reconstructed into a single contig by controlling the number of read sequences used in de novo assembly. We have developed a pipeline to assemble small viral genomes with good reliability using a resampling method from shotgun data. This pipeline, named V-GAP (Viral Genome Assembly Pipeline), will contribute to the rapid genome typing of viruses, which are highly divergent, and thus will meet the increasing need for viral genome comparisons in metagenomic studies.

  19. All about the Human Genome Project (HGP)

    Science.gov (United States)

    ... Care Genomic Medicine Working Group New Horizons and Research Patient Management Policy and Ethics Issues Quick Links for Patient Care Education All About the Human Genome Project Fact Sheets Genetic Education Resources for ...

  20. Hapsembler: An Assembler for Highly Polymorphic Genomes

    Science.gov (United States)

    Donmez, Nilgun; Brudno, Michael

    As whole genome sequencing has become a routine biological experiment, algorithms for assembly of whole genome shotgun data has become a topic of extensive research, with a plethora of off-the-shelf methods that can reconstruct the genomes of many organisms. Simultaneously, several recently sequenced genomes exhibit very high polymorphism rates. For these organisms genome assembly remains a challenge as most assemblers are unable to handle highly divergent haplotypes in a single individual. In this paper we describe Hapsembler, an assembler for highly polymorphic genomes, which makes use of paired reads. Our experiments show that Hapsembler produces accurate and contiguous assemblies of highly polymorphic genomes, while performing on par with the leading tools on haploid genomes. Hapsembler is available for download at http://compbio.cs.toronto.edu/hapsembler.

  1. Collaborative Genomics Study Advances Precision Oncology

    Science.gov (United States)

    A collaborative study conducted by two Office of Cancer Genomics (OCG) initiatives highlights the importance of integrating structural and functional genomics programs to improve cancer therapies, and more specifically, contribute to precision oncology treatments for children.

  2. Unsupervised statistical identification of genomic islands using ...

    Indian Academy of Sciences (India)

    Vibrio species. These investigations lead to observations that are of evolutionary ... Identification of genomic islands in prokaryotic genomes has received considerable attention in the literature due to .... For instance, selective pres- sures as a ...

  3. V-GAP: Viral genome assembly pipeline

    KAUST Repository

    Nakamura, Yoji; Yasuike, Motoshige; Nishiki, Issei; Iwasaki, Yuki; Fujiwara, Atushi; Kawato, Yasuhiko; Nakai, Toshihiro; Nagai, Satoshi; Kobayashi, Takanori; Gojobori, Takashi; Ototake, Mitsuru

    2015-01-01

    Next-generation sequencing technologies have allowed the rapid determination of the complete genomes of many organisms. Although shotgun sequences from large genome organisms are still difficult to reconstruct perfect contigs each of which represents a full chromosome, those from small genomes have been assembled successfully into a very small number of contigs. In this study, we show that shotgun reads from phage genomes can be reconstructed into a single contig by controlling the number of read sequences used in de novo assembly. We have developed a pipeline to assemble small viral genomes with good reliability using a resampling method from shotgun data. This pipeline, named V-GAP (Viral Genome Assembly Pipeline), will contribute to the rapid genome typing of viruses, which are highly divergent, and thus will meet the increasing need for viral genome comparisons in metagenomic studies.

  4. The 1000 bull genome project

    Science.gov (United States)

    To meet growing global demands for high value protein from milk and meat, rates of genetic gain in domestic cattle must be accelerated. At the same time, animal health and welfare must be considered. The 1000 bull genomes project supports these goals by providing annotated sequence variants and ge...

  5. Computational genomics of specialized metabolism

    NARCIS (Netherlands)

    Medema, Marnix H.

    2018-01-01

    Microbial and plant specialized metabolites, also known as natural products, are key mediators of microbe-microbe and host-microbe interactions and constitute a rich resource for drug development. In the past decade, genome mining has emerged as a prominent strategy for natural product discovery.

  6. Genomic Heritability: What Is It?

    DEFF Research Database (Denmark)

    de los Campos, Gustavo; Sorensen, Daniel; Gianola, Daniel

    2015-01-01

    Whole-genome regression methods are being increasingly used for the analysis and prediction of complex traits and diseases. In human genetics, these methods are commonly used for inferences about genetic parameters, such as the amount of genetic variance among individuals or the proportion of phe...

  7. Targeted sequencing of plant genomes

    Science.gov (United States)

    Mark D. Huynh

    2014-01-01

    Next-generation sequencing (NGS) has revolutionized the field of genetics by providing a means for fast and relatively affordable sequencing. With the advancement of NGS, wholegenome sequencing (WGS) has become more commonplace. However, sequencing an entire genome is still not cost effective or even beneficial in all cases. In studies that do not require a whole-...

  8. Fungal genome resources at NCBI

    Science.gov (United States)

    Robbertse, B.; Tatusova, T.

    2011-01-01

    The National Center for Biotechnology Information (NCBI) is well known for the nucleotide sequence archive, GenBank and sequence analysis tool BLAST. However, NCBI integrates many types of biomolecular data from variety of sources and makes it available to the scientific community as interactive web resources as well as organized releases of bulk data. These tools are available to explore and compare fungal genomes. Searching all databases with Fungi [organism] at http://www.ncbi.nlm.nih.gov/ is the quickest way to find resources of interest with fungal entries. Some tools though are resources specific and can be indirectly accessed from a particular database in the Entrez system. These include graphical viewers and comparative analysis tools such as TaxPlot, TaxMap and UniGene DDD (found via UniGene Homepage). Gene and BioProject pages also serve as portals to external data such as community annotation websites, BioGrid and UniProt. There are many different ways of accessing genomic data at NCBI. Depending on the focus and goal of research projects or the level of interest, a user would select a particular route for accessing genomic databases and resources. This review article describes methods of accessing fungal genome data and provides examples that illustrate the use of analysis tools. PMID:22737589

  9. Genomic methods take the plunge

    DEFF Research Database (Denmark)

    Cammen, Kristina M.; Andrews, Kimberly R.; Carroll, Emma L.

    2016-01-01

    The dramatic increase in the application of genomic techniques to non-model organisms (NMOs) over the past decade has yielded numerous valuable contributions to evolutionary biology and ecology, many of which would not have been possible with traditional genetic markers. We review this recent...

  10. Ecological Genomics of Marine Picocyanobacteria†

    Science.gov (United States)

    Scanlan, D. J.; Ostrowski, M.; Mazard, S.; Dufresne, A.; Garczarek, L.; Hess, W. R.; Post, A. F.; Hagemann, M.; Paulsen, I.; Partensky, F.

    2009-01-01

    Summary: Marine picocyanobacteria of the genera Prochlorococcus and Synechococcus numerically dominate the picophytoplankton of the world ocean, making a key contribution to global primary production. Prochlorococcus was isolated around 20 years ago and is probably the most abundant photosynthetic organism on Earth. The genus comprises specific ecotypes which are phylogenetically distinct and differ markedly in their photophysiology, allowing growth over a broad range of light and nutrient conditions within the 45°N to 40°S latitudinal belt that they occupy. Synechococcus and Prochlorococcus are closely related, together forming a discrete picophytoplankton clade, but are distinguishable by their possession of dissimilar light-harvesting apparatuses and differences in cell size and elemental composition. Synechococcus strains have a ubiquitous oceanic distribution compared to that of Prochlorococcus strains and are characterized by phylogenetically discrete lineages with a wide range of pigmentation. In this review, we put our current knowledge of marine picocyanobacterial genomics into an environmental context and present previously unpublished genomic information arising from extensive genomic comparisons in order to provide insights into the adaptations of these marine microbes to their environment and how they are reflected at the genomic level. PMID:19487728

  11. Genomic applications in forensic medicine

    DEFF Research Database (Denmark)

    Børsting, Claus; Morling, Niels

    2016-01-01

    Since the 1980s, advances in DNA technology have revolutionized the scope and practice of forensic medicine. From the days of restriction fragment length polymorphisms (RFLPs) to short tandem repeats (STRs), the current focus is on the next generation genome sequencing. It has been almost a decad...

  12. Human Genome Research: Decoding DNA

    Science.gov (United States)

    dropdown arrow Site Map A-Z Index Menu Synopsis Human Genome Research: Decoding DNA Resources with of the DNA double helix during April 2003. James D. Watson, Francis Crick, and Maurice Wilkins were company Celera announced the completion of a "working draft" reference DNA sequence of the human

  13. The genome of Chenopodium quinoa

    NARCIS (Netherlands)

    Jarvis, D.E.; Shwen Ho, Yung; Lightfoot, Damien J.; Schmöckel, Sandra M.; Li, Bo; Borm, T.J.A.; Ohyanagi, Hajime; Mineta, Katsuhiko; Mitchell, Craig T.; Saber, Noha; Kharbatia, Najeh M.; Rupper, Ryan R.; Sharp, Aaron R.; Dally, Nadine; Boughton, Berin A.; Woo, Yong H.; Gao, Ge; Schijlen, E.G.W.M.; Guo, Xiujie; Momin, Afaque A.; Negräo, Sónia; Al-Babili, Salim; Gehring, Christoph; Roessner, Ute; Jung, Christian; Murphy, Kevin; Arold, Stefan T.; Gojobori, Takashi; Linden, van der C.G.; Loo, van E.N.; Jellen, Eric N.; Maughan, Peter J.; Tester, Mark

    2017-01-01

    Chenopodium quinoa (quinoa) is a highly nutritious grain identified as an important crop to improve world food security. Unfortunately, few resources are available to facilitate its genetic improvement. Here we report the assembly of a high-quality, chromosome-scale reference genome sequence for

  14. Genome position and gene amplification

    Czech Academy of Sciences Publication Activity Database

    Jirsová, Pavla; Snijders, A.M.; Kwek, S.; Roydasgupta, R.; Fridlyand, J.; Tokuyasu, T.; Pinkel, D.; Albertson, D. G.

    2007-01-01

    Roč. 8, č. 6 (2007), r120 ISSN 1474-760X Institutional research plan: CEZ:AV0Z50040507; CEZ:AV0Z50040702 Keywords : gene amplification * array comparative genomic hybridization * oncogene Subject RIV: BO - Biophysics Impact factor: 6.589, year: 2007

  15. India, Genomic diversity & Disease susceptibility

    Indian Academy of Sciences (India)

    Table of contents. India, Genomic diversity & Disease susceptibility · India, a paradise for Genetic Studies · Involved in earlier stages of Immune response protecting us from Diseases, Responsible for kidney and other transplant rejections Inherited from our parents · PowerPoint Presentation · Slide 5 · Slide 6 · Slide 7.

  16. GENOMIC FEATURES OF COTESIA PLUTELLAE POLYDNAVIRUS

    Institute of Scientific and Technical Information of China (English)

    LIUCai-ling; ZHUXiang-xiong; FuWen-jun; ZHAOMu-jun

    2003-01-01

    Polydnavirus was purified from the calyx fluid of Cotesia plutellae ovary. The genomic features of C. plutellae polydnavirus (CpPDV) were investigated. The viral genome consists of at least 12 different segments and the aggregate genome size is a lower estimate of 80kbp. By partial digestion of CpPDV DNA with BamHI and subsequent ligation with BamHI-cut plasmid Bluescript, a representative library of CpPDV genome was obtained.

  17. Reconciling Utility with Privacy in Genomics

    OpenAIRE

    Humbert, Mathias; Ayday, Erman; Hubaux, Jean-Pierre; Telenti, Amalio

    2014-01-01

    Direct-to-consumer genetic testing makes it possible for everyone to learn their genome sequences. In order to contribute to medical research, a growing number of people publish their genomic data on the Web, sometimes under their real identities. However, this is at odds not only with their own privacy but also with the privacy of their relatives. The genomes of relatives being highly correlated, some family members might be opposed to revealing any of the family's genomic data. In this pape...

  18. Genome chaos: survival strategy during crisis.

    Science.gov (United States)

    Liu, Guo; Stevens, Joshua B; Horne, Steven D; Abdallah, Batoul Y; Ye, Karen J; Bremer, Steven W; Ye, Christine J; Chen, David J; Heng, Henry H

    2014-01-01

    Genome chaos, a process of complex, rapid genome re-organization, results in the formation of chaotic genomes, which is followed by the potential to establish stable genomes. It was initially detected through cytogenetic analyses, and recently confirmed by whole-genome sequencing efforts which identified multiple subtypes including "chromothripsis", "chromoplexy", "chromoanasynthesis", and "chromoanagenesis". Although genome chaos occurs commonly in tumors, both the mechanism and detailed aspects of the process are unknown due to the inability of observing its evolution over time in clinical samples. Here, an experimental system to monitor the evolutionary process of genome chaos was developed to elucidate its mechanisms. Genome chaos occurs following exposure to chemotherapeutics with different mechanisms, which act collectively as stressors. Characterization of the karyotype and its dynamic changes prior to, during, and after induction of genome chaos demonstrates that chromosome fragmentation (C-Frag) occurs just prior to chaotic genome formation. Chaotic genomes seem to form by random rejoining of chromosomal fragments, in part through non-homologous end joining (NHEJ). Stress induced genome chaos results in increased karyotypic heterogeneity. Such increased evolutionary potential is demonstrated by the identification of increased transcriptome dynamics associated with high levels of karyotypic variance. In contrast to impacting on a limited number of cancer genes, re-organized genomes lead to new system dynamics essential for cancer evolution. Genome chaos acts as a mechanism of rapid, adaptive, genome-based evolution that plays an essential role in promoting rapid macroevolution of new genome-defined systems during crisis, which may explain some unwanted consequences of cancer treatment.

  19. Genomics-assisted breeding in fruit trees

    OpenAIRE

    Iwata, Hiroyoshi; Minamikawa, Mai F.; Kajiya-Kanegae, Hiromi; Ishimori, Motoyuki; Hayashi, Takeshi

    2016-01-01

    Recent advancements in genomic analysis technologies have opened up new avenues to promote the efficiency of plant breeding. Novel genomics-based approaches for plant breeding and genetics research, such as genome-wide association studies (GWAS) and genomic selection (GS), are useful, especially in fruit tree breeding. The breeding of fruit trees is hindered by their long generation time, large plant size, long juvenile phase, and the necessity to wait for the physiological maturity of the pl...

  20. insights from the genome of Melitaea cinxia

    OpenAIRE

    Ahola, Virpi; Wahlberg, Niklas; Frilander, Mikko J.

    2017-01-01

    The first lepidopteran genome (Bombyx mori) was published in 2004. Ten years later the genome of Melitaea cinxia came out as the third butterfly genome published, and the first eukaryotic genome sequenced in Finland. Owing to Ilkka Hanski, the M. cinxia system in the angstrom land Islands has become a famous model for metapopulation biology. More than 20 years of research on this system provides a strong ecological basis upon which a genetic framework could be built. Genetic knowledge is an e...

  1. GRAbB : Selective Assembly of Genomic Regions, a New Niche for Genomic Research

    NARCIS (Netherlands)

    Brankovics, Balázs; Zhang, Hao; van Diepeningen, Anne D; van der Lee, Theo A J; Waalwijk, Cees; de Hoog, G Sybren

    GRAbB (Genomic Region Assembly by Baiting) is a new program that is dedicated to assemble specific genomic regions from NGS data. This approach is especially useful when dealing with multi copy regions, such as mitochondrial genome and the rDNA repeat region, parts of the genome that are often

  2. Clusters of orthologous genes for 41 archaeal genomes and implications for evolutionary genomics of archaea

    OpenAIRE

    Wolf Yuri I; Novichkov Pavel S; Sorokin Alexander V; Makarova Kira S; Koonin Eugene V

    2007-01-01

    Abstract Background An evolutionary classification of genes from sequenced genomes that distinguishes between orthologs and paralogs is indispensable for genome annotation and evolutionary reconstruction. Shortly after multiple genome sequences of bacteria, archaea, and unicellular eukaryotes became available, an attempt on such a classification was implemented in Clusters of Orthologous Groups of proteins (COGs). Rapid accumulation of genome sequences creates opportunities for refining COGs ...

  3. Accounting for discovery bias in genomic EPD

    Science.gov (United States)

    Genomics has contributed substantially to genetic improvement of beef cattle. The implementation is through computation of genomically enhanced expected progeny differences (GE-EPD), which are predictions of genetic merit of individual animals based on genomic information, pedigree, and data on the ...

  4. The UCSC Genome Browser Database: update 2006

    DEFF Research Database (Denmark)

    Hinrichs, A S; Karolchik, D; Baertsch, R

    2006-01-01

    The University of California Santa Cruz Genome Browser Database (GBD) contains sequence and annotation data for the genomes of about a dozen vertebrate species and several major model organisms. Genome annotations typically include assembly data, sequence composition, genes and gene predictions, ...

  5. The UCSC genome browser database: update 2007

    DEFF Research Database (Denmark)

    Kuhn, R M; Karolchik, D; Zweig, A S

    2006-01-01

    The University of California, Santa Cruz Genome Browser Database contains, as of September 2006, sequence and annotation data for the genomes of 13 vertebrate and 19 invertebrate species. The Genome Browser displays a wide variety of annotations at all scales from the single nucleotide level up t...

  6. Impact of genomics on microbial food safety

    NARCIS (Netherlands)

    Abee, T.; Schaik, van W.; Siezen, R.J.

    2004-01-01

    Genome sequences are now available for many of the microbes that cause food-borne diseases. The information contained in pathogen genome sequences, together with the development of themed and whole-genome DNA microarrays and improved proteomics techniques, might provide tools for the rapid detection

  7. Genomic individuality and its biological implications.

    Science.gov (United States)

    Zhao, J

    1996-06-01

    It is a widely accepted fundamental concept that all somatic genomes of a human individual are identical to each other. The theoretical basis of this concept is that all of these somatic genomes are the descendants of the genome of a single fertilized cell as well as the simple replicated products of asexual reproduction, thus not forming any new recombined genomes. The question here is whether such a concept might only represent one side of somatic genome biology and, even worse, whether it has perhaps already led to a very prevalent misconception that within the organism body, there exists no variability among individual somatic genomes. A hypothesis, called genomic individuality, is proposed, simply saying that every individual somatic genome, perhaps with rare exceptions, has its own unique or individual 'genetic identity' or 'fingerprint', which is characterized by its distinctive sequences or patterns of deoxyribonucleic acid molecules, or both. Thus, no two somatic genomes can be identical to each other in every or all aspects, and consequently, there must be a great deal of genomic variation present within the body of any multicellular organism. The concept or hypothesis of genomic individuality would not only provide a more complete understanding of genome biology, but also suggest a new insight into the studies of the biology of cells and organisms.

  8. Leaner and meaner genomes in Escherichia coli

    DEFF Research Database (Denmark)

    Ussery, David

    2006-01-01

    A 'better' Escherichia coli K-12 genome has recently been engineered in which about 15% of the genome has been removed by planned deletions. Comparison with related bacterial genomes that have undergone a natural reduction in size suggests that there is plenty of scope for yet more deletions....

  9. Comparative Genomics of Carp Herpesviruses

    Science.gov (United States)

    Kurobe, Tomofumi; Gatherer, Derek; Cunningham, Charles; Korf, Ian; Fukuda, Hideo; Hedrick, Ronald P.; Waltzek, Thomas B.

    2013-01-01

    Three alloherpesviruses are known to cause disease in cyprinid fish: cyprinid herpesviruses 1 and 3 (CyHV1 and CyHV3) in common carp and koi and cyprinid herpesvirus 2 (CyHV2) in goldfish. We have determined the genome sequences of CyHV1 and CyHV2 and compared them with the published CyHV3 sequence. The CyHV1 and CyHV2 genomes are 291,144 and 290,304 bp, respectively, in size, and thus the CyHV3 genome, at 295,146 bp, remains the largest recorded among the herpesviruses. Each of the three genomes consists of a unique region flanked at each terminus by a sizeable direct repeat. The CyHV1, CyHV2, and CyHV3 genomes are predicted to contain 137, 150, and 155 unique, functional protein-coding genes, respectively, of which six, four, and eight, respectively, are duplicated in the terminal repeat. The three viruses share 120 orthologous genes in a largely colinear arrangement, of which up to 55 are also conserved in the other member of the genus Cyprinivirus, anguillid herpesvirus 1. Twelve genes are conserved convincingly in all sequenced alloherpesviruses, and two others are conserved marginally. The reference CyHV3 strain has been reported to contain five fragmented genes that are presumably nonfunctional. The CyHV2 strain has two fragmented genes, and the CyHV1 strain has none. CyHV1, CyHV2, and CyHV3 have five, six, and five families of paralogous genes, respectively. One family unique to CyHV1 is related to cellular JUNB, which encodes a transcription factor involved in oncogenesis. To our knowledge, this is the first time that JUNB-related sequences have been reported in a herpesvirus. PMID:23269803

  10. Genome Variation Map: a data repository of genome variations in BIG Data Center

    OpenAIRE

    Song, Shuhui; Tian, Dongmei; Li, Cuiping; Tang, Bixia; Dong, Lili; Xiao, Jingfa; Bao, Yiming; Zhao, Wenming; He, Hang; Zhang, Zhang

    2017-01-01

    Abstract The Genome Variation Map (GVM; http://bigd.big.ac.cn/gvm/) is a public data repository of genome variations. As a core resource in the BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, GVM dedicates to collect, integrate and visualize genome variations for a wide range of species, accepts submissions of different types of genome variations from all over the world and provides free open access to all publicly available data in support of worldwide research a...

  11. Discovering regulatory motifs in the Plasmodium genome using comparative genomics

    OpenAIRE

    Wu, Jie; Sieglaff, Douglas H.; Gervin, Joshua; Xie, Xiaohui S.

    2008-01-01

    Motivation: Understanding gene regulation in Plasmodium, the causative agent of malaria, is an important step in deciphering its complex life cycle as well as leading to possible new targets for therapeutic applications. Very little is known about gene regulation in Plasmodium, and in particular, few regulatory elements have been identified. Such discovery has been significantly hampered by the high A-T content of some of the genomes of Plasmodium species, as well as the challenge in associat...

  12. Prospects for Genomic Research in Forestry

    Directory of Open Access Journals (Sweden)

    K. V. Krutovsky

    2014-08-01

    Full Text Available Conifers are keystone species of boreal forests. Their whole genome sequencing, assembly and annotation will allow us to understand the evolution of the complex ancient giant conifer genomes that are 4 times larger in larch and 7–9 times larger in pines than the human genome. Genomic studies will allow also to obtain important whole genome sequence data and develop highly polymorphic and informative genetic markers, such as microsatellites and single nucleotide polymorphisms (SNPs that can be efficiently used in timber origin identification, for genetic variation monitoring, to study local and climate change adaptation and in tree improvement and conservation programs.

  13. Value of a newly sequenced bacterial genome

    DEFF Research Database (Denmark)

    Barbosa, Eudes; Aburjaile, Flavia F; Ramos, Rommel Tj

    2014-01-01

    and annotation will not be undertaken. It is important to know what is lost when we settle for a draft genome and to determine the "scientific value" of a newly sequenced genome. This review addresses the expected impact of newly sequenced genomes on antibacterial discovery and vaccinology. Also, it discusses...... heightened expectations that NGS would boost antibacterial discovery and vaccine development. Although many possible drug and vaccine targets have been discovered, the success rate of genome-based analysis has remained below expectations. Furthermore, NGS has had consequences for genome quality, resulting...

  14. Software for computing and annotating genomic ranges.

    Directory of Open Access Journals (Sweden)

    Michael Lawrence

    Full Text Available We describe Bioconductor infrastructure for representing and computing on annotated genomic ranges and integrating genomic data with the statistical computing features of R and its extensions. At the core of the infrastructure are three packages: IRanges, GenomicRanges, and GenomicFeatures. These packages provide scalable data structures for representing annotated ranges on the genome, with special support for transcript structures, read alignments and coverage vectors. Computational facilities include efficient algorithms for overlap and nearest neighbor detection, coverage calculation and other range operations. This infrastructure directly supports more than 80 other Bioconductor packages, including those for sequence analysis, differential expression analysis and visualization.

  15. Resolution effects in reconstructing ancestral genomes.

    Science.gov (United States)

    Zheng, Chunfang; Jeong, Yuji; Turcotte, Madisyn Gabrielle; Sankoff, David

    2018-05-09

    The reconstruction of ancestral genomes must deal with the problem of resolution, necessarily involving a trade-off between trying to identify genomic details and being overwhelmed by noise at higher resolutions. We use the median reconstruction at the synteny block level, of the ancestral genome of the order Gentianales, based on coffee, Rhazya stricta and grape, to exemplify the effects of resolution (granularity) on comparative genomic analyses. We show how decreased resolution blurs the differences between evolving genomes, with respect to rate, mutational process and other characteristics.

  16. The ecoresponsive genome of Daphnia pulex

    Energy Technology Data Exchange (ETDEWEB)

    Colbourne, John K.; Pfrender, Michael E.; Gilbert, Donald; Thomas, W. Kelley; Tucker, Abraham; Oakley, Todd H.; Tokishita, Shinichi; Aerts, Andrea; Arnold, Georg J.; Basu, Malay Kumar; Bauer, Darren J.; Caceres, Carla E.; Carmel, Liran; Casola, Claudio; Choi, Jeong-Hyeon; Detter, John C.; Dong, Qunfeng; Dusheyko, Serge; Eads, Brian D.; Frohlich, Thomas; Geiler-Samerotte, Kerry A.; Gerlach, Daniel; Hatcher, Phil; Jogdeo, Sanjuro; Krijgsveld, Jeroen; Kriventseva, Evgenia V; Kültz, Dietmar; Laforsch, Christian; Lindquist, Erika; Lopez, Jacqueline; Manak, Robert; Muller, Jean; Pangilinan, Jasmyn; Patwardhan, Rupali P.; Pitluck, Samuel; Pritham, Ellen J.; Rechtsteiner, Andreas; Rho, Mina; Rogozin, Igor B.; Sakarya, Onur; Salamov, Asaf; Schaack, Sarah; Shapiro, Harris; Shiga, Yasuhiro; Skalitzky, Courtney; Smith, Zachary; Souvorov, Alexander; Sung, Way; Tang, Zuojian; Tsuchiya, Dai; Tu, Hank; Vos, Harmjan; Wang, Mei; Wolf, Yuri I.; Yamagata, Hideo; Yamada, Takuji; Ye, Yuzhen; Shaw, Joseph R.; Andrews, Justen; Crease, Teresa J.; Tang, Haixu; Lucas, Susan M.; Robertson, Hugh M.; Bork, Peer; Koonin, Eugene V.; Zdobnov, Evgeny M.; Grigoriev, Igor V.; Lynch, Michael; Boore, Jeffrey L.

    2011-02-04

    This document provides supporting material related to the sequencing of the ecoresponsive genome of Daphnia pulex. This material includes information on materials and methods and supporting text, as well as supplemental figures, tables, and references. The coverage of materials and methods addresses genome sequence, assembly, and mapping to chromosomes, gene inventory, attributes of a compact genome, the origin and preservation of Daphnia pulex genes, implications of Daphnia's genome structure, evolutionary diversification of duplicated genes, functional significance of expanded gene families, and ecoresponsive genes. Supporting text covers chromosome studies, gene homology among Daphnia genomes, micro-RNA and transposable elements and the 46 Daphnia pulex opsins. 36 figures, 50 tables, 183 references.

  17. Software for computing and annotating genomic ranges.

    Science.gov (United States)

    Lawrence, Michael; Huber, Wolfgang; Pagès, Hervé; Aboyoun, Patrick; Carlson, Marc; Gentleman, Robert; Morgan, Martin T; Carey, Vincent J

    2013-01-01

    We describe Bioconductor infrastructure for representing and computing on annotated genomic ranges and integrating genomic data with the statistical computing features of R and its extensions. At the core of the infrastructure are three packages: IRanges, GenomicRanges, and GenomicFeatures. These packages provide scalable data structures for representing annotated ranges on the genome, with special support for transcript structures, read alignments and coverage vectors. Computational facilities include efficient algorithms for overlap and nearest neighbor detection, coverage calculation and other range operations. This infrastructure directly supports more than 80 other Bioconductor packages, including those for sequence analysis, differential expression analysis and visualization.

  18. Genomic alterations detected by comparative genomic hybridization in ovarian endometriomas

    Directory of Open Access Journals (Sweden)

    L.C. Veiga-Castelli

    2010-08-01

    Full Text Available Endometriosis is a complex and multifactorial disease. Chromosomal imbalance screening in endometriotic tissue can be used to detect hot-spot regions in the search for a possible genetic marker for endometriosis. The objective of the present study was to detect chromosomal imbalances by comparative genomic hybridization (CGH in ectopic tissue samples from ovarian endometriomas and eutopic tissue from the same patients. We evaluated 10 ovarian endometriotic tissues and 10 eutopic endometrial tissues by metaphase CGH. CGH was prepared with normal and test DNA enzymatically digested, ligated to adaptors and amplified by PCR. A second PCR was performed for DNA labeling. Equal amounts of both normal and test-labeled DNA were hybridized in human normal metaphases. The Isis FISH Imaging System V 5.0 software was used for chromosome analysis. In both eutopic and ectopic groups, 4/10 samples presented chromosomal alterations, mainly chromosomal gains. CGH identified 11q12.3-q13.1, 17p11.1-p12, 17q25.3-qter, and 19p as critical regions. Genomic imbalances in 11q, 17p, 17q, and 19p were detected in normal eutopic and/or ectopic endometrium from women with ovarian endometriosis. These regions contain genes such as POLR2G, MXRA7 and UBA52 involved in biological processes that may lead to the establishment and maintenance of endometriotic implants. This genomic imbalance may affect genes in which dysregulation impacts both eutopic and ectopic endometrium.

  19. Fungal Genomics for Energy and Environment

    Energy Technology Data Exchange (ETDEWEB)

    Grigoriev, Igor V.

    2013-03-11

    Genomes of fungi relevant to energy and environment are in focus of the Fungal Genomic Program at the US Department of Energy Joint Genome Institute (JGI). One of its projects, the Genomics Encyclopedia of Fungi, targets fungi related to plant health (symbionts, pathogens, and biocontrol agents) and biorefinery processes (cellulose degradation, sugar fermentation, industrial hosts) by means of genome sequencing and analysis. New chapters of the Encyclopedia can be opened with user proposals to the JGI Community Sequencing Program (CSP). Another JGI project, the 1000 fungal genomes, explores fungal diversity on genome level at scale and is open for users to nominate new species for sequencing. Over 200 fungal genomes have been sequenced by JGI to date and released through MycoCosm (www.jgi.doe.gov/fungi), a fungal web-portal, which integrates sequence and functional data with genome analysis tools for user community. Sequence analysis supported by functional genomics leads to developing parts list for complex systems ranging from ecosystems of biofuel crops to biorefineries. Recent examples of such parts suggested by comparative genomics and functional analysis in these areas are presented here.

  20. Insights into Conifer Giga-Genomes1

    Science.gov (United States)

    De La Torre, Amanda R.; Birol, Inanc; Bousquet, Jean; Ingvarsson, Pär K.; Jansson, Stefan; Jones, Steven J.M.; Keeling, Christopher I.; MacKay, John; Nilsson, Ove; Ritland, Kermit; Street, Nathaniel; Yanchuk, Alvin; Zerbe, Philipp; Bohlmann, Jörg

    2014-01-01

    Insights from sequenced genomes of major land plant lineages have advanced research in almost every aspect of plant biology. Until recently, however, assembled genome sequences of gymnosperms have been missing from this picture. Conifers of the pine family (Pinaceae) are a group of gymnosperms that dominate large parts of the world’s forests. Despite their ecological and economic importance, conifers seemed long out of reach for complete genome sequencing, due in part to their enormous genome size (20–30 Gb) and the highly repetitive nature of their genomes. Technological advances in genome sequencing and assembly enabled the recent publication of three conifer genomes: white spruce (Picea glauca), Norway spruce (Picea abies), and loblolly pine (Pinus taeda). These genome sequences revealed distinctive features compared with other plant genomes and may represent a window into the past of seed plant genomes. This Update highlights recent advances, remaining challenges, and opportunities in light of the publication of the first conifer and gymnosperm genomes. PMID:25349325

  1. Genome Improvement at JGI-HAGSC

    Energy Technology Data Exchange (ETDEWEB)

    Grimwood, Jane; Schmutz, Jeremy J.; Myers, Richard M.

    2012-03-03

    Since the completion of the sequencing of the human genome, the Joint Genome Institute (JGI) has rapidly expanded its scientific goals in several DOE mission-relevant areas. At the JGI-HAGSC, we have kept pace with this rapid expansion of projects with our focus on assessing, assembling, improving and finishing eukaryotic whole genome shotgun (WGS) projects for which the shotgun sequence is generated at the Production Genomic Facility (JGI-PGF). We follow this by combining the draft WGS with genomic resources generated at JGI-HAGSC or in collaborator laboratories (including BAC end sequences, genetic maps and FLcDNA sequences) to produce an improved draft sequence. For eukaryotic genomes important to the DOE mission, we then add further information from directed experiments to produce reference genomic sequences that are publicly available for any scientific researcher. Also, we have continued our program for producing BAC-based finished sequence, both for adding information to JGI genome projects and for small BAC-based sequencing projects proposed through any of the JGI sequencing programs. We have now built our computational expertise in WGS assembly and analysis and have moved eukaryotic genome assembly from the JGI-PGF to JGI-HAGSC. We have concentrated our assembly development work on large plant genomes and complex fungal and algal genomes.

  2. Genomic Diversity and Evolution of the Lyssaviruses

    Science.gov (United States)

    Delmas, Olivier; Holmes, Edward C.; Talbi, Chiraz; Larrous, Florence; Dacheux, Laurent; Bouchier, Christiane; Bourhy, Hervé

    2008-01-01

    Lyssaviruses are RNA viruses with single-strand, negative-sense genomes responsible for rabies-like diseases in mammals. To date, genomic and evolutionary studies have most often utilized partial genome sequences, particularly of the nucleoprotein and glycoprotein genes, with little consideration of genome-scale evolution. Herein, we report the first genomic and evolutionary analysis using complete genome sequences of all recognised lyssavirus genotypes, including 14 new complete genomes of field isolates from 6 genotypes and one genotype that is completely sequenced for the first time. In doing so we significantly increase the extent of genome sequence data available for these important viruses. Our analysis of these genome sequence data reveals that all lyssaviruses have the same genomic organization. A phylogenetic analysis reveals strong geographical structuring, with the greatest genetic diversity in Africa, and an independent origin for the two known genotypes that infect European bats. We also suggest that multiple genotypes may exist within the diversity of viruses currently classified as ‘Lagos Bat’. In sum, we show that rigorous phylogenetic techniques based on full length genome sequence provide the best discriminatory power for genotype classification within the lyssaviruses. PMID:18446239

  3. Genomic diversity and evolution of the lyssaviruses.

    Directory of Open Access Journals (Sweden)

    Olivier Delmas

    2008-04-01

    Full Text Available Lyssaviruses are RNA viruses with single-strand, negative-sense genomes responsible for rabies-like diseases in mammals. To date, genomic and evolutionary studies have most often utilized partial genome sequences, particularly of the nucleoprotein and glycoprotein genes, with little consideration of genome-scale evolution. Herein, we report the first genomic and evolutionary analysis using complete genome sequences of all recognised lyssavirus genotypes, including 14 new complete genomes of field isolates from 6 genotypes and one genotype that is completely sequenced for the first time. In doing so we significantly increase the extent of genome sequence data available for these important viruses. Our analysis of these genome sequence data reveals that all lyssaviruses have the same genomic organization. A phylogenetic analysis reveals strong geographical structuring, with the greatest genetic diversity in Africa, and an independent origin for the two known genotypes that infect European bats. We also suggest that multiple genotypes may exist within the diversity of viruses currently classified as 'Lagos Bat'. In sum, we show that rigorous phylogenetic techniques based on full length genome sequence provide the best discriminatory power for genotype classification within the lyssaviruses.

  4. Snake Genome Sequencing: Results and Future Prospects.

    Science.gov (United States)

    Kerkkamp, Harald M I; Kini, R Manjunatha; Pospelov, Alexey S; Vonk, Freek J; Henkel, Christiaan V; Richardson, Michael K

    2016-12-01

    Snake genome sequencing is in its infancy-very much behind the progress made in sequencing the genomes of humans, model organisms and pathogens relevant to biomedical research, and agricultural species. We provide here an overview of some of the snake genome projects in progress, and discuss the biological findings, with special emphasis on toxinology, from the small number of draft snake genomes already published. We discuss the future of snake genomics, pointing out that new sequencing technologies will help overcome the problem of repetitive sequences in assembling snake genomes. Genome sequences are also likely to be valuable in examining the clustering of toxin genes on the chromosomes, in designing recombinant antivenoms and in studying the epigenetic regulation of toxin gene expression.

  5. Insights from Human/Mouse genome comparisons

    Energy Technology Data Exchange (ETDEWEB)

    Pennacchio, Len A.

    2003-03-30

    Large-scale public genomic sequencing efforts have provided a wealth of vertebrate sequence data poised to provide insights into mammalian biology. These include deep genomic sequence coverage of human, mouse, rat, zebrafish, and two pufferfish (Fugu rubripes and Tetraodon nigroviridis) (Aparicio et al. 2002; Lander et al. 2001; Venter et al. 2001; Waterston et al. 2002). In addition, a high-priority has been placed on determining the genomic sequence of chimpanzee, dog, cow, frog, and chicken (Boguski 2002). While only recently available, whole genome sequence data have provided the unique opportunity to globally compare complete genome contents. Furthermore, the shared evolutionary ancestry of vertebrate species has allowed the development of comparative genomic approaches to identify ancient conserved sequences with functionality. Accordingly, this review focuses on the initial comparison of available mammalian genomes and describes various insights derived from such analysis.

  6. Sequencing intractable DNA to close microbial genomes.

    Directory of Open Access Journals (Sweden)

    Richard A Hurt

    Full Text Available Advancement in high throughput DNA sequencing technologies has supported a rapid proliferation of microbial genome sequencing projects, providing the genetic blueprint for in-depth studies. Oftentimes, difficult to sequence regions in microbial genomes are ruled "intractable" resulting in a growing number of genomes with sequence gaps deposited in databases. A procedure was developed to sequence such problematic regions in the "non-contiguous finished" Desulfovibrio desulfuricans ND132 genome (6 intractable gaps and the Desulfovibrio africanus genome (1 intractable gap. The polynucleotides surrounding each gap formed GC rich secondary structures making the regions refractory to amplification and sequencing. Strand-displacing DNA polymerases used in concert with a novel ramped PCR extension cycle supported amplification and closure of all gap regions in both genomes. The developed procedures support accurate gene annotation, and provide a step-wise method that reduces the effort required for genome finishing.

  7. Snake Genome Sequencing: Results and Future Prospects

    Directory of Open Access Journals (Sweden)

    Harald M. I. Kerkkamp

    2016-12-01

    Full Text Available Snake genome sequencing is in its infancy—very much behind the progress made in sequencing the genomes of humans, model organisms and pathogens relevant to biomedical research, and agricultural species. We provide here an overview of some of the snake genome projects in progress, and discuss the biological findings, with special emphasis on toxinology, from the small number of draft snake genomes already published. We discuss the future of snake genomics, pointing out that new sequencing technologies will help overcome the problem of repetitive sequences in assembling snake genomes. Genome sequences are also likely to be valuable in examining the clustering of toxin genes on the chromosomes, in designing recombinant antivenoms and in studying the epigenetic regulation of toxin gene expression.

  8. Sequencing Intractable DNA to Close Microbial Genomes

    Energy Technology Data Exchange (ETDEWEB)

    Hurt, Jr., Richard Ashley [ORNL; Brown, Steven D [ORNL; Podar, Mircea [ORNL; Palumbo, Anthony Vito [ORNL; Elias, Dwayne A [ORNL

    2012-01-01

    Advancement in high throughput DNA sequencing technologies has supported a rapid proliferation of microbial genome sequencing projects, providing the genetic blueprint for for in-depth studies. Oftentimes, difficult to sequence regions in microbial genomes are ruled intractable resulting in a growing number of genomes with sequence gaps deposited in databases. A procedure was developed to sequence such difficult regions in the non-contiguous finished Desulfovibrio desulfuricans ND132 genome (6 intractable gaps) and the Desulfovibrio africanus genome (1 intractable gap). The polynucleotides surrounding each gap formed GC rich secondary structures making the regions refractory to amplification and sequencing. Strand-displacing DNA polymerases used in concert with a novel ramped PCR extension cycle supported amplification and closure of all gap regions in both genomes. These developed procedures support accurate gene annotation, and provide a step-wise method that reduces the effort required for genome finishing.

  9. Current development and application of soybean genomics

    Institute of Scientific and Technical Information of China (English)

    Lingli HE; Jing ZHAO; Man ZHAO; Chaoying HE

    2011-01-01

    Soybean (Glycine max),an important domesticated species originated in China,constitutes a major source of edible oils and high-quality plant proteins worldwide.In spite of its complex genome as a consequence of an ancient tetraploidilization,platforms for map-based genomics,sequence-based genomics,comparative genomics and functional genomics have been well developed in the last decade,thus rich repertoires of genomic tools and resources are available,which have been influencing the soybean genetic improvement.Here we mainly review the progresses of soybean (including its wild relative Glycine soja) genomics and its impetus for soybean breeding,and raise the major biological questions needing to be addressed.Genetic maps,physical maps,QTL and EST mapping have been so well achieved that the marker assisted selection and positional cloning in soybean is feasible and even routine.Whole genome sequencing and transcriptomic analyses provide a large collection of molecular markers and predicted genes,which are instrumental to comparative genomics and functional genomics.Comparative genomics has started to reveal the evolution of soybean genome and the molecular basis of soybean domestication process.Microarrays resources,mutagenesis and efficient transformation systems become essential components of soybean functional genomics.Furthermore,phenotypic functional genomics via both forward and reverse genetic approaches has inferred functions of many genes involved in plant and seed development,in response to abiotic stresses,functioning in plant-pathogenic microbe interactions,and controlling the oil and protein content of seed.These achievements have paved the way for generation of transgenic or genetically modified (GM) soybean crops.

  10. One bacterial cell, one complete genome.

    Directory of Open Access Journals (Sweden)

    Tanja Woyke

    2010-04-01

    Full Text Available While the bulk of the finished microbial genomes sequenced to date are derived from cultured bacterial and archaeal representatives, the vast majority of microorganisms elude current culturing attempts, severely limiting the ability to recover complete or even partial genomes from these environmental species. Single cell genomics is a novel culture-independent approach, which enables access to the genetic material of an individual cell. No single cell genome has to our knowledge been closed and finished to date. Here we report the completed genome from an uncultured single cell of Candidatus Sulcia muelleri DMIN. Digital PCR on single symbiont cells isolated from the bacteriome of the green sharpshooter Draeculacephala minerva bacteriome allowed us to assess that this bacteria is polyploid with genome copies ranging from approximately 200-900 per cell, making it a most suitable target for single cell finishing efforts. For single cell shotgun sequencing, an individual Sulcia cell was isolated and whole genome amplified by multiple displacement amplification (MDA. Sanger-based finishing methods allowed us to close the genome. To verify the correctness of our single cell genome and exclude MDA-derived artifacts, we independently shotgun sequenced and assembled the Sulcia genome from pooled bacteriomes using a metagenomic approach, yielding a nearly identical genome. Four variations we detected appear to be genuine biological differences between the two samples. Comparison of the single cell genome with bacteriome metagenomic sequence data detected two single nucleotide polymorphisms (SNPs, indicating extremely low genetic diversity within a Sulcia population. This study demonstrates the power of single cell genomics to generate a complete, high quality, non-composite reference genome within an environmental sample, which can be used for population genetic analyzes.

  11. One Bacterial Cell, One Complete Genome

    Energy Technology Data Exchange (ETDEWEB)

    Woyke, Tanja; Tighe, Damon; Mavrommatis, Konstantinos; Clum, Alicia; Copeland, Alex; Schackwitz, Wendy; Lapidus, Alla; Wu, Dongying; McCutcheon, John P.; McDonald, Bradon R.; Moran, Nancy A.; Bristow, James; Cheng, Jan-Fang

    2010-04-26

    While the bulk of the finished microbial genomes sequenced to date are derived from cultured bacterial and archaeal representatives, the vast majority of microorganisms elude current culturing attempts, severely limiting the ability to recover complete or even partial genomes from these environmental species. Single cell genomics is a novel culture-independent approach, which enables access to the genetic material of an individual cell. No single cell genome has to our knowledge been closed and finished to date. Here we report the completed genome from an uncultured single cell of Candidatus Sulcia muelleri DMIN. Digital PCR on single symbiont cells isolated from the bacteriome of the green sharpshooter Draeculacephala minerva bacteriome allowed us to assess that this bacteria is polyploid with genome copies ranging from approximately 200?900 per cell, making it a most suitable target for single cell finishing efforts. For single cell shotgun sequencing, an individual Sulcia cell was isolated and whole genome amplified by multiple displacement amplification (MDA). Sanger-based finishing methods allowed us to close the genome. To verify the correctness of our single cell genome and exclude MDA-derived artifacts, we independently shotgun sequenced and assembled the Sulcia genome from pooled bacteriomes using a metagenomic approach, yielding a nearly identical genome. Four variations we detected appear to be genuine biological differences between the two samples. Comparison of the single cell genome with bacteriome metagenomic sequence data detected two single nucleotide polymorphisms (SNPs), indicating extremely low genetic diversity within a Sulcia population. This study demonstrates the power of single cell genomics to generate a complete, high quality, non-composite reference genome within an environmental sample, which can be used for population genetic analyzes.

  12. 10. international mouse genome conference

    Energy Technology Data Exchange (ETDEWEB)

    Meisler, M.H.

    1996-12-31

    Ten years after hosting the First International Mammalian Genome Conference in Paris in 1986, Dr. Jean-Louis Guenet presided over the Tenth Conference at the Pasteur Institute, October 7--10, 1996. The 1986 conference was a satellite to the Human Gene Mapping Workshop and had approximately 50 attendees. The 1996 meeting was attended by 300 scientists from around the world. In the interim, the number of mapped loci in the mouse increased from 1,000 to over 20,000. This report contains a listing of the program and its participants, and two articles that review the meeting and the role of the laboratory mouse in the Human Genome project. More than 200 papers were presented at the conference covering the following topics: International mouse chromosome committee meetings; Mutant generation and identification; Physical and genetic maps; New technology and resources; Chromatin structure and gene regulation; Rate and hamster genetic maps; Informatics and databases; and Quantitative trait analysis.

  13. Genome Organization Drives Chromosome Fragility.

    Science.gov (United States)

    Canela, Andres; Maman, Yaakov; Jung, Seolkyoung; Wong, Nancy; Callen, Elsa; Day, Amanda; Kieffer-Kwon, Kyong-Rim; Pekowska, Aleksandra; Zhang, Hongliang; Rao, Suhas S P; Huang, Su-Chen; Mckinnon, Peter J; Aplan, Peter D; Pommier, Yves; Aiden, Erez Lieberman; Casellas, Rafael; Nussenzweig, André

    2017-07-27

    In this study, we show that evolutionarily conserved chromosome loop anchors bound by CCCTC-binding factor (CTCF) and cohesin are vulnerable to DNA double strand breaks (DSBs) mediated by topoisomerase 2B (TOP2B). Polymorphisms in the genome that redistribute CTCF/cohesin occupancy rewire DNA cleavage sites to novel loop anchors. While transcription- and replication-coupled genomic rearrangements have been well documented, we demonstrate that DSBs formed at loop anchors are largely transcription-, replication-, and cell-type-independent. DSBs are continuously formed throughout interphase, are enriched on both sides of strong topological domain borders, and frequently occur at breakpoint clusters commonly translocated in cancer. Thus, loop anchors serve as fragile sites that generate DSBs and chromosomal rearrangements. VIDEO ABSTRACT. Published by Elsevier Inc.

  14. Mathematical Analysis of Genomic Evolution

    Directory of Open Access Journals (Sweden)

    Cedric Green

    2011-01-01

    Full Text Available Changes in nucleotide sequences, or mutations, accumulate from generation to generation in the genomes of all living organisms. The mutations can be advantageous, deleterious, or neutral. The goal of this project is to determine the amount of advantageous mutations it takes to get human (Homo sapiens DNA from the DNA of genetically distinct organisms. We do this by collecting the genomic data of such organisms, and estimating the amount of mutations it takes to transform yeast (Saccharomyces cerevisiae DNA to the DNA of a human. We calculate the typical number of mutations occurring annually through the organism's average life span and the average mutation rate. This allows us to determine the total number of mutations as well as the probability of advantageous mutations. Not surprisingly, this probability proves to be fairly small. A more precise estimate can be determined by accounting for the differences in the chromosomal structure and phenomena like horizontal gene transfer.

  15. A Million Cancer Genome Warehouse

    Science.gov (United States)

    2012-11-20

    of a national program for Cancer Information Donors, the American Society for Clinical Oncology (ASCO) has proposed a rapid learning system for...or Scala and Spark; “scrum” organization of small programming teams; calculating “velocity” to predict time to develop new features; and Agile...2012 to 00-00-2012 4. TITLE AND SUBTITLE A Million Cancer Genome Warehouse 5a. CONTRACT NUMBER 5b. GRANT NUMBER 5c. PROGRAM ELEMENT NUMBER 6

  16. Microbial genomes: Blueprints for life

    Energy Technology Data Exchange (ETDEWEB)

    Relman, David A.; Strauss, Evelyn

    2000-12-31

    Complete microbial genome sequences hold the promise of profound new insights into microbial pathogenesis, evolution, diagnostics, and therapeutics. From these insights will come a new foundation for understanding the evolution of single-celled life, as well as the evolution of more complex life forms. This report is an in-depth analysis of scientific issues that provides recommendations and will be widely disseminated to the scientific community, federal agencies, industry and the public.

  17. Genomic Signatures of Sexual Conflict.

    Science.gov (United States)

    Kasimatis, Katja R; Nelson, Thomas C; Phillips, Patrick C

    2017-10-30

    Sexual conflict is a specific class of intergenomic conflict that describes the reciprocal sex-specific fitness costs generated by antagonistic reproductive interactions. The potential for sexual conflict is an inherent property of having a shared genome between the sexes and, therefore, is an extreme form of an environment-dependent fitness effect. In this way, many of the predictions from environment-dependent selection can be used to formulate expected patterns of genome evolution under sexual conflict. However, the pleiotropic and transmission constraints inherent to having alleles move across sex-specific backgrounds from generation to generation further modulate the anticipated signatures of selection. We outline methods for detecting candidate sexual conflict loci both across and within populations. Additionally, we consider the ability of genome scans to identify sexually antagonistic loci by modeling allele frequency changes within males and females due to a single generation of selection. In particular, we highlight the need to integrate genotype, phenotype, and functional information to truly distinguish sexual conflict from other forms of sexual differentiation. © The American Genetic Association 2017. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  18. Multiple models for Rosaceae genomics.

    Science.gov (United States)

    Shulaev, Vladimir; Korban, Schuyler S; Sosinski, Bryon; Abbott, Albert G; Aldwinckle, Herb S; Folta, Kevin M; Iezzoni, Amy; Main, Dorrie; Arús, Pere; Dandekar, Abhaya M; Lewers, Kim; Brown, Susan K; Davis, Thomas M; Gardiner, Susan E; Potter, Daniel; Veilleux, Richard E

    2008-07-01

    The plant family Rosaceae consists of over 100 genera and 3,000 species that include many important fruit, nut, ornamental, and wood crops. Members of this family provide high-value nutritional foods and contribute desirable aesthetic and industrial products. Most rosaceous crops have been enhanced by human intervention through sexual hybridization, asexual propagation, and genetic improvement since ancient times, 4,000 to 5,000 B.C. Modern breeding programs have contributed to the selection and release of numerous cultivars having significant economic impact on the U.S. and world markets. In recent years, the Rosaceae community, both in the United States and internationally, has benefited from newfound organization and collaboration that have hastened progress in developing genetic and genomic resources for representative crops such as apple (Malus spp.), peach (Prunus spp.), and strawberry (Fragaria spp.). These resources, including expressed sequence tags, bacterial artificial chromosome libraries, physical and genetic maps, and molecular markers, combined with genetic transformation protocols and bioinformatics tools, have rendered various rosaceous crops highly amenable to comparative and functional genomics studies. This report serves as a synopsis of the resources and initiatives of the Rosaceae community, recent developments in Rosaceae genomics, and plans to apply newly accumulated knowledge and resources toward breeding and crop improvement.

  19. Group normalization for genomic data.

    Science.gov (United States)

    Ghandi, Mahmoud; Beer, Michael A

    2012-01-01

    Data normalization is a crucial preliminary step in analyzing genomic datasets. The goal of normalization is to remove global variation to make readings across different experiments comparable. In addition, most genomic loci have non-uniform sensitivity to any given assay because of variation in local sequence properties. In microarray experiments, this non-uniform sensitivity is due to different DNA hybridization and cross-hybridization efficiencies, known as the probe effect. In this paper we introduce a new scheme, called Group Normalization (GN), to remove both global and local biases in one integrated step, whereby we determine the normalized probe signal by finding a set of reference probes with similar responses. Compared to conventional normalization methods such as Quantile normalization and physically motivated probe effect models, our proposed method is general in the sense that it does not require the assumption that the underlying signal distribution be identical for the treatment and control, and is flexible enough to correct for nonlinear and higher order probe effects. The Group Normalization algorithm is computationally efficient and easy to implement. We also describe a variant of the Group Normalization algorithm, called Cross Normalization, which efficiently amplifies biologically relevant differences between any two genomic datasets.

  20. Group normalization for genomic data.

    Directory of Open Access Journals (Sweden)

    Mahmoud Ghandi

    Full Text Available Data normalization is a crucial preliminary step in analyzing genomic datasets. The goal of normalization is to remove global variation to make readings across different experiments comparable. In addition, most genomic loci have non-uniform sensitivity to any given assay because of variation in local sequence properties. In microarray experiments, this non-uniform sensitivity is due to different DNA hybridization and cross-hybridization efficiencies, known as the probe effect. In this paper we introduce a new scheme, called Group Normalization (GN, to remove both global and local biases in one integrated step, whereby we determine the normalized probe signal by finding a set of reference probes with similar responses. Compared to conventional normalization methods such as Quantile normalization and physically motivated probe effect models, our proposed method is general in the sense that it does not require the assumption that the underlying signal distribution be identical for the treatment and control, and is flexible enough to correct for nonlinear and higher order probe effects. The Group Normalization algorithm is computationally efficient and easy to implement. We also describe a variant of the Group Normalization algorithm, called Cross Normalization, which efficiently amplifies biologically relevant differences between any two genomic datasets.

  1. Parallel processing of genomics data

    Science.gov (United States)

    Agapito, Giuseppe; Guzzi, Pietro Hiram; Cannataro, Mario

    2016-10-01

    The availability of high-throughput experimental platforms for the analysis of biological samples, such as mass spectrometry, microarrays and Next Generation Sequencing, have made possible to analyze a whole genome in a single experiment. Such platforms produce an enormous volume of data per single experiment, thus the analysis of this enormous flow of data poses several challenges in term of data storage, preprocessing, and analysis. To face those issues, efficient, possibly parallel, bioinformatics software needs to be used to preprocess and analyze data, for instance to highlight genetic variation associated with complex diseases. In this paper we present a parallel algorithm for the parallel preprocessing and statistical analysis of genomics data, able to face high dimension of data and resulting in good response time. The proposed system is able to find statistically significant biological markers able to discriminate classes of patients that respond to drugs in different ways. Experiments performed on real and synthetic genomic datasets show good speed-up and scalability.

  2. The wolf reference genome sequence (Canis lupus lupus) and its implications for Canis spp. population genomics

    DEFF Research Database (Denmark)

    Gopalakrishnan, Shyam; Samaniego Castruita, Jose Alfredo; Sinding, Mikkel Holger Strander

    2017-01-01

    Background An increasing number of studies are addressing the evolutionary genomics of dog domestication, principally through resequencing dog, wolf and related canid genomes. There is, however, only one de novo assembled canid genome currently available against which to map such data - that of a......Background An increasing number of studies are addressing the evolutionary genomics of dog domestication, principally through resequencing dog, wolf and related canid genomes. There is, however, only one de novo assembled canid genome currently available against which to map such data...... that regardless of the reference genome choice, most evolutionary genomic analyses yield qualitatively similar results, including those exploring the structure between the wolves and dogs using admixture and principal component analysis. However, we do observe differences in the genomic coverage of re-mapped...

  3. The Small Nuclear Genomes of Selaginella Are Associated with a Low Rate of Genome Size Evolution.

    Science.gov (United States)

    Baniaga, Anthony E; Arrigo, Nils; Barker, Michael S

    2016-06-03

    The haploid nuclear genome size (1C DNA) of vascular land plants varies over several orders of magnitude. Much of this observed diversity in genome size is due to the proliferation and deletion of transposable elements. To date, all vascular land plant lineages with extremely small nuclear genomes represent recently derived states, having ancestors with much larger genome sizes. The Selaginellaceae represent an ancient lineage with extremely small genomes. It is unclear how small nuclear genomes evolved in Selaginella We compared the rates of nuclear genome size evolution in Selaginella and major vascular plant clades in a comparative phylogenetic framework. For the analyses, we collected 29 new flow cytometry estimates of haploid genome size in Selaginella to augment publicly available data. Selaginella possess some of the smallest known haploid nuclear genome sizes, as well as the lowest rate of genome size evolution observed across all vascular land plants included in our analyses. Additionally, our analyses provide strong support for a history of haploid nuclear genome size stasis in Selaginella Our results indicate that Selaginella, similar to other early diverging lineages of vascular land plants, has relatively low rates of genome size evolution. Further, our analyses highlight that a rapid transition to a small genome size is only one route to an extremely small genome. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  4. St2-80: a new FISH marker for St genome and genome analysis in Triticeae.

    Science.gov (United States)

    Wang, Long; Shi, Qinghua; Su, Handong; Wang, Yi; Sha, Lina; Fan, Xing; Kang, Houyang; Zhang, Haiqin; Zhou, Yonghong

    2017-07-01

    The St genome is one of the most fundamental genomes in Triticeae. Repetitive sequences are widely used to distinguish different genomes or species. The primary objectives of this study were to (i) screen a new sequence that could easily distinguish the chromosome of the St genome from those of other genomes by fluorescence in situ hybridization (FISH) and (ii) investigate the genome constitution of some species that remain uncertain and controversial. We used degenerated oligonucleotide primer PCR (Dop-PCR), Dot-blot, and FISH to screen for a new marker of the St genome and to test the efficiency of this marker in the detection of the St chromosome at different ploidy levels. Signals produced by a new FISH marker (denoted St 2 -80) were present on the entire arm of chromosomes of the St genome, except in the centromeric region. On the contrary, St 2 -80 signals were present in the terminal region of chromosomes of the E, H, P, and Y genomes. No signal was detected in the A and B genomes, and only weak signals were detected in the terminal region of chromosomes of the D genome. St 2 -80 signals were obvious and stable in chromosomes of different genomes, whether diploid or polyploid. Therefore, St 2 -80 is a potential and useful FISH marker that can be used to distinguish the St genome from those of other genomes in Triticeae.

  5. Unleashing the genome of Brassica rapa

    Directory of Open Access Journals (Sweden)

    Haibao eTang

    2012-07-01

    Full Text Available The completion and release of the Brassica rapa genome is of great benefit to researchers of the Brassicas, Arabidopsis, and genome evolution. While its lineage is closely related to the model organism Arabidopsis thaliana, the Brassicas experienced a whole genome triplication subsequent to their divergence. This event contemporaneously created three copies of its ancestral genome, which had diploidized through the process of homeologous gene loss known as fractionation. By the fractionation of homeologous gene content and genetic regulatory binding sites, Brassica’s genome is well placed to use comparative genomic techniques to identify syntenic regions, homeologous gene duplications, and putative regulatory sequences. Here, we use the comparative genomics platform CoGe to perform several different genomic analyses with which to study structural changes of its genome and dynamics of various genetic elements. Starting with whole genome comparisons, the Brassica paleohexaploidy is characterized, syntenic regions with Arabidopsis thaliana are identified, and the TOC1 gene in the circadian rhythm pathway from Arabidopsis thaliana is used to find duplicated orthologs in Brassica rapa. These TOC1 genes are further analyzed to identify conserved noncoding sequences that contain cis-acting regulatory elements and promoter sequences previously implicated in circadian rhythmicity. Each 'cookbook style' analysis includes a step-by-step walkthrough with links to CoGe to quickly reproduce each step of the analytical process.

  6. Human Genome Sequencing in Health and Disease

    Science.gov (United States)

    Gonzaga-Jauregui, Claudia; Lupski, James R.; Gibbs, Richard A.

    2013-01-01

    Following the “finished,” euchromatic, haploid human reference genome sequence, the rapid development of novel, faster, and cheaper sequencing technologies is making possible the era of personalized human genomics. Personal diploid human genome sequences have been generated, and each has contributed to our better understanding of variation in the human genome. We have consequently begun to appreciate the vastness of individual genetic variation from single nucleotide to structural variants. Translation of genome-scale variation into medically useful information is, however, in its infancy. This review summarizes the initial steps undertaken in clinical implementation of personal genome information, and describes the application of whole-genome and exome sequencing to identify the cause of genetic diseases and to suggest adjuvant therapies. Better analysis tools and a deeper understanding of the biology of our genome are necessary in order to decipher, interpret, and optimize clinical utility of what the variation in the human genome can teach us. Personal genome sequencing may eventually become an instrument of common medical practice, providing information that assists in the formulation of a differential diagnosis. We outline herein some of the remaining challenges. PMID:22248320

  7. Comparative Reannotation of 21 Aspergillus Genomes

    Energy Technology Data Exchange (ETDEWEB)

    Salamov, Asaf; Riley, Robert; Kuo, Alan; Grigoriev, Igor

    2013-03-08

    We used comparative gene modeling to reannotate 21 Aspergillus genomes. Initial automatic annotation of individual genomes may contain some errors of different nature, e.g. missing genes, incorrect exon-intron structures, 'chimeras', which fuse 2 or more real genes or alternatively splitting some real genes into 2 or more models. The main premise behind the comparative modeling approach is that for closely related genomes most orthologous families have the same conserved gene structure. The algorithm maps all gene models predicted in each individual Aspergillus genome to the other genomes and, for each locus, selects from potentially many competing models, the one which most closely resembles the orthologous genes from other genomes. This procedure is iterated until no further change in gene models is observed. For Aspergillus genomes we predicted in total 4503 new gene models ( ~;;2percent per genome), supported by comparative analysis, additionally correcting ~;;18percent of old gene models. This resulted in a total of 4065 more genes with annotated PFAM domains (~;;3percent increase per genome). Analysis of a few genomes with EST/transcriptomics data shows that the new annotation sets also have a higher number of EST-supported splice sites at exon-intron boundaries.

  8. Fueling the Future with Fungal Genomes

    Energy Technology Data Exchange (ETDEWEB)

    Grigoriev, Igor V.

    2014-10-27

    Genomes of fungi relevant to energy and environment are in focus of the JGI Fungal Genomic Program. One of its projects, the Genomics Encyclopedia of Fungi, targets fungi related to plant health (symbionts and pathogens) and biorefinery processes (cellulose degradation and sugar fermentation) by means of genome sequencing and analysis. New chapters of the Encyclopedia can be opened with user proposals to the JGI Community Science Program (CSP). Another JGI project, the 1000 fungal genomes, explores fungal diversity on genome level at scale and is open for users to nominate new species for sequencing. Over 400 fungal genomes have been sequenced by JGI to date and released through MycoCosm (www.jgi.doe.gov/fungi), a fungal web-portal, which integrates sequence and functional data with genome analysis tools for user community. Sequence analysis supported by functional genomics will lead to developing parts list for complex systems ranging from ecosystems of biofuel crops to biorefineries. Recent examples of such ‘parts’ suggested by comparative genomics and functional analysis in these areas are presented here.

  9. The wolf reference genome sequence (Canis lupus lupus) and its implications for Canis spp. population genomics

    DEFF Research Database (Denmark)

    Gopalakrishnan, Shyam; Samaniego Castruita, Jose Alfredo; Sinding, Mikkel Holger Strander

    2017-01-01

    Background An increasing number of studies are addressing the evolutionary genomics of dog domestication, principally through resequencing dog, wolf and related canid genomes. There is, however, only one de novo assembled canid genome currently available against which to map such data - that of a......Background An increasing number of studies are addressing the evolutionary genomics of dog domestication, principally through resequencing dog, wolf and related canid genomes. There is, however, only one de novo assembled canid genome currently available against which to map such data...

  10. Genome plasticity and systems evolution in Streptomyces

    Science.gov (United States)

    2012-01-01

    Background Streptomycetes are filamentous soil-dwelling bacteria. They are best known as the producers of a great variety of natural products such as antibiotics, antifungals, antiparasitics, and anticancer agents and the decomposers of organic substances for carbon recycling. They are also model organisms for the studies of gene regulatory networks, morphological differentiation, and stress response. The availability of sets of genomes from closely related Streptomyces strains makes it possible to assess the mechanisms underlying genome plasticity and systems adaptation. Results We present the results of a comprehensive analysis of the genomes of five Streptomyces species with distinct phenotypes. These streptomycetes have a pan-genome comprised of 17,362 orthologous families which includes 3,096 components in the core genome, 5,066 components in the dispensable genome, and 9,200 components that are uniquely present in only one species. The core genome makes up about 33%-45% of each genome repertoire. It contains important genes for Streptomyces biology including those involved in gene regulation, secretion, secondary metabolism and morphological differentiation. Abundant duplicate genes have been identified, with 4%-11% of the whole genomes composed of lineage-specific expansions (LSEs), suggesting that frequent gene duplication or lateral gene transfer events play a role in shaping the genome diversification within this genus. Two patterns of expansion, single gene expansion and chromosome block expansion are observed, representing different scales of duplication. Conclusions Our results provide a catalog of genome components and their potential functional roles in gene regulatory networks and metabolic networks. The core genome components reveal the minimum requirement for streptomycetes to sustain a successful lifecycle in the soil environment, reflecting the effects of both genome evolution and environmental stress acting upon the expressed phenotypes. A

  11. Rice Genome Research: Current Status and Future Perspectives

    Directory of Open Access Journals (Sweden)

    Bin Han

    2008-11-01

    Full Text Available Rice ( L. is the leading genomics system among the crop plants. The sequence of the rice genome, the first cereal plant genome, was published in 2005. This review summarizes progress made in rice genome annotations, comparative genomics, and functional genomics researches. It also maps out the status of rice genomics globally and provides a vision of future research directions and resource building.

  12. Genome digging: insight into the mitochondrial genome of Homo.

    Directory of Open Access Journals (Sweden)

    Igor V Ovchinnikov

    2010-12-01

    Full Text Available A fraction of the Neanderthal mitochondrial genome sequence has a similarity with a 5,839-bp nuclear DNA sequence of mitochondrial origin (numt on the human chromosome 1. This fact has never been interpreted. Although this phenomenon may be attributed to contamination and mosaic assembly of Neanderthal mtDNA from short sequencing reads, we explain the mysterious similarity by integration of this numt (mtAncestor-1 into the nuclear genome of the common ancestor of Neanderthals and modern humans not long before their reproductive split.Exploiting bioinformatics, we uncovered an additional numt (mtAncestor-2 with a high similarity to the Neanderthal mtDNA and indicated that both numts represent almost identical replicas of the mtDNA sequences ancestral to the mitochondrial genomes of Neanderthals and modern humans. In the proteins, encoded by mtDNA, the majority of amino acids distinguishing chimpanzees from humans and Neanderthals were acquired by the ancestral hominins. The overall rate of nonsynonymous evolution in Neanderthal mitochondrial protein-coding genes is not higher than in other lineages. The model incorporating the ancestral hominin mtDNA sequences estimates the average divergence age of the mtDNAs of Neanderthals and modern humans to be 450,000-485,000 years. The mtAncestor-1 and mtAncestor-2 sequences were incorporated into the nuclear genome approximately 620,000 years and 2,885,000 years ago, respectively.This study provides the first insight into the evolution of the mitochondrial DNA in hominins ancestral to Neanderthals and humans. We hypothesize that mtAncestor-1 and mtAncestor-2 are likely to be molecular fossils of the mtDNAs of Homo heidelbergensis and a stem Homo lineage. The d(N/d(S dynamics suggests that the effective population size of extinct hominins was low. However, the hominin lineage ancestral to humans, Neanderthals and H. heidelbergensis, had a larger effective population size and possessed genetic diversity

  13. Genome-wide identification of significant aberrations in cancer genome.

    Science.gov (United States)

    Yuan, Xiguo; Yu, Guoqiang; Hou, Xuchu; Shih, Ie-Ming; Clarke, Robert; Zhang, Junying; Hoffman, Eric P; Wang, Roger R; Zhang, Zhen; Wang, Yue

    2012-07-27

    Somatic Copy Number Alterations (CNAs) in human genomes are present in almost all human cancers. Systematic efforts to characterize such structural variants must effectively distinguish significant consensus events from random background aberrations. Here we introduce Significant Aberration in Cancer (SAIC), a new method for characterizing and assessing the statistical significance of recurrent CNA units. Three main features of SAIC include: (1) exploiting the intrinsic correlation among consecutive probes to assign a score to each CNA unit instead of single probes; (2) performing permutations on CNA units that preserve correlations inherent in the copy number data; and (3) iteratively detecting Significant Copy Number Aberrations (SCAs) and estimating an unbiased null distribution by applying an SCA-exclusive permutation scheme. We test and compare the performance of SAIC against four peer methods (GISTIC, STAC, KC-SMART, CMDS) on a large number of simulation datasets. Experimental results show that SAIC outperforms peer methods in terms of larger area under the Receiver Operating Characteristics curve and increased detection power. We then apply SAIC to analyze structural genomic aberrations acquired in four real cancer genome-wide copy number data sets (ovarian cancer, metastatic prostate cancer, lung adenocarcinoma, glioblastoma). When compared with previously reported results, SAIC successfully identifies most SCAs known to be of biological significance and associated with oncogenes (e.g., KRAS, CCNE1, and MYC) or tumor suppressor genes (e.g., CDKN2A/B). Furthermore, SAIC identifies a number of novel SCAs in these copy number data that encompass tumor related genes and may warrant further studies. Supported by a well-grounded theoretical framework, SAIC has been developed and used to identify SCAs in various cancer copy number data sets, providing useful information to study the landscape of cancer genomes. Open-source and platform-independent SAIC software is

  14. Genome-wide identification of significant aberrations in cancer genome

    Directory of Open Access Journals (Sweden)

    Yuan Xiguo

    2012-07-01

    Full Text Available Abstract Background Somatic Copy Number Alterations (CNAs in human genomes are present in almost all human cancers. Systematic efforts to characterize such structural variants must effectively distinguish significant consensus events from random background aberrations. Here we introduce Significant Aberration in Cancer (SAIC, a new method for characterizing and assessing the statistical significance of recurrent CNA units. Three main features of SAIC include: (1 exploiting the intrinsic correlation among consecutive probes to assign a score to each CNA unit instead of single probes; (2 performing permutations on CNA units that preserve correlations inherent in the copy number data; and (3 iteratively detecting Significant Copy Number Aberrations (SCAs and estimating an unbiased null distribution by applying an SCA-exclusive permutation scheme. Results We test and compare the performance of SAIC against four peer methods (GISTIC, STAC, KC-SMART, CMDS on a large number of simulation datasets. Experimental results show that SAIC outperforms peer methods in terms of larger area under the Receiver Operating Characteristics curve and increased detection power. We then apply SAIC to analyze structural genomic aberrations acquired in four real cancer genome-wide copy number data sets (ovarian cancer, metastatic prostate cancer, lung adenocarcinoma, glioblastoma. When compared with previously reported results, SAIC successfully identifies most SCAs known to be of biological significance and associated with oncogenes (e.g., KRAS, CCNE1, and MYC or tumor suppressor genes (e.g., CDKN2A/B. Furthermore, SAIC identifies a number of novel SCAs in these copy number data that encompass tumor related genes and may warrant further studies. Conclusions Supported by a well-grounded theoretical framework, SAIC has been developed and used to identify SCAs in various cancer copy number data sets, providing useful information to study the landscape of cancer genomes

  15. Molecular cytogenetic and genomic analyses reveal new insights into the origin of the wheat B genome.

    Science.gov (United States)

    Zhang, Wei; Zhang, Mingyi; Zhu, Xianwen; Cao, Yaping; Sun, Qing; Ma, Guojia; Chao, Shiaoman; Yan, Changhui; Xu, Steven S; Cai, Xiwen

    2018-02-01

    This work pinpointed the goatgrass chromosomal segment in the wheat B genome using modern cytogenetic and genomic technologies, and provided novel insights into the origin of the wheat B genome. Wheat is a typical allopolyploid with three homoeologous subgenomes (A, B, and D). The donors of the subgenomes A and D had been identified, but not for the subgenome B. The goatgrass Aegilops speltoides (genome SS) has been controversially considered a possible candidate for the donor of the wheat B genome. However, the relationship of the Ae. speltoides S genome with the wheat B genome remains largely obscure. The present study assessed the homology of the B and S genomes using an integrative cytogenetic and genomic approach, and revealed the contribution of Ae. speltoides to the origin of the wheat B genome. We discovered noticeable homology between wheat chromosome 1B and Ae. speltoides chromosome 1S, but not between other chromosomes in the B and S genomes. An Ae. speltoides-originated segment spanning a genomic region of approximately 10.46 Mb was detected on the long arm of wheat chromosome 1B (1BL). The Ae. speltoides-originated segment on 1BL was found to co-evolve with the rest of the B genome. Evidently, Ae. speltoides had been involved in the origin of the wheat B genome, but should not be considered an exclusive donor of this genome. The wheat B genome might have a polyphyletic origin with multiple ancestors involved, including Ae. speltoides. These novel findings will facilitate genome studies in wheat and other polyploids.

  16. RPAN: rice pan-genome browser for ∼3000 rice genomes.

    Science.gov (United States)

    Sun, Chen; Hu, Zhiqiang; Zheng, Tianqing; Lu, Kuangchen; Zhao, Yue; Wang, Wensheng; Shi, Jianxin; Wang, Chunchao; Lu, Jinyuan; Zhang, Dabing; Li, Zhikang; Wei, Chaochun

    2017-01-25

    A pan-genome is the union of the gene sets of all the individuals of a clade or a species and it provides a new dimension of genome complexity with the presence/absence variations (PAVs) of genes among these genomes. With the progress of sequencing technologies, pan-genome study is becoming affordable for eukaryotes with large-sized genomes. The Asian cultivated rice, Oryza sativa L., is one of the major food sources for the world and a model organism in plant biology. Recently, the 3000 Rice Genome Project (3K RGP) sequenced more than 3000 rice genomes with a mean sequencing depth of 14.3×, which provided a tremendous resource for rice research. In this paper, we present a genome browser, Rice Pan-genome Browser (RPAN), as a tool to search and visualize the rice pan-genome derived from 3K RGP. RPAN contains a database of the basic information of 3010 rice accessions, including genomic sequences, gene annotations, PAV information and gene expression data of the rice pan-genome. At least 12 000 novel genes absent in the reference genome were included. RPAN also provides multiple search and visualization functions. RPAN can be a rich resource for rice biology and rice breeding. It is available at http://cgm.sjtu.edu.cn/3kricedb/ or http://www.rmbreeding.cn/pan3k. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  17. GenColors-based comparative genome databases for small eukaryotic genomes.

    Science.gov (United States)

    Felder, Marius; Romualdi, Alessandro; Petzold, Andreas; Platzer, Matthias; Sühnel, Jürgen; Glöckner, Gernot

    2013-01-01

    Many sequence data repositories can give a quick and easily accessible overview on genomes and their annotations. Less widespread is the possibility to compare related genomes with each other in a common database environment. We have previously described the GenColors database system (http://gencolors.fli-leibniz.de) and its applications to a number of bacterial genomes such as Borrelia, Legionella, Leptospira and Treponema. This system has an emphasis on genome comparison. It combines data from related genomes and provides the user with an extensive set of visualization and analysis tools. Eukaryote genomes are normally larger than prokaryote genomes and thus pose additional challenges for such a system. We have, therefore, adapted GenColors to also handle larger datasets of small eukaryotic genomes and to display eukaryotic gene structures. Further recent developments include whole genome views, genome list options and, for bacterial genome browsers, the display of horizontal gene transfer predictions. Two new GenColors-based databases for two fungal species (http://fgb.fli-leibniz.de) and for four social amoebas (http://sacgb.fli-leibniz.de) were set up. Both new resources open up a single entry point for related genomes for the amoebozoa and fungal research communities and other interested users. Comparative genomics approaches are greatly facilitated by these resources.

  18. PGSB/MIPS Plant Genome Information Resources and Concepts for the Analysis of Complex Grass Genomes.

    Science.gov (United States)

    Spannagl, Manuel; Bader, Kai; Pfeifer, Matthias; Nussbaumer, Thomas; Mayer, Klaus F X

    2016-01-01

    PGSB (Plant Genome and Systems Biology; formerly MIPS-Munich Institute for Protein Sequences) has been involved in developing, implementing and maintaining plant genome databases for more than a decade. Genome databases and analysis resources have focused on individual genomes and aim to provide flexible and maintainable datasets for model plant genomes as a backbone against which experimental data, e.g., from high-throughput functional genomics, can be organized and analyzed. In addition, genomes from both model and crop plants form a scaffold for comparative genomics, assisted by specialized tools such as the CrowsNest viewer to explore conserved gene order (synteny) between related species on macro- and micro-levels.The genomes of many economically important Triticeae plants such as wheat, barley, and rye present a great challenge for sequence assembly and bioinformatic analysis due to their enormous complexity and large genome size. Novel concepts and strategies have been developed to deal with these difficulties and have been applied to the genomes of wheat, barley, rye, and other cereals. This includes the GenomeZipper concept, reference-guided exome assembly, and "chromosome genomics" based on flow cytometry sorted chromosomes.

  19. Big Data Analytics for Genomic Medicine.

    Science.gov (United States)

    He, Karen Y; Ge, Dongliang; He, Max M

    2017-02-15

    Genomic medicine attempts to build individualized strategies for diagnostic or therapeutic decision-making by utilizing patients' genomic information. Big Data analytics uncovers hidden patterns, unknown correlations, and other insights through examining large-scale various data sets. While integration and manipulation of diverse genomic data and comprehensive electronic health records (EHRs) on a Big Data infrastructure exhibit challenges, they also provide a feasible opportunity to develop an efficient and effective approach to identify clinically actionable genetic variants for individualized diagnosis and therapy. In this paper, we review the challenges of manipulating large-scale next-generation sequencing (NGS) data and diverse clinical data derived from the EHRs for genomic medicine. We introduce possible solutions for different challenges in manipulating, managing, and analyzing genomic and clinical data to implement genomic medicine. Additionally, we also present a practical Big Data toolset for identifying clinically actionable genetic variants using high-throughput NGS data and EHRs.

  20. Body maps on the human genome.

    Science.gov (United States)

    Cherniak, Christopher; Rodriguez-Esteban, Raul

    2013-12-20

    Chromosomes have territories, or preferred locales, in the cell nucleus. When these sites are taken into account, some large-scale structure of the human genome emerges. The synoptic picture is that genes highly expressed in particular topologically compact tissues are not randomly distributed on the genome. Rather, such tissue-specific genes tend to map somatotopically onto the complete chromosome set. They seem to form a "genome homunculus": a multi-dimensional, genome-wide body representation extending across chromosome territories of the entire spermcell nucleus. The antero-posterior axis of the body significantly corresponds to the head-tail axis of the nucleus, and the dorso-ventral body axis to the central-peripheral nucleus axis. This large-scale genomic structure includes thousands of genes. One rationale for a homuncular genome structure would be to minimize connection costs in genetic networks. Somatotopic maps in cerebral cortex have been reported for over a century.

  1. Genome aliquoting with double cut and join

    Directory of Open Access Journals (Sweden)

    Sankoff David

    2008-01-01

    Full Text Available Abstract Background The genome aliquoting probem is, given an observed genome A with n copies of each gene, presumed to descend from an n-way polyploidization event from an ordinary diploid genome B, followed by a history of chromosomal rearrangements, to reconstruct the identity of the original genome B'. The idea is to construct B', containing exactly one copy of each gene, so as to minimize the number of rearrangements d(A, B' ⊕ B' ⊕ ... ⊕ B' necessary to convert the observed genome B' ⊕ B' ⊕ ... ⊕ B' into A. Results In this paper we make the first attempt to define and solve the genome aliquoting problem. We present a heuristic algorithm for the problem as well the data from our experiments demonstrating its validity. Conclusion The heuristic performs well, consistently giving a non-trivial result. The question as to the existence or non-existence of an exact solution to this problem remains open.

  2. The Saccharomyces Genome Database Variant Viewer.

    Science.gov (United States)

    Sheppard, Travis K; Hitz, Benjamin C; Engel, Stacia R; Song, Giltae; Balakrishnan, Rama; Binkley, Gail; Costanzo, Maria C; Dalusag, Kyla S; Demeter, Janos; Hellerstedt, Sage T; Karra, Kalpana; Nash, Robert S; Paskov, Kelley M; Skrzypek, Marek S; Weng, Shuai; Wong, Edith D; Cherry, J Michael

    2016-01-04

    The Saccharomyces Genome Database (SGD; http://www.yeastgenome.org) is the authoritative community resource for the Saccharomyces cerevisiae reference genome sequence and its annotation. In recent years, we have moved toward increased representation of sequence variation and allelic differences within S. cerevisiae. The publication of numerous additional genomes has motivated the creation of new tools for their annotation and analysis. Here we present the Variant Viewer: a dynamic open-source web application for the visualization of genomic and proteomic differences. Multiple sequence alignments have been constructed across high quality genome sequences from 11 different S. cerevisiae strains and stored in the SGD. The alignments and summaries are encoded in JSON and used to create a two-tiered dynamic view of the budding yeast pan-genome, available at http://www.yeastgenome.org/variant-viewer. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  3. REGEN: Ancestral Genome Reconstruction for Bacteria

    Directory of Open Access Journals (Sweden)

    João C. Setubal

    2012-07-01

    Full Text Available Ancestral genome reconstruction can be understood as a phylogenetic study with more details than a traditional phylogenetic tree reconstruction. We present a new computational system called REGEN for ancestral bacterial genome reconstruction at both the gene and replicon levels. REGEN reconstructs gene content, contiguous gene runs, and replicon structure for each ancestral genome. Along each branch of the phylogenetic tree, REGEN infers evolutionary events, including gene creation and deletion and replicon fission and fusion. The reconstruction can be performed by either a maximum parsimony or a maximum likelihood method. Gene content reconstruction is based on the concept of neighboring gene pairs. REGEN was designed to be used with any set of genomes that are sufficiently related, which will usually be the case for bacteria within the same taxonomic order. We evaluated REGEN using simulated genomes and genomes in the Rhizobiales order.

  4. Big Data Analytics for Genomic Medicine

    Science.gov (United States)

    He, Karen Y.; Ge, Dongliang; He, Max M.

    2017-01-01

    Genomic medicine attempts to build individualized strategies for diagnostic or therapeutic decision-making by utilizing patients’ genomic information. Big Data analytics uncovers hidden patterns, unknown correlations, and other insights through examining large-scale various data sets. While integration and manipulation of diverse genomic data and comprehensive electronic health records (EHRs) on a Big Data infrastructure exhibit challenges, they also provide a feasible opportunity to develop an efficient and effective approach to identify clinically actionable genetic variants for individualized diagnosis and therapy. In this paper, we review the challenges of manipulating large-scale next-generation sequencing (NGS) data and diverse clinical data derived from the EHRs for genomic medicine. We introduce possible solutions for different challenges in manipulating, managing, and analyzing genomic and clinical data to implement genomic medicine. Additionally, we also present a practical Big Data toolset for identifying clinically actionable genetic variants using high-throughput NGS data and EHRs. PMID:28212287

  5. [Ethical considerations in genomic cohort study].

    Science.gov (United States)

    Choi, Eun Kyung; Kim, Ock-Joo

    2007-03-01

    During the last decade, genomic cohort study has been developed in many countries by linking health data and genetic data in stored samples. Genomic cohort study is expected to find key genetic components that contribute to common diseases, thereby promising great advance in genome medicine. While many countries endeavor to build biobank systems, biobank-based genome research has raised important ethical concerns including genetic privacy, confidentiality, discrimination, and informed consent. Informed consent for biobank poses an important question: whether true informed consent is possible in population-based genomic cohort research where the nature of future studies is unforeseeable when consent is obtained. Due to the sensitive character of genetic information, protecting privacy and keeping confidentiality become important topics. To minimize ethical problems and achieve scientific goals to its maximum degree, each country strives to build population-based genomic cohort research project, by organizing public consultation, trying public and expert consensus in research, and providing safeguards to protect privacy and confidentiality.

  6. Genome Evolution of Plant-Parasitic Nematodes.

    Science.gov (United States)

    Kikuchi, Taisei; Eves-van den Akker, Sebastian; Jones, John T

    2017-08-04

    Plant parasitism has evolved independently on at least four separate occasions in the phylum Nematoda. The application of next-generation sequencing (NGS) to plant-parasitic nematodes has allowed a wide range of genome- or transcriptome-level comparisons, and these have identified genome adaptations that enable parasitism of plants. Current genome data suggest that horizontal gene transfer, gene family expansions, evolution of new genes that mediate interactions with the host, and parasitism-specific gene regulation are important adaptations that allow nematodes to parasitize plants. Sequencing of a larger number of nematode genomes, including plant parasites that show different modes of parasitism or that have evolved in currently unsampled clades, and using free-living taxa as comparators would allow more detailed analysis and a better understanding of the organization of key genes within the genomes. This would facilitate a more complete understanding of the way in which parasitism has shaped the genomes of plant-parasitic nematodes.

  7. The characterization of twenty sequenced human genomes.

    Directory of Open Access Journals (Sweden)

    Kimberly Pelak

    2010-09-01

    Full Text Available We present the analysis of twenty human genomes to evaluate the prospects for identifying rare functional variants that contribute to a phenotype of interest. We sequenced at high coverage ten "case" genomes from individuals with severe hemophilia A and ten "control" genomes. We summarize the number of genetic variants emerging from a study of this magnitude, and provide a proof of concept for the identification of rare and highly-penetrant functional variants by confirming that the cause of hemophilia A is easily recognizable in this data set. We also show that the number of novel single nucleotide variants (SNVs discovered per genome seems to stabilize at about 144,000 new variants per genome, after the first 15 individuals have been sequenced. Finally, we find that, on average, each genome carries 165 homozygous protein-truncating or stop loss variants in genes representing a diverse set of pathways.

  8. Extreme-Scale De Novo Genome Assembly

    Energy Technology Data Exchange (ETDEWEB)

    Georganas, Evangelos [Intel Corporation, Santa Clara, CA (United States); Hofmeyr, Steven [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Joint Genome Inst.; Egan, Rob [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Computational Research Division; Buluc, Aydin [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Joint Genome Inst.; Oliker, Leonid [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Joint Genome Inst.; Rokhsar, Daniel [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Computational Research Division; Yelick, Katherine [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Joint Genome Inst.

    2017-09-26

    De novo whole genome assembly reconstructs genomic sequence from short, overlapping, and potentially erroneous DNA segments and is one of the most important computations in modern genomics. This work presents HipMER, a high-quality end-to-end de novo assembler designed for extreme scale analysis, via efficient parallelization of the Meraculous code. Genome assembly software has many components, each of which stresses different components of a computer system. This chapter explains the computational challenges involved in each step of the HipMer pipeline, the key distributed data structures, and communication costs in detail. We present performance results of assembling the human genome and the large hexaploid wheat genome on large supercomputers up to tens of thousands of cores.

  9. Multiscale modeling of three-dimensional genome

    Science.gov (United States)

    Zhang, Bin; Wolynes, Peter

    The genome, the blueprint of life, contains nearly all the information needed to build and maintain an entire organism. A comprehensive understanding of the genome is of paramount interest to human health and will advance progress in many areas, including life sciences, medicine, and biotechnology. The overarching goal of my research is to understand the structure-dynamics-function relationships of the human genome. In this talk, I will be presenting our efforts in moving towards that goal, with a particular emphasis on studying the three-dimensional organization, the structure of the genome with multi-scale approaches. Specifically, I will discuss the reconstruction of genome structures at both interphase and metaphase by making use of data from chromosome conformation capture experiments. Computationally modeling of chromatin fiber at atomistic level from first principles will also be presented as our effort for studying the genome structure from bottom up.

  10. Open Access Data Sharing in Genomic Research

    Directory of Open Access Journals (Sweden)

    Stacey Pereira

    2014-08-01

    Full Text Available The current emphasis on broad sharing of human genomic data generated in research in order to maximize utility and public benefit is a significant legacy of the Human Genome Project. Concerns about privacy and discrimination have led to policy responses that restrict access to genomic data as the means for protecting research participants. Our research and experience show, however, that a considerable number of research participants agree to open access sharing of their genomic data when given the choice. General policies that limit access to all genomic data fail to respect the autonomy of these participants and, at the same time, unnecessarily limit the utility of the data. We advocate instead a more balanced approach that allows for individual choice and encourages informed decision making, while protecting against the misuse of genomic data through enhanced legislation.

  11. Human Contamination in Public Genome Assemblies.

    Science.gov (United States)

    Kryukov, Kirill; Imanishi, Tadashi

    2016-01-01

    Contamination in genome assembly can lead to wrong or confusing results when using such genome as reference in sequence comparison. Although bacterial contamination is well known, the problem of human-originated contamination received little attention. In this study we surveyed 45,735 available genome assemblies for evidence of human contamination. We used lineage specificity to distinguish between contamination and conservation. We found that 154 genome assemblies contain fragments that with high confidence originate as contamination from human DNA. Majority of contaminating human sequences were present in the reference human genome assembly for over a decade. We recommend that existing contaminated genomes should be revised to remove contaminated sequence, and that new assemblies should be thoroughly checked for presence of human DNA before submitting them to public databases.

  12. Data Mining Supercomputing with SAS JMP® Genomics

    Directory of Open Access Journals (Sweden)

    Richard S. Segall

    2011-02-01

    Full Text Available JMP® Genomics is statistical discovery software that can uncover meaningful patterns in high-throughput genomics and proteomics data. JMP® Genomics is designed for biologists, biostatisticians, statistical geneticists, and those engaged in analyzing the vast stores of data that are common in genomic research (SAS, 2009. Data mining was performed using JMP® Genomics on the two collections of microarray databases available from National Center for Biotechnology Information (NCBI for lung cancer and breast cancer. The Gene Expression Omnibus (GEO of NCBI serves as a public repository for a wide range of highthroughput experimental data, including the two collections of lung cancer and breast cancer that were used for this research. The results for applying data mining using software JMP® Genomics are shown in this paper with numerous screen shots.

  13. REGEN: Ancestral Genome Reconstruction for Bacteria.

    Science.gov (United States)

    Yang, Kuan; Heath, Lenwood S; Setubal, João C

    2012-07-18

    Ancestral genome reconstruction can be understood as a phylogenetic study with more details than a traditional phylogenetic tree reconstruction. We present a new computational system called REGEN for ancestral bacterial genome reconstruction at both the gene and replicon levels. REGEN reconstructs gene content, contiguous gene runs, and replicon structure for each ancestral genome. Along each branch of the phylogenetic tree, REGEN infers evolutionary events, including gene creation and deletion and replicon fission and fusion. The reconstruction can be performed by either a maximum parsimony or a maximum likelihood method. Gene content reconstruction is based on the concept of neighboring gene pairs. REGEN was designed to be used with any set of genomes that are sufficiently related, which will usually be the case for bacteria within the same taxonomic order. We evaluated REGEN using simulated genomes and genomes in the Rhizobiales order.

  14. On the Epistemological Crisis in Genomics

    Science.gov (United States)

    Dougherty, Edward R

    2008-01-01

    There is an epistemological crisis in genomics. At issue is what constitutes scientific knowledge in genomic science, or systems biology in general. Does this crisis require a new perspective on knowledge heretofore absent from science or is it merely a matter of interpreting new scientific developments in an existing epistemological framework? This paper discusses the manner in which the experimental method, as developed and understood over recent centuries, leads naturally to a scientific epistemology grounded in an experimental-mathematical duality. It places genomics into this epistemological framework and examines the current situation in genomics. Meaning and the constitution of scientific knowledge are key concerns for genomics, and the nature of the epistemological crisis in genomics depends on how these are understood. PMID:19440447

  15. Application of Genomic Tools in Plant Breeding

    OpenAIRE

    Pérez-de-Castro, A.M.; Vilanova, S.; Cañizares, J.; Pascual, L.; Blanca, J.M.; Díez, M.J.; Prohens, J.; Picó, B.

    2012-01-01

    Plant breeding has been very successful in developing improved varieties using conventional tools and methodologies. Nowadays, the availability of genomic tools and resources is leading to a new revolution of plant breeding, as they facilitate the study of the genotype and its relationship with the phenotype, in particular for complex traits. Next Generation Sequencing (NGS) technologies are allowing the mass sequencing of genomes and transcriptomes, which is producing a vast array of genomic...

  16. The Genomic Evolution of Prostate Cancer

    Science.gov (United States)

    2017-06-01

    the proposed project : 1. To continue to acquire a comprehensive understanding of prostate cancer genomics . 2. To develop an understanding of... Genetics I • ECEV 35901 Evolutionary Genomics • Fundamentals of Clinical Research • HGEN 47400 Introduction to Probability and Statistics for Geneticists...Marc Gillard,2 David M. Hatcher,5 Westin R. Tom,5 Walter M. Stadler2 and Kevin P. White1,2,3 1Institute for Genomics and Systems Biology , Departments of

  17. Genome-scale neurogenetics: methodology and meaning.

    Science.gov (United States)

    McCarroll, Steven A; Feng, Guoping; Hyman, Steven E

    2014-06-01

    Genetic analysis is currently offering glimpses into molecular mechanisms underlying such neuropsychiatric disorders as schizophrenia, bipolar disorder and autism. After years of frustration, success in identifying disease-associated DNA sequence variation has followed from new genomic technologies, new genome data resources, and global collaborations that could achieve the scale necessary to find the genes underlying highly polygenic disorders. Here we describe early results from genome-scale studies of large numbers of subjects and the emerging significance of these results for neurobiology.

  18. Personalized medicine: new genomics, old lessons

    OpenAIRE

    Offit, Kenneth

    2011-01-01

    Personalized medicine uses traditional, as well as emerging concepts of the genetic and environmental basis of disease to individualize prevention, diagnosis and treatment. Personalized genomics plays a vital, but not exclusive role in this evolving model of personalized medicine. The distinctions between genetic and genomic medicine are more quantitative than qualitative. Personalized genomics builds on principles established by the integration of genetics into medical practice. Principles s...

  19. Discovering Complete Quasispecies In Bacterial Genomes

    OpenAIRE

    Bertels, Frederic; Gokhale, Chaitanya; Traulsen, Arne

    2017-01-01

    Mobile genetic elements can be found in almost all genomes. Possibly the most common nonautonomous mobile genetic elements in bacteria are repetitive extragenic palindromic doublets forming hairpins (REPINs) that can occur hundreds of times within a genome. The sum of all REPINs in a genome can be viewed as an evolving population because REPINs replicate and mutate. In contrast to most other biological populations, we know the exact composition of the REPIN population and the sequence of each...

  20. REGEN: Ancestral Genome Reconstruction for Bacteria

    OpenAIRE

    Yang, Kuan; Heath, Lenwood S.; Setubal, João C.

    2012-01-01

    Ancestral genome reconstruction can be understood as a phylogenetic study with more details than a traditional phylogenetic tree reconstruction. We present a new computational system called REGEN for ancestral bacterial genome reconstruction at both the gene and replicon levels. REGEN reconstructs gene content, contiguous gene runs, and replicon structure for each ancestral genome. Along each branch of the phylogenetic tree, REGEN infers evolutionary events, including gene creation and deleti...

  1. Genomic V exons from whole genome shotgun data in reptiles.

    Science.gov (United States)

    Olivieri, D N; von Haeften, B; Sánchez-Espinel, C; Faro, J; Gambón-Deza, F

    2014-08-01

    Reptiles and mammals diverged over 300 million years ago, creating two parallel evolutionary lineages amongst terrestrial vertebrates. In reptiles, two main evolutionary lines emerged: one gave rise to Squamata, while the other gave rise to Testudines, Crocodylia, and Aves. In this study, we determined the genomic variable (V) exons from whole genome shotgun sequencing (WGS) data in reptiles corresponding to the three main immunoglobulin (IG) loci and the four main T cell receptor (TR) loci. We show that Squamata lack the TRG and TRD genes, and snakes lack the IGKV genes. In representative species of Testudines and Crocodylia, the seven major IG and TR loci are maintained. As in mammals, genes of the IG loci can be grouped into well-defined IMGT clans through a multi-species phylogenetic analysis. We show that the reptilian IGHV and IGLV genes are distributed amongst the established mammalian clans, while their IGKV genes are found within a single clan, nearly exclusive from the mammalian sequences. The reptilian and mammalian TRAV genes cluster into six common evolutionary clades (since IMGT clans have not been defined for TR). In contrast, the reptilian TRBV genes cluster into three clades, which have few mammalian members. In this locus, the V exon sequences from mammals appear to have undergone different evolutionary diversification processes that occurred outside these shared reptilian clans. These sequences can be obtained in a freely available public repository (http://vgenerepertoire.org).

  2. What does it mean to be genomically literate?: National Human Genome Research Institute Meeting Report.

    Science.gov (United States)

    Hurle, Belen; Citrin, Toby; Jenkins, Jean F; Kaphingst, Kimberly A; Lamb, Neil; Roseman, Jo Ellen; Bonham, Vence L

    2013-08-01

    Genomic discoveries will increasingly advance the science of medicine. Limited genomic literacy may adversely impact the public's understanding and use of the power of genetics and genomics in health care and public health. In November 2011, a meeting was held by the National Human Genome Research Institute to examine the challenge of achieving genomic literacy for the general public, from kindergarten to grade 12 to adult education. The role of the media in disseminating scientific messages and in perpetuating or reducing misconceptions was also discussed. Workshop participants agreed that genomic literacy will be achieved only through active engagement between genomics experts and the varied constituencies that comprise the public. This report summarizes the background, content, and outcomes from this meeting, including recommendations for a research agenda to inform decisions about how to advance genomic literacy in our society.

  3. The Global Invertebrate Genomics Alliance (GIGA): Developing Community Resources to Study Diverse Invertebrate Genomes

    KAUST Repository

    Bracken-Grissom, Heather; Collins, Allen G.; Collins, Timothy; Crandall, Keith; Distel, Daniel; Dunn, Casey; Giribet, Gonzalo; Haddock, Steven; Knowlton, Nancy; Martindale, Mark; Medina, Monica; Messing, Charles; O'Brien, Stephen J.; Paulay, Gustav; Putnam, Nicolas; Ravasi, Timothy; Rouse, Greg W.; Ryan, Joseph F.; Schulze, Anja; Worheide, Gert; Adamska, Maja; Bailly, Xavier; Breinholt, Jesse; Browne, William E.; Diaz, M. Christina; Evans, Nathaniel; Flot, Jean-Francois; Fogarty, Nicole; Johnston, Matthew; Kamel, Bishoy; Kawahara, Akito Y.; Laberge, Tammy; Lavrov, Dennis; Michonneau, Francois; Moroz, Leonid L.; Oakley, Todd; Osborne, Karen; Pomponi, Shirley A.; Rhodes, Adelaide; Rodriguez-Lanetty, Mauricio; Santos, Scott R.; Satoh, Nori; Thacker, Robert W.; Van de Peer, Yves; Voolstra, Christian R.; Welch, David Mark; Winston, Judith; Zhou, Xin

    2013-01-01

    Over 95% of all metazoan (animal) species comprise the invertebrates, but very few genomes from these organisms have been sequenced. We have, therefore, formed a Global Invertebrate Genomics Alliance (GIGA). Our intent is to build a collaborative

  4. The life cycle of a genome project: perspectives and guidelines inspired by insect genome projects.

    Science.gov (United States)

    Papanicolaou, Alexie

    2016-01-01

    Many research programs on non-model species biology have been empowered by genomics. In turn, genomics is underpinned by a reference sequence and ancillary information created by so-called "genome projects". The most reliable genome projects are the ones created as part of an active research program and designed to address specific questions but their life extends past publication. In this opinion paper I outline four key insights that have facilitated maintaining genomic communities: the key role of computational capability, the iterative process of building genomic resources, the value of community participation and the importance of manual curation. Taken together, these ideas can and do ensure the longevity of genome projects and the growing non-model species community can use them to focus a discussion with regards to its future genomic infrastructure.

  5. The genome portal of the Department of Energy Joint Genome Institute: 2014 updates

    Energy Technology Data Exchange (ETDEWEB)

    Nordberg, Henrik [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Cantor, Michael [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Dusheyko, Serge [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Hua, Susan [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Poliakov, Alexander [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Shabalov, Igor [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Smirnova, Tatyana [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Grigoriev, Igor V. [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Dubchak, Inna [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States)

    2013-11-12

    The U.S. Department of Energy (DOE) Joint Genome Institute (JGI), a national user facility, serves the diverse scientific community by providing integrated high-throughput sequencing and computational analysis to enable system-based scientific approaches in support of DOE missions related to clean energy generation and environmental characterization. The JGI Genome Portal (http://genome.jgi.doe.gov) provides unified access to all JGI genomic databases and analytical tools. The JGI maintains extensive data management systems and specialized analytical capabilities to manage and interpret complex genomic data. A user can search, download and explore multiple data sets available for all DOE JGI sequencing projects including their status, assemblies and annotations of sequenced genomes. In this paper, we describe major updates of the Genome Portal in the past 2 years with a specific emphasis on efficient handling of the rapidly growing amount of diverse genomic data accumulated in JGI.

  6. [Genomic selection and its application].

    Science.gov (United States)

    Li, Heng-De; Bao, Zhen-Min; Sun, Xiao-Wen

    2011-12-01

    Selective breeding is very important in agricultural production and breeding value estimation is the core of selective breeding. With the development of genetic markers, especially high throughput genotyping technology, it becomes available to estimate breeding value at genome level, i.e. genomic selection (GS). In this review, the methods of GS was categorized into two groups: one is to predict genomic estimated breeding value (GEBV) based on the allele effect, such as least squares, random regression - best linear unbiased prediction (RR-BLUP), Bayes and principle component analysis, etc; the other is to predict GEBV with genetic relationship matrix, which constructs genetic relationship matrix via high throughput genetic markers and then predicts GEBV through linear mixed model, i.e. GBLUP. The basic principles of these methods were also introduced according to the above two classifications. Factors affecting GS accuracy include markers of type and density, length of haplotype, the size of reference population, the extent between marker-QTL and so on. Among the methods of GS, Bayes and GBLUP are usually more accurate than the others and least squares is the worst. GBLUP is time-efficient and can combine pedigree with genotypic information, hence it is superior to other methods. Although progress was made in GS, there are still some challenges, for examples, united breeding, long-term genetic gain with GS, and disentangling markers with and without contribution to the traits. GS has been applied in animal and plant breeding practice and also has the potential to predict genetic predisposition in humans and study evolutionary dynamics. GS, which is more precise than the traditional method, is a breakthrough at measuring genetic relationship. Therefore, GS will be a revolutionary event in the history of animal and plant breeding.

  7. Genomic dysregulation in gastric tumors.

    Science.gov (United States)

    Janjigian, Yelena Y; Kelsen, David P

    2013-03-01

    Gastric cancer is among the most common human malignancies and the second leading cause of cancer-related death. The different epidemiologic and histopathology of subtypes of gastric cancer are associated with different genomic patterns. Data suggests that gene expression patterns of proximal, distal gastric cancers-intestinal type, and diffuse/signet cell are well separated. This review summarizes the genetic and epigenetic changes thought to drive gastric cancer and the emerging paradigm of gastric cancer as three unique disease subtypes. Copyright © 2012 Wiley Periodicals, Inc.

  8. Quality Assessment of Domesticated Animal Genome Assemblies

    DEFF Research Database (Denmark)

    Seemann, Stefan E; Anthon, Christian; Palasca, Oana

    2015-01-01

    affected by the lack of genomic sequence. Herein, we quantify the quality of the genome assemblies of 20 domesticated animals and related species by assessing a range of measurable parameters, and we show that there is a positive correlation between the fraction of mappable reads from RNAseq data...... domesticated animal genomes still need to be sequenced deeper in order to produce high-quality assemblies. In the meanwhile, ironically, the extent to which RNAseq and other next-generation data is produced frequently far exceeds that of the genomic sequence. Furthermore, basic comparative analysis is often...

  9. Privacy Challenges of Genomic Big Data.

    Science.gov (United States)

    Shen, Hong; Ma, Jian

    2017-01-01

    With the rapid advancement of high-throughput DNA sequencing technologies, genomics has become a big data discipline where large-scale genetic information of human individuals can be obtained efficiently with low cost. However, such massive amount of personal genomic data creates tremendous challenge for privacy, especially given the emergence of direct-to-consumer (DTC) industry that provides genetic testing services. Here we review the recent development in genomic big data and its implications on privacy. We also discuss the current dilemmas and future challenges of genomic privacy.

  10. Genome technologies and personalized dental medicine.

    Science.gov (United States)

    Eng, G; Chen, A; Vess, T; Ginsburg, G S

    2012-04-01

    The addition of genomic information to our understanding of oral disease is driving important changes in oral health care. It is anticipated that genome-derived information will promote a deeper understanding of disease etiology and permit earlier diagnosis, allowing for preventative measures prior to disease onset rather than treatment that attempts to repair the diseased state. Advances in genome technologies have fueled expectations for this proactive healthcare approach. Application of genomic testing is expanding and has already begun to find its way into the practice of clinical dentistry. To take full advantage of the information and technologies currently available, it is vital that dental care providers, consumers, and policymakers be aware of genomic approaches to understanding of oral diseases and the application of genomic testing to disease diagnosis and treatment. Ethical, legal, clinical, and educational initiatives are also required to responsibly incorporate genomic information into the practice of dentistry. This article provides an overview of the application of genomic technologies to oral health care and introduces issues that require consideration if we are to realize the full potential of genomics to enable the practice of personalized dental medicine. © 2011 John Wiley & Sons A/S.

  11. Microbial species delineation using whole genome sequences.

    Science.gov (United States)

    Varghese, Neha J; Mukherjee, Supratim; Ivanova, Natalia; Konstantinidis, Konstantinos T; Mavrommatis, Kostas; Kyrpides, Nikos C; Pati, Amrita

    2015-08-18

    Increased sequencing of microbial genomes has revealed that prevailing prokaryotic species assignments can be inconsistent with whole genome information for a significant number of species. The long-standing need for a systematic and scalable species assignment technique can be met by the genome-wide Average Nucleotide Identity (gANI) metric, which is widely acknowledged as a robust measure of genomic relatedness. In this work, we demonstrate that the combination of gANI and the alignment fraction (AF) between two genomes accurately reflects their genomic relatedness. We introduce an efficient implementation of AF,gANI and discuss its successful application to 86.5M genome pairs between 13,151 prokaryotic genomes assigned to 3032 species. Subsequently, by comparing the genome clusters obtained from complete linkage clustering of these pairs to existing taxonomy, we observed that nearly 18% of all prokaryotic species suffer from anomalies in species definition. Our results can be used to explore central questions such as whether microorganisms form a continuum of genetic diversity or distinct species represented by distinct genetic signatures. We propose that this precise and objective AF,gANI-based species definition: the MiSI (Microbial Species Identifier) method, be used to address previous inconsistencies in species classification and as the primary guide for new taxonomic species assignment, supplemented by the traditional polyphasic approach, as required. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  12. Ecology and genomics of Bacillus subtilis.

    Science.gov (United States)

    Earl, Ashlee M; Losick, Richard; Kolter, Roberto

    2008-06-01

    Bacillus subtilis is a remarkably diverse bacterial species that is capable of growth within many environments. Recent microarray-based comparative genomic analyses have revealed that members of this species also exhibit considerable genomic diversity. The identification of strain-specific genes might explain how B. subtilis has become so broadly adapted. The goal of identifying ecologically adaptive genes could soon be realized with the imminent release of several new B. subtilis genome sequences. As we embark upon this exciting new era of B. subtilis comparative genomics we review what is currently known about the ecology and evolution of this species.

  13. Environmental Medicine Genome Bank (EMGB): Current Composition

    National Research Council Canada - National Science Library

    Sonna, Larry

    2000-01-01

    The USARIEM Environmental Medicine Genome Bank (EMGB) project is an ongoing effort to identify and characterize genes relevant to environmental injuries and illnesses and to human physical performance...

  14. DEFINING THE CHEMICAL SPACE OF PUBLIC GENOMIC ...

    Science.gov (United States)

    The current project aims to chemically index the genomics content of public genomic databases to make these data accessible in relation to other publicly available, chemically-indexed toxicological information. By defining the chemical space of public genomic data, it is possible to identify classes of chemicals on which to develop methodologies for the integration of chemogenomic data into predictive toxicology. The chemical space of public genomic data will be presented as well as the methodologies and tools developed to identify this chemical space.

  15. Genomics for paediatricians: promises and pitfalls.

    Science.gov (United States)

    Hammond, Carrie Louise; Willoughby, Josh Matthew; Parker, Michael James

    2018-03-24

    In recent years, there have been significant advances in genetic technologies, evolving the field of genomics from genetics. This has huge diagnostic potential, as genomic testing increasingly becomes part of mainstream medicine. However, there are numerous potential pitfalls in the interpretation of genomic data. It is therefore essential that we educate clinicians more widely about the appropriate interpretation and utilisation of genomic testing. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.

  16. A Genomics Approach to Tumor Gemome Analysis

    National Research Council Canada - National Science Library

    Collins, Colin

    2002-01-01

    Genomes of solid tumors are often highly rearranged and these rearrangements promote cancer progression through disruption of genes mediating immortality, survival, metastasis, and resistance to therapy...

  17. Initiating genomic selection in tetraploid potato

    DEFF Research Database (Denmark)

    Sverrisdóttir, Elsa; Janss, Luc; Byrne, Stephen

    Breeding for more space and resource efficient crops is important to feed the world’s increasing population. Potatoes produce approximately twice the amount of calories per hectare compared to cereals. The traditional “mate and phenotype” breeding approach is costly and time-consuming; however......, the completion of the genome sequence of potato has enabled the application of genomics-assisted breeding technologies. Genomic selection using genome-wide molecular markers is becoming increasingly applicable to crops as the genotyping costs continue to reduce and it is thus an attractive breeding alternative...... selection, can be obtained with good prediction accuracies in tetraploid potato....

  18. Radiation-induced instability of human genome

    International Nuclear Information System (INIS)

    Ryabchenko, N.N.; Demina, Eh.A.

    2014-01-01

    A brief review is dedicated to the phenomenon of radiation-induced genomic instability where the increased level of genomic changes in the offspring of irradiated cells is characteristic. Particular attention is paid to the problems of genomic instability induced by the low-dose radiation, role of the bystander effect in formation of radiation-induced instability, and its relationship with individual radiosensitivity. We believe that in accordance with the paradigm of modern radiobiology the increased human individual radiosensitivity can be formed due to the genome instability onset and is a significant risk factor for radiation-induced cancer

  19. The infinite sites model of genome evolution.

    Science.gov (United States)

    Ma, Jian; Ratan, Aakrosh; Raney, Brian J; Suh, Bernard B; Miller, Webb; Haussler, David

    2008-09-23

    We formalize the problem of recovering the evolutionary history of a set of genomes that are related to an unseen common ancestor genome by operations of speciation, deletion, insertion, duplication, and rearrangement of segments of bases. The problem is examined in the limit as the number of bases in each genome goes to infinity. In this limit, the chromosomes are represented by continuous circles or line segments. For such an infinite-sites model, we present a polynomial-time algorithm to find the most parsimonious evolutionary history of any set of related present-day genomes.

  20. Evolution of the Largest Mammalian Genome.

    Science.gov (United States)

    Evans, Ben J; Upham, Nathan S; Golding, Goeffrey B; Ojeda, Ricardo A; Ojeda, Agustina A

    2017-06-01

    The genome of the red vizcacha rat (Rodentia, Octodontidae, Tympanoctomys barrerae) is the largest of all mammals, and about double the size of their close relative, the mountain vizcacha rat Octomys mimax, even though the lineages that gave rise to these species diverged from each other only about 5 Ma. The mechanism for this rapid genome expansion is controversial, and hypothesized to be a consequence of whole genome duplication or accumulation of repetitive elements. To test these alternative but nonexclusive hypotheses, we gathered and evaluated evidence from whole transcriptome and whole genome sequences of T. barrerae and O. mimax. We recovered support for genome expansion due to accumulation of a diverse assemblage of repetitive elements, which represent about one half and one fifth of the genomes of T. barrerae and O. mimax, respectively, but we found no strong signal of whole genome duplication. In both species, repetitive sequences were rare in transcribed regions as compared with the rest of the genome, and mostly had no close match to annotated repetitive sequences from other rodents. These findings raise new questions about the genomic dynamics of these repetitive elements, their connection to widespread chromosomal fissions that occurred in the T. barrerae ancestor, and their fitness effects-including during the evolution of hypersaline dietary tolerance in T. barrerae. ©The Author(s) 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  1. Deleterious mutation accumulation in organelle genomes.

    Science.gov (United States)

    Lynch, M; Blanchard, J L

    1998-01-01

    It is well established on theoretical grounds that the accumulation of mildly deleterious mutations in nonrecombining genomes is a major extinction risk in obligately asexual populations. Sexual populations can also incur mutational deterioration in genomic regions that experience little or no recombination, i.e., autosomal regions near centromeres, Y chromosomes, and organelle genomes. Our results suggest, for a wide array of genes (transfer RNAs, ribosomal RNAs, and proteins) in a diverse collection of species (animals, plants, and fungi), an almost universal increase in the fixation probabilities of mildly deleterious mutations arising in mitochondrial and chloroplast genomes relative to those arising in the recombining nuclear genome. This enhanced width of the selective sieve in organelle genomes does not appear to be a consequence of relaxed selection, but can be explained by the decline in the efficiency of selection that results from the reduction of effective population size induced by uniparental inheritance. Because of the very low mutation rates of organelle genomes (on the order of 10(-4) per genome per year), the reduction in fitness resulting from mutation accumulation in such genomes is a very long-term process, not likely to imperil many species on time scales of less than a million years, but perhaps playing some role in phylogenetic lineage sorting on time scales of 10 to 100 million years.

  2. Goodbye genome paper, hello genome report: the increasing popularity of 'genome announcements' and their impact on science.

    Science.gov (United States)

    Smith, David Roy

    2017-05-01

    Next-generation sequencing technologies have revolutionized genomics and altered the scientific publication landscape. Life-science journals abound with genome papers-peer-reviewed descriptions of newly sequenced chromosomes. Although they once filled the pages of Nature and Science, genome papers are now mostly relegated to journals with low-impact factors. Some have forecast the death of the genome paper and argued that they are using up valuable resources and not advancing science. However, the publication rate of genome papers is on the rise. This increase is largely because some journals have created a new category of manuscript called genome reports, which are short, fast-tracked papers describing a chromosome sequence(s), its GenBank accession number and little else. In 2015, for example, more than 2000 genome reports were published, and 2016 is poised to bring even more. Here, I highlight the growing popularity of genome reports and discuss their merits, drawbacks and impact on science and the academic publication infrastructure. Genome reports can be excellent assets for the research community, but they are also being used as quick and easy routes to a publication, and in some instances they are not peer reviewed. One of the best arguments for genome reports is that they are a citable, user-generated genomic resource providing essential methodological and biological information, which may not be present in the sequence database. But they are expensive and time-consuming avenues for achieving such a goal. © The Author 2016. Published by Oxford University Press.

  3. Population Genomics of Infectious and Integrated Wolbachia pipientis Genomes in Drosophila ananassae

    Science.gov (United States)

    Choi, Jae Young; Bubnell, Jaclyn E.; Aquadro, Charles F.

    2015-01-01

    Coevolution between Drosophila and its endosymbiont Wolbachia pipientis has many intriguing aspects. For example, Drosophila ananassae hosts two forms of W. pipientis genomes: One being the infectious bacterial genome and the other integrated into the host nuclear genome. Here, we characterize the infectious and integrated genomes of W. pipientis infecting D. ananassae (wAna), by genome sequencing 15 strains of D. ananassae that have either the infectious or integrated wAna genomes. Results indicate evolutionarily stable maternal transmission for the infectious wAna genome suggesting a relatively long-term coevolution with its host. In contrast, the integrated wAna genome showed pseudogene-like characteristics accumulating many variants that are predicted to have deleterious effects if present in an infectious bacterial genome. Phylogenomic analysis of sequence variation together with genotyping by polymerase chain reaction of large structural variations indicated several wAna variants among the eight infectious wAna genomes. In contrast, only a single wAna variant was found among the seven integrated wAna genomes examined in lines from Africa, south Asia, and south Pacific islands suggesting that the integration occurred once from a single infectious wAna genome and then spread geographically. Further analysis revealed that for all D. ananassae we examined with the integrated wAna genomes, the majority of the integrated wAna genomic regions is represented in at least two copies suggesting a double integration or single integration followed by an integrated genome duplication. The possible evolutionary mechanism underlying the widespread geographical presence of the duplicate integration of the wAna genome is an intriguing question remaining to be answered. PMID:26254486

  4. Genomic landscapes of Chinese hamster ovary cell lines as revealed by the Cricetulus griseus draft genome

    DEFF Research Database (Denmark)

    Lewis, Nathan E; Liu, Xin; Li, Yuxiang

    2013-01-01

    stymied by the lack of a unifying genomic resource for CHO cells. Here we report a 2.4-Gb draft genome sequence of a female Chinese hamster, Cricetulus griseus, harboring 24,044 genes. We also resequenced and analyzed the genomes of six CHO cell lines from the CHO-K1, DG44 and CHO-S lineages...

  5. GI-POP: a combinational annotation and genomic island prediction pipeline for ongoing microbial genome projects.

    Science.gov (United States)

    Lee, Chi-Ching; Chen, Yi-Ping Phoebe; Yao, Tzu-Jung; Ma, Cheng-Yu; Lo, Wei-Cheng; Lyu, Ping-Chiang; Tang, Chuan Yi

    2013-04-10

    Sequencing of microbial genomes is important because of microbial-carrying antibiotic and pathogenetic activities. However, even with the help of new assembling software, finishing a whole genome is a time-consuming task. In most bacteria, pathogenetic or antibiotic genes are carried in genomic islands. Therefore, a quick genomic island (GI) prediction method is useful for ongoing sequencing genomes. In this work, we built a Web server called GI-POP (http://gipop.life.nthu.edu.tw) which integrates a sequence assembling tool, a functional annotation pipeline, and a high-performance GI predicting module, in a support vector machine (SVM)-based method called genomic island genomic profile scanning (GI-GPS). The draft genomes of the ongoing genome projects in contigs or scaffolds can be submitted to our Web server, and it provides the functional annotation and highly probable GI-predicting results. GI-POP is a comprehensive annotation Web server designed for ongoing genome project analysis. Researchers can perform annotation and obtain pre-analytic information include possible GIs, coding/non-coding sequences and functional analysis from their draft genomes. This pre-analytic system can provide useful information for finishing a genome sequencing project. Copyright © 2012 Elsevier B.V. All rights reserved.

  6. Spaces of genomics : exploring the innovation journey of genomics in research on common disease

    NARCIS (Netherlands)

    Bitsch, L.

    2013-01-01

    Genomics was introduced with big promises and expectations of its future contribution to our society. Medical genomics was introduced as that which would lay the foundation for a revolution in our management of common diseases. Genomics would lead the way towards a future of personalised medicine.

  7. Tolerance of Whole-Genome Doubling Propagates Chromosomal Instability and Accelerates Cancer Genome Evolution

    DEFF Research Database (Denmark)

    Dewhurst, Sally M.; McGranahan, Nicholas; Burrell, Rebecca A.

    2014-01-01

    The contribution of whole-genome doubling to chromosomal instability (CIN) and tumor evolution is unclear. We use long-term culture of isogenic tetraploid cells from a stable diploid colon cancer progenitor to investigate how a genome-doubling event affects genome stability over time. Rare cells...

  8. A Trichosporonales genome tree based on 27 haploid and three evolutionarily conserved 'natural' hybrid genomes.

    Science.gov (United States)

    Takashima, Masako; Sriswasdi, Sira; Manabe, Ri-Ichiroh; Ohkuma, Moriya; Sugita, Takashi; Iwasaki, Wataru

    2018-01-01

    To construct a backbone tree consisting of basidiomycetous yeasts, draft genome sequences from 25 species of Trichosporonales (Tremellomycetes, Basidiomycota) were generated. In addition to the hybrid genomes of Trichosporon coremiiforme and Trichosporon ovoides that we described previously, we identified an interspecies hybrid genome in Cutaneotrichosporon mucoides (formerly Trichosporon mucoides). This hybrid genome had a gene retention rate of ~55%, and its closest haploid relative was Cutaneotrichosporon dermatis. After constructing the C. mucoides subgenomes, we generated a phylogenetic tree using genome data from the 27 haploid species and the subgenome data from the three hybrid genome species. It was a high-quality tree with 100% bootstrap support for all of the branches. The genome-based tree provided superior resolution compared with previous multi-gene analyses. Although our backbone tree does not include all Trichosporonales genera (e.g. Cryptotrichosporon), it will be valuable for future analyses of genome data. Interest in interspecies hybrid fungal genomes has recently increased because they may provide a basis for new technologies. The three Trichosporonales hybrid genomes described in this study are different from well-characterized hybrid genomes (e.g. those of Saccharomyces pastorianus and Saccharomyces bayanus) because these hybridization events probably occurred in the distant evolutionary past. Hence, they will be useful for studying genome stability following hybridization and speciation events. Copyright © 2017 John Wiley & Sons, Ltd. Copyright © 2017 John Wiley & Sons, Ltd.

  9. Rhipicephalus (Boophilus) microplus strain Deutsch, whole genome shotgun sequencing project first submission of genome sequence

    Science.gov (United States)

    The size and repetitive nature of the Rhipicephalus microplus genome makes obtaining a full genome sequence difficult. Cot filtration/selection techniques were used to reduce the repetitive fraction of the tick genome and enrich for the fraction of DNA with gene-containing regions. The Cot-selected ...

  10. Genomic analyses of the Chlamydia trachomatis core genome show an association between chromosomal genome, plasmid type and disease

    NARCIS (Netherlands)

    Versteeg, Bart; Bruisten, Sylvia M.; Pannekoek, Yvonne; Jolley, Keith A.; Maiden, Martin C. J.; van der Ende, Arie; Harrison, Odile B.

    2018-01-01

    Background: Chlamydia trachomatis (Ct) plasmid has been shown to encode genes essential for infection. We evaluated the population structure of Ct using whole-genome sequence data (WGS). In particular, the relationship between the Ct genome, plasmid and disease was investigated. Results: WGS data

  11. Genomic analysis of Fusarium verticillioides.

    Science.gov (United States)

    Brown, D W; Butchko, R A E; Proctor, R H

    2008-09-01

    Fusarium verticillioides (teleomorph Gibberella moniliformis) can be either an endophyte of maize, causing no visible disease, or a pathogen-causing disease of ears, stalks, roots and seedlings. At any stage, this fungus can synthesize fumonisins, a family of mycotoxins structurally similar to the sphingolipid sphinganine. Ingestion of fumonisin-contaminated maize has been associated with a number of animal diseases, including cancer in rodents, and exposure has been correlated with human oesophageal cancer in some regions of the world, and some evidence suggests that fumonisins are a risk factor for neural tube defects. A primary goal of the authors' laboratory is to eliminate fumonisin contamination of maize and maize products. Understanding how and why these toxins are made and the F. verticillioides-maize disease process will allow one to develop novel strategies to limit tissue destruction (rot) and fumonisin production. To meet this goal, genomic sequence data, expressed sequence tags (ESTs) and microarrays are being used to identify F. verticillioides genes involved in the biosynthesis of toxins and plant pathogenesis. This paper describes the current status of F. verticillioides genomic resources and three approaches being used to mine microarray data from a wild-type strain cultured in liquid fumonisin production medium for 12, 24, 48, 72, 96 and 120h. Taken together, these approaches demonstrate the power of microarray technology to provide information on different biological processes.

  12. DNABIT Compress - Genome compression algorithm.

    Science.gov (United States)

    Rajarajeswari, Pothuraju; Apparao, Allam

    2011-01-22

    Data compression is concerned with how information is organized in data. Efficient storage means removal of redundancy from the data being stored in the DNA molecule. Data compression algorithms remove redundancy and are used to understand biologically important molecules. We present a compression algorithm, "DNABIT Compress" for DNA sequences based on a novel algorithm of assigning binary bits for smaller segments of DNA bases to compress both repetitive and non repetitive DNA sequence. Our proposed algorithm achieves the best compression ratio for DNA sequences for larger genome. Significantly better compression results show that "DNABIT Compress" algorithm is the best among the remaining compression algorithms. While achieving the best compression ratios for DNA sequences (Genomes),our new DNABIT Compress algorithm significantly improves the running time of all previous DNA compression programs. Assigning binary bits (Unique BIT CODE) for (Exact Repeats, Reverse Repeats) fragments of DNA sequence is also a unique concept introduced in this algorithm for the first time in DNA compression. This proposed new algorithm could achieve the best compression ratio as much as 1.58 bits/bases where the existing best methods could not achieve a ratio less than 1.72 bits/bases.

  13. Functional Insights from Structural Genomics

    Energy Technology Data Exchange (ETDEWEB)

    Forouhar,F.; Kuzin, A.; Seetharaman, J.; Lee, I.; Zhou, W.; Abashidze, M.; Chen, Y.; Montelione, G.; Tong, L.; et al

    2007-01-01

    Structural genomics efforts have produced structural information, either directly or by modeling, for thousands of proteins over the past few years. While many of these proteins have known functions, a large percentage of them have not been characterized at the functional level. The structural information has provided valuable functional insights on some of these proteins, through careful structural analyses, serendipity, and structure-guided functional screening. Some of the success stories based on structures solved at the Northeast Structural Genomics Consortium (NESG) are reported here. These include a novel methyl salicylate esterase with important role in plant innate immunity, a novel RNA methyltransferase (H. influenzae yggJ (HI0303)), a novel spermidine/spermine N-acetyltransferase (B. subtilis PaiA), a novel methyltransferase or AdoMet binding protein (A. fulgidus AF{_}0241), an ATP:cob(I)alamin adenosyltransferase (B. subtilis YvqK), a novel carboxysome pore (E. coli EutN), a proline racemase homolog with a disrupted active site (B. melitensis BME11586), an FMN-dependent enzyme (S. pneumoniae SP{_}1951), and a 12-stranded {beta}-barrel with a novel fold (V. parahaemolyticus VPA1032).

  14. Observing copepods through a genomic lens

    Science.gov (United States)

    2011-01-01

    Background Copepods outnumber every other multicellular animal group. They are critical components of the world's freshwater and marine ecosystems, sensitive indicators of local and global climate change, key ecosystem service providers, parasites and predators of economically important aquatic animals and potential vectors of waterborne disease. Copepods sustain the world fisheries that nourish and support human populations. Although genomic tools have transformed many areas of biological and biomedical research, their power to elucidate aspects of the biology, behavior and ecology of copepods has only recently begun to be exploited. Discussion The extraordinary biological and ecological diversity of the subclass Copepoda provides both unique advantages for addressing key problems in aquatic systems and formidable challenges for developing a focused genomics strategy. This article provides an overview of genomic studies of copepods and discusses strategies for using genomics tools to address key questions at levels extending from individuals to ecosystems. Genomics can, for instance, help to decipher patterns of genome evolution such as those that occur during transitions from free living to symbiotic and parasitic lifestyles and can assist in the identification of genetic mechanisms and accompanying physiological changes associated with adaptation to new or physiologically challenging environments. The adaptive significance of the diversity in genome size and unique mechanisms of genome reorganization during development could similarly be explored. Genome-wide and EST studies of parasitic copepods of salmon and large EST studies of selected free-living copepods have demonstrated the potential utility of modern genomics approaches for the study of copepods and have generated resources such as EST libraries, shotgun genome sequences, BAC libraries, genome maps and inbred lines that will be invaluable in assisting further efforts to provide genomics tools for

  15. Observing copepods through a genomic lens

    Directory of Open Access Journals (Sweden)

    Johnson Stewart C

    2011-09-01

    Full Text Available Abstract Background Copepods outnumber every other multicellular animal group. They are critical components of the world's freshwater and marine ecosystems, sensitive indicators of local and global climate change, key ecosystem service providers, parasites and predators of economically important aquatic animals and potential vectors of waterborne disease. Copepods sustain the world fisheries that nourish and support human populations. Although genomic tools have transformed many areas of biological and biomedical research, their power to elucidate aspects of the biology, behavior and ecology of copepods has only recently begun to be exploited. Discussion The extraordinary biological and ecological diversity of the subclass Copepoda provides both unique advantages for addressing key problems in aquatic systems and formidable challenges for developing a focused genomics strategy. This article provides an overview of genomic studies of copepods and discusses strategies for using genomics tools to address key questions at levels extending from individuals to ecosystems. Genomics can, for instance, help to decipher patterns of genome evolution such as those that occur during transitions from free living to symbiotic and parasitic lifestyles and can assist in the identification of genetic mechanisms and accompanying physiological changes associated with adaptation to new or physiologically challenging environments. The adaptive significance of the diversity in genome size and unique mechanisms of genome reorganization during development could similarly be explored. Genome-wide and EST studies of parasitic copepods of salmon and large EST studies of selected free-living copepods have demonstrated the potential utility of modern genomics approaches for the study of copepods and have generated resources such as EST libraries, shotgun genome sequences, BAC libraries, genome maps and inbred lines that will be invaluable in assisting further efforts to

  16. Aligning the unalignable: bacteriophage whole genome alignments.

    Science.gov (United States)

    Bérard, Sèverine; Chateau, Annie; Pompidor, Nicolas; Guertin, Paul; Bergeron, Anne; Swenson, Krister M

    2016-01-13

    In recent years, many studies focused on the description and comparison of large sets of related bacteriophage genomes. Due to the peculiar mosaic structure of these genomes, few informative approaches for comparing whole genomes exist: dot plots diagrams give a mostly qualitative assessment of the similarity/dissimilarity between two or more genomes, and clustering techniques are used to classify genomes. Multiple alignments are conspicuously absent from this scene. Indeed, whole genome aligners interpret lack of similarity between sequences as an indication of rearrangements, insertions, or losses. This behavior makes them ill-prepared to align bacteriophage genomes, where even closely related strains can accomplish the same biological function with highly dissimilar sequences. In this paper, we propose a multiple alignment strategy that exploits functional collinearity shared by related strains of bacteriophages, and uses partial orders to capture mosaicism of sets of genomes. As classical alignments do, the computed alignments can be used to predict that genes have the same biological function, even in the absence of detectable similarity. The Alpha aligner implements these ideas in visual interactive displays, and is used to compute several examples of alignments of Staphylococcus aureus and Mycobacterium bacteriophages, involving up to 29 genomes. Using these datasets, we prove that Alpha alignments are at least as good as those computed by standard aligners. Comparison with the progressive Mauve aligner - which implements a partial order strategy, but whose alignments are linearized - shows a greatly improved interactive graphic display, while avoiding misalignments. Multiple alignments of whole bacteriophage genomes work, and will become an important conceptual and visual tool in comparative genomics of sets of related strains. A python implementation of Alpha, along with installation instructions for Ubuntu and OSX, is available on bitbucket (https://bitbucket.org/thekswenson/alpha).

  17. Dynamics of genome rearrangement in bacterial populations.

    Directory of Open Access Journals (Sweden)

    Aaron E Darling

    2008-07-01

    Full Text Available Genome structure variation has profound impacts on phenotype in organisms ranging from microbes to humans, yet little is known about how natural selection acts on genome arrangement. Pathogenic bacteria such as Yersinia pestis, which causes bubonic and pneumonic plague, often exhibit a high degree of genomic rearrangement. The recent availability of several Yersinia genomes offers an unprecedented opportunity to study the evolution of genome structure and arrangement. We introduce a set of statistical methods to study patterns of rearrangement in circular chromosomes and apply them to the Yersinia. We constructed a multiple alignment of eight Yersinia genomes using Mauve software to identify 78 conserved segments that are internally free from genome rearrangement. Based on the alignment, we applied Bayesian statistical methods to infer the phylogenetic inversion history of Yersinia. The sampling of genome arrangement reconstructions contains seven parsimonious tree topologies, each having different histories of 79 inversions. Topologies with a greater number of inversions also exist, but were sampled less frequently. The inversion phylogenies agree with results suggested by SNP patterns. We then analyzed reconstructed inversion histories to identify patterns of rearrangement. We confirm an over-representation of "symmetric inversions"-inversions with endpoints that are equally distant from the origin of chromosomal replication. Ancestral genome arrangements demonstrate moderate preference for replichore balance in Yersinia. We found that all inversions are shorter than expected under a neutral model, whereas inversions acting within a single replichore are much shorter than expected. We also found evidence for a canonical configuration of the origin and terminus of replication. Finally, breakpoint reuse analysis reveals that inversions with endpoints proximal to the origin of DNA replication are nearly three times more frequent. Our findings

  18. Simultaneous gene finding in multiple genomes.

    Science.gov (United States)

    König, Stefanie; Romoth, Lars W; Gerischer, Lizzy; Stanke, Mario

    2016-11-15

    As the tree of life is populated with sequenced genomes ever more densely, the new challenge is the accurate and consistent annotation of entire clades of genomes. We address this problem with a new approach to comparative gene finding that takes a multiple genome alignment of closely related species and simultaneously predicts the location and structure of protein-coding genes in all input genomes, thereby exploiting negative selection and sequence conservation. The model prefers potential gene structures in the different genomes that are in agreement with each other, or-if not-where the exon gains and losses are plausible given the species tree. We formulate the multi-species gene finding problem as a binary labeling problem on a graph. The resulting optimization problem is NP hard, but can be efficiently approximated using a subgradient-based dual decomposition approach. The proposed method was tested on whole-genome alignments of 12 vertebrate and 12 Drosophila species. The accuracy was evaluated for human, mouse and Drosophila melanogaster and compared to competing methods. Results suggest that our method is well-suited for annotation of (a large number of) genomes of closely related species within a clade, in particular, when RNA-Seq data are available for many of the genomes. The transfer of existing annotations from one genome to another via the genome alignment is more accurate than previous approaches that are based on protein-spliced alignments, when the genomes are at close to medium distances. The method is implemented in C ++ as part of Augustus and available open source at http://bioinf.uni-greifswald.de/augustus/ CONTACT: stefaniekoenig@ymail.com or mario.stanke@uni-greifswald.deSupplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  19. The whole genome sequences and experimentally phased haplotypes of over 100 personal genomes.

    Science.gov (United States)

    Mao, Qing; Ciotlos, Serban; Zhang, Rebecca Yu; Ball, Madeleine P; Chin, Robert; Carnevali, Paolo; Barua, Nina; Nguyen, Staci; Agarwal, Misha R; Clegg, Tom; Connelly, Abram; Vandewege, Ward; Zaranek, Alexander Wait; Estep, Preston W; Church, George M; Drmanac, Radoje; Peters, Brock A

    2016-10-11

    Since the completion of the Human Genome Project in 2003, it is estimated that more than 200,000 individual whole human genomes have been sequenced. A stunning accomplishment in such a short period of time. However, most of these were sequenced without experimental haplotype data and are therefore missing an important aspect of genome biology. In addition, much of the genomic data is not available to the public and lacks phenotypic information. As part of the Personal Genome Project, blood samples from 184 participants were collected and processed using Complete Genomics' Long Fragment Read technology. Here, we present the experimental whole genome haplotyping and sequencing of these samples to an average read coverage depth of 100X. This is approximately three-fold higher than the read coverage applied to most whole human genome assemblies and ensures the highest quality results. Currently, 114 genomes from this dataset are freely available in the GigaDB repository and are associated with rich phenotypic data; the remaining 70 should be added in the near future as they are approved through the PGP data release process. For reproducibility analyses, 20 genomes were sequenced at least twice using independent LFR barcoded libraries. Seven genomes were also sequenced using Complete Genomics' standard non-barcoded library process. In addition, we report 2.6 million high-quality, rare variants not previously identified in the Single Nucleotide Polymorphisms database or the 1000 Genomes Project Phase 3 data. These genomes represent a unique source of haplotype and phenotype data for the scientific community and should help to expand our understanding of human genome evolution and function.

  20. Genome-wide identification of direct HBx genomic targets

    KAUST Repository

    Guerrieri, Francesca

    2017-02-17

    Background The Hepatitis B Virus (HBV) HBx regulatory protein is required for HBV replication and involved in HBV-related carcinogenesis. HBx interacts with chromatin modifying enzymes and transcription factors to modulate histone post-translational modifications and to regulate viral cccDNA transcription and cellular gene expression. Aiming to identify genes and non-coding RNAs (ncRNAs) directly targeted by HBx, we performed a chromatin immunoprecipitation sequencing (ChIP-Seq) to analyse HBV recruitment on host cell chromatin in cells replicating HBV. Results ChIP-Seq high throughput sequencing of HBx-bound fragments was used to obtain a high-resolution, unbiased, mapping of HBx binding sites across the genome in HBV replicating cells. Protein-coding genes and ncRNAs involved in cell metabolism, chromatin dynamics and cancer were enriched among HBx targets together with genes/ncRNAs known to modulate HBV replication. The direct transcriptional activation of genes/miRNAs that potentiate endocytosis (Ras-related in brain (RAB) GTPase family) and autophagy (autophagy related (ATG) genes, beclin-1, miR-33a) and the transcriptional repression of microRNAs (miR-138, miR-224, miR-576, miR-596) that directly target the HBV pgRNA and would inhibit HBV replication, contribute to HBx-mediated increase of HBV replication. Conclusions Our ChIP-Seq analysis of HBx genome wide chromatin recruitment defined the repertoire of genes and ncRNAs directly targeted by HBx and led to the identification of new mechanisms by which HBx positively regulates cccDNA transcription and HBV replication.

  1. Dramatic improvement in genome assembly achieved using doubled-haploid genomes.

    Science.gov (United States)

    Zhang, Hong; Tan, Engkong; Suzuki, Yutaka; Hirose, Yusuke; Kinoshita, Shigeharu; Okano, Hideyuki; Kudoh, Jun; Shimizu, Atsushi; Saito, Kazuyoshi; Watabe, Shugo; Asakawa, Shuichi

    2014-10-27

    Improvement in de novo assembly of large genomes is still to be desired. Here, we improved draft genome sequence quality by employing doubled-haploid individuals. We sequenced wildtype and doubled-haploid Takifugu rubripes genomes, under the same conditions, using the Illumina platform and assembled contigs with SOAPdenovo2. We observed 5.4-fold and 2.6-fold improvement in the sizes of the N50 contig and scaffold of doubled-haploid individuals, respectively, compared to the wildtype, indicating that the use of a doubled-haploid genome aids in accurate genome analysis.

  2. Accounting for discovery bias in genomic prediction

    Science.gov (United States)

    Our objective was to evaluate an approach to mitigating discovery bias in genomic prediction. Accuracy may be improved by placing greater emphasis on regions of the genome expected to be more influential on a trait. Methods emphasizing regions result in a phenomenon known as “discovery bias” if info...

  3. Integrating genomics into undergraduate nursing education.

    Science.gov (United States)

    Daack-Hirsch, Sandra; Dieter, Carla; Quinn Griffin, Mary T

    2011-09-01

    To prepare the next generation of nurses, faculty are now faced with the challenge of incorporating genomics into curricula. Here we discuss how to meet this challenge. Steps to initiate curricular changes to include genomics are presented along with a discussion on creating a genomic curriculum thread versus a standalone course. Ideas for use of print material and technology on genomic topics are also presented. Information is based on review of the literature and curriculum change efforts by the authors. In recognition of advances in genomics, the nursing profession is increasing an emphasis on the integration of genomics into professional practice and educational standards. Incorporating genomics into nurses' practices begins with changes in our undergraduate curricula. Information given in didactic courses should be reinforced in clinical practica, and Internet-based tools such as WebQuest, Second Life, and wikis offer attractive, up-to-date platforms to deliver this now crucial content. To provide information that may assist faculty to prepare the next generation of nurses to practice using genomics. © 2011 Sigma Theta Tau International.

  4. The Arab genome: Health and wealth.

    Science.gov (United States)

    Zayed, Hatem

    2016-11-05

    The 22 Arab nations have a unique genetic structure, which reflects both conserved and diverse gene pools due to the prevalent endogamous and consanguineous marriage culture and the long history of admixture among different ethnic subcultures descended from the Asian, European, and African continents. Human genome sequencing has enabled large-scale genomic studies of different populations and has become a powerful tool for studying disease predictions and diagnosis. Despite the importance of the Arab genome for better understanding the dynamics of the human genome, discovering rare genetic variations, and studying early human migration out of Africa, it is poorly represented in human genome databases, such as HapMap and the 1000 Genomes Project. In this review, I demonstrate the significance of sequencing the Arab genome and setting an Arab genome reference(s) for better understanding the molecular pathogenesis of genetic diseases, discovering novel/rare variants, and identifying a meaningful genotype-phenotype correlation for complex diseases. Copyright © 2016. Published by Elsevier B.V.

  5. The UCSC Genome Browser Database: 2008 update

    DEFF Research Database (Denmark)

    Karolchik, D; Kuhn, R M; Baertsch, R

    2007-01-01

    The University of California, Santa Cruz, Genome Browser Database (GBD) provides integrated sequence and annotation data for a large collection of vertebrate and model organism genomes. Seventeen new assemblies have been added to the database in the past year, for a total coverage of 19 vertebrat...

  6. Population genomics of fungal and oomycete pathogens

    Science.gov (United States)

    We are entering a new era in plant pathology where whole-genome sequences of many individuals of a pathogen species are becoming readily available. This era of pathogen population genomics will provide new opportunities and challenges, requiring new computational and analytical tools. Population gen...

  7. Genome sequence of Lactobacillus rhamnosus ATCC 8530.

    Science.gov (United States)

    Pittet, Vanessa; Ewen, Emily; Bushell, Barry R; Ziola, Barry

    2012-02-01

    Lactobacillus rhamnosus is found in the human gastrointestinal tract and is important for probiotics. We became interested in L. rhamnosus isolate ATCC 8530 in relation to beer spoilage and hops resistance. We report here the genome sequence of this isolate, along with a brief comparison to other available L. rhamnosus genome sequences.

  8. Genome Sequence of Lactobacillus rhamnosus ATCC 8530

    OpenAIRE

    Pittet, Vanessa; Ewen, Emily; Bushell, Barry R.; Ziola, Barry

    2012-01-01

    Lactobacillus rhamnosus is found in the human gastrointestinal tract and is important for probiotics. We became interested in L. rhamnosus isolate ATCC 8530 in relation to beer spoilage and hops resistance. We report here the genome sequence of this isolate, along with a brief comparison to other available L. rhamnosus genome sequences.

  9. Assembly of viral genomes from metagenomes

    Directory of Open Access Journals (Sweden)

    Saskia L Smits

    2014-12-01

    Full Text Available Viral infections remain a serious global health issue. Metagenomic approaches are increasingly used in the detection of novel viral pathogens but also to generate complete genomes of uncultivated viruses. In silico identification of complete viral genomes from sequence data would allow rapid phylogenetic characterization of these new viruses. Often, however, complete viral genomes are not recovered, but rather several distinct contigs derived from a single entity, some of which have no sequence homology to any known proteins. De novo assembly of single viruses from a metagenome is challenging, not only because of the lack of a reference genome, but also because of intrapopulation variation and uneven or insufficient coverage. Here we explored different assembly algorithms, remote homology searches, genome-specific sequence motifs, k-mer frequency ranking, and coverage profile binning to detect and obtain viral target genomes from metagenomes. All methods were tested on 454-generated sequencing datasets containing three recently described RNA viruses with a relatively large genome which were divergent to previously known viruses from the viral families Rhabdoviridae and Coronaviridae. Depending on specific characteristics of the target virus and the metagenomic community, different assembly and in silico gap closure strategies were successful in obtaining near complete viral genomes.

  10. Assembly of viral genomes from metagenomes

    NARCIS (Netherlands)

    S.L. Smits (Saskia); R. Bodewes (Rogier); A. Ruiz-Gonzalez (Aritz); V. Baumgärtner (Volkmar); M.P.G. Koopmans D.V.M. (Marion); A.D.M.E. Osterhaus (Albert); A. Schürch (Anita)

    2014-01-01

    textabstractViral infections remain a serious global health issue. Metagenomic approaches are increasingly used in the detection of novel viral pathogens but also to generate complete genomes of uncultivated viruses. In silico identification of complete viral genomes from sequence data would allow

  11. Genomics-assisted breeding in fruit trees.

    Science.gov (United States)

    Iwata, Hiroyoshi; Minamikawa, Mai F; Kajiya-Kanegae, Hiromi; Ishimori, Motoyuki; Hayashi, Takeshi

    2016-01-01

    Recent advancements in genomic analysis technologies have opened up new avenues to promote the efficiency of plant breeding. Novel genomics-based approaches for plant breeding and genetics research, such as genome-wide association studies (GWAS) and genomic selection (GS), are useful, especially in fruit tree breeding. The breeding of fruit trees is hindered by their long generation time, large plant size, long juvenile phase, and the necessity to wait for the physiological maturity of the plant to assess the marketable product (fruit). In this article, we describe the potential of genomics-assisted breeding, which uses these novel genomics-based approaches, to break through these barriers in conventional fruit tree breeding. We first introduce the molecular marker systems and whole-genome sequence data that are available for fruit tree breeding. Next we introduce the statistical methods for biparental linkage and quantitative trait locus (QTL) mapping as well as GWAS and GS. We then review QTL mapping, GWAS, and GS studies conducted on fruit trees. We also review novel technologies for rapid generation advancement. Finally, we note the future prospects of genomics-assisted fruit tree breeding and problems that need to be overcome in the breeding.

  12. Comparative Genomics of Green Sulfur Bacteria

    DEFF Research Database (Denmark)

    Ussery, David; Davenport, C; Tümmler, B

    2010-01-01

    Eleven completely sequenced Chlorobi genomes were compared in oligonucleotide usage, gene contents, and synteny. The green sulfur bacteria (GSB) are equipped with a core genome that sustains their anoxygenic phototrophic lifestyle by photosynthesis, sulfur oxidation, and CO(2) fixation. Whole...... weight of 10(6), and are probably instrumental for the bacteria to generate their own intimate (micro)environment....

  13. Pathway and network analysis of cancer genomes

    DEFF Research Database (Denmark)

    Creixell, Pau; Reimand, Jueri; Haider, Syed

    2015-01-01

    Genomic information on tumors from 50 cancer types cataloged by the International Cancer Genome Consortium (ICGC) shows that only a few well-studied driver genes are frequently mutated, in contrast to many infrequently mutated genes that may also contribute to tumor biology. Hence there has been...

  14. Harvesting Legume Genomes: Plant Genetic Resources

    Science.gov (United States)

    Genomics and high through-put phenotyping are ushering in a new era of accessing genetic diversity held in plant genetic resources, the cornerstone of both traditional and genomics-assisted breeding efforts of food legume crops. Acknowledged or not, yield plateaus must be broken given the daunting ...

  15. Benchmarking of methods for genomic taxonomy

    DEFF Research Database (Denmark)

    Larsen, Mette Voldby; Cosentino, Salvatore; Lukjancenko, Oksana

    2014-01-01

    . Nevertheless, the method has been found to have a number of shortcomings. In the current study, we trained and benchmarked five methods for whole-genome sequence-based prokaryotic species identification on a common data set of complete genomes: (i) SpeciesFinder, which is based on the complete 16S rRNA gene...

  16. Uses of antimicrobial genes from microbial genome

    Science.gov (United States)

    Sorek, Rotem; Rubin, Edward M.

    2013-08-20

    We describe a method for mining microbial genomes to discover antimicrobial genes and proteins having broad spectrum of activity. Also described are antimicrobial genes and their expression products from various microbial genomes that were found using this method. The products of such genes can be used as antimicrobial agents or as tools for molecular biology.

  17. Multiple Genome Sequences of Lactobacillus plantarum Strains

    OpenAIRE

    Kafka, Thomas A.; Geissler, Andreas J.; Vogel, Rudi F.

    2017-01-01

    ABSTRACT We report here the genome sequences of four Lactobacillus plantarum strains which vary in surface hydrophobicity. Bioinformatic analysis, using additional genomes of Lactobacillus plantarum strains, revealed a possible correlation between the cell wall teichoic acid-type and cell surface hydrophobicity and provide the basis for consecutive analyses.

  18. Analysis of Genome-Scale Data

    NARCIS (Netherlands)

    Kemmeren, P.P.C.W.

    2005-01-01

    The genetic material of every cell in an organism is stored inside DNA in the form of genes, which together form the genome. The information stored in the DNA is translated to RNA and subsequently to proteins, which form complex biological systems. The availability of whole genome sequences has

  19. Microbial minimalism: genome reduction in bacterial pathogens.

    Science.gov (United States)

    Moran, Nancy A

    2002-03-08

    When bacterial lineages make the transition from free-living or facultatively parasitic life cycles to permanent associations with hosts, they undergo a major loss of genes and DNA. Complete genome sequences are providing an understanding of how extreme genome reduction affects evolutionary directions and metabolic capabilities of obligate pathogens and symbionts.

  20. Genome instability in Lactobacillus rhamnosus GG

    NARCIS (Netherlands)

    Sybesma, W.; Molenaar, D.; IJcken, W. van; Venema, K.; Korta, R.

    2013-01-01

    We describe here a comparative genome analysis of three dairy product isolates of Lactobacillus rhamnosus GG (LGG) and the ATCC 53103 reference strain to the published genome sequence of L. rhamnosus GG. The analysis showed that in two of three isolates, major DNA segments were missing from the

  1. HGVA: the Human Genome Variation Archive.

    Science.gov (United States)

    Lopez, Javier; Coll, Jacobo; Haimel, Matthias; Kandasamy, Swaathi; Tarraga, Joaquin; Furio-Tari, Pedro; Bari, Wasim; Bleda, Marta; Rueda, Antonio; Gräf, Stefan; Rendon, Augusto; Dopazo, Joaquin; Medina, Ignacio

    2017-07-03

    High-profile genomic variation projects like the 1000 Genomes project or the Exome Aggregation Consortium, are generating a wealth of human genomic variation knowledge which can be used as an essential reference for identifying disease-causing genotypes. However, accessing these data, contrasting the various studies and integrating those data in downstream analyses remains cumbersome. The Human Genome Variation Archive (HGVA) tackles these challenges and facilitates access to genomic data for key reference projects in a clean, fast and integrated fashion. HGVA provides an efficient and intuitive web-interface for easy data mining, a comprehensive RESTful API and client libraries in Python, Java and JavaScript for fast programmatic access to its knowledge base. HGVA calculates population frequencies for these projects and enriches their data with variant annotation provided by CellBase, a rich and fast annotation solution. HGVA serves as a proof-of-concept of the genome analysis developments being carried out by the University of Cambridge together with UK's 100 000 genomes project and the National Institute for Health Research BioResource Rare-Diseases, in particular, deploying open-source for Computational Biology (OpenCB) software platform for storing and analyzing massive genomic datasets. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  2. Simple sequence repeats in mycobacterial genomes

    Indian Academy of Sciences (India)

    2006-12-18

    Dec 18, 2006 ... Although prokaryotic genomes derive some plasticity due to microsatellite mutations they have in-built mechanisms to arrest undue expansions of microsatellites and one such mechanism is constituted by post-replicative DNA repair enzymes MutL, MutH and MutS. The mycobacterial genomes lack these ...

  3. BGD: a database of bat genomes.

    Science.gov (United States)

    Fang, Jianfei; Wang, Xuan; Mu, Shuo; Zhang, Shuyi; Dong, Dong

    2015-01-01

    Bats account for ~20% of mammalian species, and are the only mammals with true powered flight. For the sake of their specialized phenotypic traits, many researches have been devoted to examine the evolution of bats. Until now, some whole genome sequences of bats have been assembled and annotated, however, a uniform resource for the annotated bat genomes is still unavailable. To make the extensive data associated with the bat genomes accessible to the general biological communities, we established a Bat Genome Database (BGD). BGD is an open-access, web-available portal that integrates available data of bat genomes and genes. It hosts data from six bat species, including two megabats and four microbats. Users can query the gene annotations using efficient searching engine, and it offers browsable tracks of bat genomes. Furthermore, an easy-to-use phylogenetic analysis tool was also provided to facilitate online phylogeny study of genes. To the best of our knowledge, BGD is the first database of bat genomes. It will extend our understanding of the bat evolution and be advantageous to the bat sequences analysis. BGD is freely available at: http://donglab.ecnu.edu.cn/databases/BatGenome/.

  4. Complete Genome Sequence of Staphylococcus epidermidis 1457.

    Science.gov (United States)

    Galac, Madeline R; Stam, Jason; Maybank, Rosslyn; Hinkle, Mary; Mack, Dietrich; Rohde, Holger; Roth, Amanda L; Fey, Paul D

    2017-06-01

    Staphylococcus epidermidis 1457 is a frequently utilized strain that is amenable to genetic manipulation and has been widely used for biofilm-related research. We report here the whole-genome sequence of this strain, which encodes 2,277 protein-coding genes and 81 RNAs within its 2.4-Mb genome and plasmid. Copyright © 2017 Galac et al.

  5. Inferences from Genomic Models in Stratified Populations

    DEFF Research Database (Denmark)

    Janss, Luc; de los Campos, Gustavo; Sheehan, Nuala

    2012-01-01

    Unaccounted population stratification can lead to spurious associations in genome-wide association studies (GWAS) and in this context several methods have been proposed to deal with this problem. An alternative line of research uses whole-genome random regression (WGRR) models that fit all marker...

  6. Comparison of 61 Sequenced Escherichia coli Genomes

    DEFF Research Database (Denmark)

    Lukjancenko, Oksana; Wassenaar, T. M.; Ussery, David

    2010-01-01

    Escherichia coli is an important component of the biosphere and is an ideal model for studies of processes involved in bacterial genome evolution. Sixty-one publically available E. coli and Shigella spp. sequenced genomes are compared, using basic methods to produce phylogenetic and proteomics...

  7. BGD: a database of bat genomes.

    Directory of Open Access Journals (Sweden)

    Jianfei Fang

    Full Text Available Bats account for ~20% of mammalian species, and are the only mammals with true powered flight. For the sake of their specialized phenotypic traits, many researches have been devoted to examine the evolution of bats. Until now, some whole genome sequences of bats have been assembled and annotated, however, a uniform resource for the annotated bat genomes is still unavailable. To make the extensive data associated with the bat genomes accessible to the general biological communities, we established a Bat Genome Database (BGD. BGD is an open-access, web-available portal that integrates available data of bat genomes and genes. It hosts data from six bat species, including two megabats and four microbats. Users can query the gene annotations using efficient searching engine, and it offers browsable tracks of bat genomes. Furthermore, an easy-to-use phylogenetic analysis tool was also provided to facilitate online phylogeny study of genes. To the best of our knowledge, BGD is the first database of bat genomes. It will extend our understanding of the bat evolution and be advantageous to the bat sequences analysis. BGD is freely available at: http://donglab.ecnu.edu.cn/databases/BatGenome/.

  8. Genomics approaches in the understanding of Entamoeba ...

    African Journals Online (AJOL)

    Entamoeba histolytica is the intestinal protozoan parasite responsible for amebic colitis and liver abscesses, which cause mortality in many developing countries. The sequencing of the parasite genome provides new insights into the cellular workings and genome evolution of this major human pathogen. Here, we reviewed ...

  9. Genomic Approaches in Marine Biodiversity and Aquaculture

    Directory of Open Access Journals (Sweden)

    Jorge A Huete-Pérez

    2013-01-01

    Full Text Available Recent advances in genomic and post-genomic technologies have now established the new standard in medical and biotechnological research. The introduction of next-generation sequencing, NGS,has resulted in the generation of thousands of genomes from all domains of life, including the genomes of complex uncultured microbial communities revealed through metagenomics. Although the application of genomics to marine biodiversity remains poorly developed overall, some noteworthy progress has been made in recent years. The genomes of various model marine organisms have been published and a few more are underway. In addition, the recent large-scale analysis of marine microbes, along with transcriptomic and proteomic approaches to the study of teleost fishes, mollusks and crustaceans, to mention a few, has provided a better understanding of phenotypic variability and functional genomics. The past few years have also seen advances in applications relevant to marine aquaculture and fisheries. In this review we introduce several examples of recent discoveries and progress made towards engendering genomic resources aimed at enhancing our understanding of marine biodiversity and promoting the development of aquaculture. Finally, we discuss the need for auspicious science policies to address challenges confronting smaller nations in the appropriate oversight of this growing domain as they strive to guarantee food security and conservation of their natural resources.

  10. GENOME ANALYSIS OF BURKHOLDERIA CEPACIA AC1100

    Science.gov (United States)

    Burkholderia cepacia is an important organism in bioremediation of environmental pollutants and it is also of increasing interest as a human pathogen. The genomic organization of B. cepacia is being studied in order to better understand its unusual adaptive capacity and genome pl...

  11. Feast and famine in plant genomes.

    Science.gov (United States)

    Jonathan F. Wendel; Richard C. Cronn; J. Spencer Jonhston; H. James. Price

    2002-01-01

    Plant genomes vary over several orders of magnitude in size, even among closely related species, yet the origin, genesis and significance of this variation are not clear. Because DNA content varies over a sevenfold range among diploid species in the cotton genus (Gossypium) and its allies, this group offers opportunities for exploring patterns and mechanisms of genome...

  12. Genome bioinformatics of tomato and potato

    NARCIS (Netherlands)

    Datema, E.

    2011-01-01

    In the past two decades genome sequencing has developed from a laborious and costly technology employed by large international consortia to a widely used, automated and affordable tool used worldwide by many individual research groups. Genome sequences of many food animals and crop plants have

  13. Fungal biology: compiling genomes and exploiting them

    Energy Technology Data Exchange (ETDEWEB)

    Labbe, Jessy L [ORNL; Uehling, Jessie K [ORNL; Payen, Thibaut [INRA; Plett, Jonathan [University of Western Sydney, Australia

    2014-01-01

    The last 10 years have seen the cost of sequencing complete genomes decrease at an incredible speed. This has led to an increase in the number of genomes sequenced in all the fungal tree of life as well as a wide variety of plant genomes. The increase in sequencing has permitted us to study the evolution of organisms on a genomic scale. A number of talks during the conference discussed the importance of transposable elements (TEs) that are present in almost all species of fungi. These TEs represent an especially large percentage of genomic space in fungi that interact with plants. Thierry Rouxel (INRA, Nancy, France) showed the link between speciation in the Leptosphaeria complex and the expansion of TE families. For example in the Leptosphaeria complex, one species associated with oilseed rape has experienced a recent and massive burst of movement by a few TE families. The alterations caused by these TEs took place in discrete regions of the genome leading to shuffling of the genomic landscape and the appearance of genes specific to the species, such as effectors useful for the interactions with a particular plant (Rouxel et al., 2011). Other presentations showed the importance of TEs in affecting genome organization. For example, in Amanita different species appear to have been invaded by different TE families (Veneault-Fourrey & Martin, 2011).

  14. Fungal genomics: forensic evidence of sexual activity.

    Science.gov (United States)

    Gow, Neil A R

    2005-07-12

    The genome sequence of the 'asexual' human pathogenic fungus Aspergillus fumigatus suggests it has the capability to undergo mating and meiosis. That this organism engages in clandestine sexual activity is also suggested by observations of two equally distributed complementary mating types in nature, the expression of mating type genes and evidence of recent genome recombination events.

  15. Pichia stipitis genomics, transcriptomics, and gene clusters

    Science.gov (United States)

    Thomas W. Jeffries; Jennifer R. Headman Van Vleet

    2009-01-01

    Genome sequencing and subsequent global gene expression studies have advanced our understanding of the lignocellulose-fermenting yeast Pichia stipitis. These studies have provided an insight into its central carbon metabolism, and analysis of its genome has revealed numerous functional gene clusters and tandem repeats. Specialized physiological traits are often the...

  16. Genomics in health and disease | Shawky | Egyptian Journal of ...

    African Journals Online (AJOL)

    Genomics is the study of all person's genes including interactions of those genes ... Our environment and our biology are two factors that strongly influence our health. ... The completion of the Human Genome Project signaled that the genome ...

  17. The reach of the genome signature in prokaryotes

    NARCIS (Netherlands)

    van Passel, M.W.J.; Kuramae, E.E.; Luyf, A.C.M.; Bart, A.; Boekhout, T.

    2006-01-01

    Background: With the increased availability of sequenced genomes there have been several initiatives to infer evolutionary relationships by whole genome characteristics. One of these studies suggested good congruence between genome synteny, shared gene content, 16S ribosomal DNA identity, codon

  18. Genome Sequences of Marine Shrimp Exopalaemon carinicauda Holthuis Provide Insights into Genome Size Evolution of Caridea.

    Science.gov (United States)

    Yuan, Jianbo; Gao, Yi; Zhang, Xiaojun; Wei, Jiankai; Liu, Chengzhang; Li, Fuhua; Xiang, Jianhai

    2017-07-05

    Crustacea, particularly Decapoda, contains many economically important species, such as shrimps and crabs. Crustaceans exhibit enormous (nearly 500-fold) variability in genome size. However, limited genome resources are available for investigating these species. Exopalaemon carinicauda Holthuis, an economical caridean shrimp, is a potential ideal experimental animal for research on crustaceans. In this study, we performed low-coverage sequencing and de novo assembly of the E. carinicauda genome. The assembly covers more than 95% of coding regions. E. carinicauda possesses a large complex genome (5.73 Gb), with size twice higher than those of many decapod shrimps. As such, comparative genomic analyses were implied to investigate factors affecting genome size evolution of decapods. However, clues associated with genome duplication were not identified, and few horizontally transferred sequences were detected. Ultimately, the burst of transposable elements, especially retrotransposons, was determined as the major factor influencing genome expansion. A total of 2 Gb repeats were identified, and RTE-BovB, Jockey, Gypsy, and DIRS were the four major retrotransposons that significantly expanded. Both recent (Jockey and Gypsy) and ancestral (DIRS) originated retrotransposons responsible for the genome evolution. The E. carinicauda genome also exhibited potential for the genomic and experimental research of shrimps.

  19. Genome engineering for microbial natural product discovery.

    Science.gov (United States)

    Choi, Si-Sun; Katsuyama, Yohei; Bai, Linquan; Deng, Zixin; Ohnishi, Yasuo; Kim, Eung-Soo

    2018-03-03

    The discovery and development of microbial natural products (MNPs) have played pivotal roles in the fields of human medicine and its related biotechnology sectors over the past several decades. The post-genomic era has witnessed the development of microbial genome mining approaches to isolate previously unsuspected MNP biosynthetic gene clusters (BGCs) hidden in the genome, followed by various BGC awakening techniques to visualize compound production. Additional microbial genome engineering techniques have allowed higher MNP production titers, which could complement a traditional culture-based MNP chasing approach. Here, we describe recent developments in the MNP research paradigm, including microbial genome mining, NP BGC activation, and NP overproducing cell factory design. Copyright © 2018 Elsevier Ltd. All rights reserved.

  20. Genomic definition of species. Revision 2

    Energy Technology Data Exchange (ETDEWEB)

    Crkvenjakov, R.; Drmanac, R.

    1993-03-01

    A genome is the sum total of the DNA sequences in the cells of an individual organism. The common usage that species possess genomes comes naturally to biochemists, who have shown that all protein and nucleic acid molecules are at the same time species- and individual-specific, with minor individual variations being superimposed on a consensus sequence that is constant for a species. By extension, this property is attributed to the common features of DNA in the chromosomes of members of a given species and is called species genome. Our proposal for the definition of a biological species is as follows: A species comprises a group of actual and potential biological organisms built according to a unique genome program that is recorded, and at least in part expressed, in the structures of their genomic nucleic acid molecule(s), having intragroup sequence differences which can be fully interconverted in the process of organismal reproduction.

  1. Radiotaxons and reliability of a genome

    International Nuclear Information System (INIS)

    Korogodin, V.I.

    1982-01-01

    Radiosensitivity of cells (D 0 ) is considered with regard to the structural organization of the genome. The following terms are introduced: ''karyotaxon'', organisms with identical structural organization of the genome, and ''specific genome stability'' K=D 0 C, where C is the quantity of DNA in the cell nucleus; K is the amount of energy (eV) the sorption of which in DNA is necessary and sufficient for one elementary damage to occur. It was shown that Ksub(i)=const. within every karyotaxon ''i''. K 1 =100 eV for viruses, and K 4 =61000 eV for the highest level of genome organization (diploid eukaryotes including man). Potential mechanisms of increasing Ksub(i) with increasing level of genome organization and the role of this factor in evolution are discussed [ru

  2. Punctuated evolution of prostate cancer genomes.

    Science.gov (United States)

    Baca, Sylvan C; Prandi, Davide; Lawrence, Michael S; Mosquera, Juan Miguel; Romanel, Alessandro; Drier, Yotam; Park, Kyung; Kitabayashi, Naoki; MacDonald, Theresa Y; Ghandi, Mahmoud; Van Allen, Eliezer; Kryukov, Gregory V; Sboner, Andrea; Theurillat, Jean-Philippe; Soong, T David; Nickerson, Elizabeth; Auclair, Daniel; Tewari, Ashutosh; Beltran, Himisha; Onofrio, Robert C; Boysen, Gunther; Guiducci, Candace; Barbieri, Christopher E; Cibulskis, Kristian; Sivachenko, Andrey; Carter, Scott L; Saksena, Gordon; Voet, Douglas; Ramos, Alex H; Winckler, Wendy; Cipicchio, Michelle; Ardlie, Kristin; Kantoff, Philip W; Berger, Michael F; Gabriel, Stacey B; Golub, Todd R; Meyerson, Matthew; Lander, Eric S; Elemento, Olivier; Getz, Gad; Demichelis, Francesca; Rubin, Mark A; Garraway, Levi A

    2013-04-25

    The analysis of exonic DNA from prostate cancers has identified recurrently mutated genes, but the spectrum of genome-wide alterations has not been profiled extensively in this disease. We sequenced the genomes of 57 prostate tumors and matched normal tissues to characterize somatic alterations and to study how they accumulate during oncogenesis and progression. By modeling the genesis of genomic rearrangements, we identified abundant DNA translocations and deletions that arise in a highly interdependent manner. This phenomenon, which we term "chromoplexy," frequently accounts for the dysregulation of prostate cancer genes and appears to disrupt multiple cancer genes coordinately. Our modeling suggests that chromoplexy may induce considerable genomic derangement over relatively few events in prostate cancer and other neoplasms, supporting a model of punctuated cancer evolution. By characterizing the clonal hierarchy of genomic lesions in prostate tumors, we charted a path of oncogenic events along which chromoplexy may drive prostate carcinogenesis. Copyright © 2013 Elsevier Inc. All rights reserved.

  3. Transcription as a Threat to Genome Integrity.

    Science.gov (United States)

    Gaillard, Hélène; Aguilera, Andrés

    2016-06-02

    Genomes undergo different types of sporadic alterations, including DNA damage, point mutations, and genome rearrangements, that constitute the basis for evolution. However, these changes may occur at high levels as a result of cell pathology and trigger genome instability, a hallmark of cancer and a number of genetic diseases. In the last two decades, evidence has accumulated that transcription constitutes an important natural source of DNA metabolic errors that can compromise the integrity of the genome. Transcription can create the conditions for high levels of mutations and recombination by its ability to open the DNA structure and remodel chromatin, making it more accessible to DNA insulting agents, and by its ability to become a barrier to DNA replication. Here we review the molecular basis of such events from a mechanistic perspective with particular emphasis on the role of transcription as a genome instability determinant.

  4. Quantifying Temporal Genomic Erosion in Endangered Species.

    Science.gov (United States)

    Díez-Del-Molino, David; Sánchez-Barreiro, Fatima; Barnes, Ian; Gilbert, M Thomas P; Dalén, Love

    2018-03-01

    Many species have undergone dramatic population size declines over the past centuries. Although stochastic genetic processes during and after such declines are thought to elevate the risk of extinction, comparative analyses of genomic data from several endangered species suggest little concordance between genome-wide diversity and current population sizes. This is likely because species-specific life-history traits and ancient bottlenecks overshadow the genetic effect of recent demographic declines. Therefore, we advocate that temporal sampling of genomic data provides a more accurate approach to quantify genetic threats in endangered species. Specifically, genomic data from predecline museum specimens will provide valuable baseline data that enable accurate estimation of recent decreases in genome-wide diversity, increases in inbreeding levels, and accumulation of deleterious genetic variation. Copyright © 2017 Elsevier Ltd. All rights reserved.

  5. WormBase: Annotating many nematode genomes.

    Science.gov (United States)

    Howe, Kevin; Davis, Paul; Paulini, Michael; Tuli, Mary Ann; Williams, Gary; Yook, Karen; Durbin, Richard; Kersey, Paul; Sternberg, Paul W

    2012-01-01

    WormBase (www.wormbase.org) has been serving the scientific community for over 11 years as the central repository for genomic and genetic information for the soil nematode Caenorhabditis elegans. The resource has evolved from its beginnings as a database housing the genomic sequence and genetic and physical maps of a single species, and now represents the breadth and diversity of nematode research, currently serving genome sequence and annotation for around 20 nematodes. In this article, we focus on WormBase's role of genome sequence annotation, describing how we annotate and integrate data from a growing collection of nematode species and strains. We also review our approaches to sequence curation, and discuss the impact on annotation quality of large functional genomics projects such as modENCODE.

  6. The Arabidopsis lyrata genome sequence and the basis of rapid genome size change

    Energy Technology Data Exchange (ETDEWEB)

    Hu, Tina T.; Pattyn, Pedro; Bakker, Erica G.; Cao, Jun; Cheng, Jan-Fang; Clark, Richard M.; Fahlgren, Noah; Fawcett, Jeffrey A.; Grimwood, Jane; Gundlach, Heidrun; Haberer, Georg; Hollister, Jesse D.; Ossowski, Stephan; Ottilar, Robert P.; Salamov, Asaf A.; Schneeberger, Korbinian; Spannagl, Manuel; Wang, Xi; Yang, Liang; Nasrallah, Mikhail E.; Bergelson, Joy; Carrington, James C.; Gaut, Brandon S.; Schmutz, Jeremy; Mayer, Klaus F. X.; Van de Peer, Yves; Grigoriev, Igor V.; Nordborg, Magnus; Weigel, Detlef; Guo, Ya-Long

    2011-04-29

    In our manuscript, we present a high-quality genome sequence of the Arabidopsis thaliana relative, Arabidopsis lyrata, produced by dideoxy sequencing. We have performed the usual types of genome analysis (gene annotation, dN/dS studies etc. etc.), but this is relegated to the Supporting Information. Instead, we focus on what was a major motivation for sequencing this genome, namely to understand how A. thaliana lost half its genome in a few million years and lived to tell the tale. The rather surprising conclusion is that there is not a single genomic feature that accounts for the reduced genome, but that every aspect centromeres, intergenic regions, transposable elements, gene family number is affected through hundreds of thousands of cuts. This strongly suggests that overall genome size in itself is what has been under selection, a suggestion that is strongly supported by our demonstration (using population genetics data from A. thaliana) that new deletions seem to be driven to fixation.

  7. Correction for Measurement Error from Genotyping-by-Sequencing in Genomic Variance and Genomic Prediction Models

    DEFF Research Database (Denmark)

    Ashraf, Bilal; Janss, Luc; Jensen, Just

    sample). The GBSeq data can be used directly in genomic models in the form of individual SNP allele-frequency estimates (e.g., reference reads/total reads per polymorphic site per individual), but is subject to measurement error due to the low sequencing depth per individual. Due to technical reasons....... In the current work we show how the correction for measurement error in GBSeq can also be applied in whole genome genomic variance and genomic prediction models. Bayesian whole-genome random regression models are proposed to allow implementation of large-scale SNP-based models with a per-SNP correction...... for measurement error. We show correct retrieval of genomic explained variance, and improved genomic prediction when accounting for the measurement error in GBSeq data...

  8. Rapid detection of structural variation in a human genome using nanochannel-based genome mapping technology

    DEFF Research Database (Denmark)

    Cao, Hongzhi; Hastie, Alex R.; Cao, Dandan

    2014-01-01

    mutations; however, none of the current detection methods are comprehensive, and currently available methodologies are incapable of providing sufficient resolution and unambiguous information across complex regions in the human genome. To address these challenges, we applied a high-throughput, cost......-effective genome mapping technology to comprehensively discover genome-wide SVs and characterize complex regions of the YH genome using long single molecules (>150 kb) in a global fashion. RESULTS: Utilizing nanochannel-based genome mapping technology, we obtained 708 insertions/deletions and 17 inversions larger...... fosmid data. Of the remaining 270 SVs, 260 are insertions and 213 overlap known SVs in the Database of Genomic Variants. Overall, 609 out of 666 (90%) variants were supported by experimental orthogonal methods or historical evidence in public databases. At the same time, genome mapping also provides...

  9. Detection of genomic rearrangements in cucumber using genomecmp software

    Science.gov (United States)

    Kulawik, Maciej; Pawełkowicz, Magdalena Ewa; Wojcieszek, Michał; PlÄ der, Wojciech; Nowak, Robert M.

    2017-08-01

    Comparative genomic by increasing information about the genomes sequences available in the databases is a rapidly evolving science. A simple comparison of the general features of genomes such as genome size, number of genes, and chromosome number presents an entry point into comparative genomic analysis. Here we present the utility of the new tool genomecmp for finding rearrangements across the compared sequences and applications in plant comparative genomics.

  10. Chloroplast Genome Evolution in Early Diverged Leptosporangiate Ferns

    OpenAIRE

    Kim, Hyoung Tae; Chung, Myong Gi; Kim, Ki-Joong

    2014-01-01

    In this study, the chloroplast (cp) genome sequences from three early diverged leptosporangiate ferns were completed and analyzed in order to understand the evolution of the genome of the fern lineages. The complete cp genome sequence of Osmunda cinnamomea (Osmundales) was 142,812 base pairs (bp). The cp genome structure was similar to that of eusporangiate ferns. The gene/intron losses that frequently occurred in the cp genome of leptosporangiate ferns were not found in the cp genome of O. c...

  11. Big Data Analysis of Human Genome Variations

    KAUST Repository

    Gojobori, Takashi

    2016-01-25

    Since the human genome draft sequence was in public for the first time in 2000, genomic analyses have been intensively extended to the population level. The following three international projects are good examples for large-scale studies of human genome variations: 1) HapMap Data (1,417 individuals) (http://hapmap.ncbi.nlm.nih.gov/downloads/genotypes/2010-08_phaseII+III/forward/), 2) HGDP (Human Genome Diversity Project) Data (940 individuals) (http://www.hagsc.org/hgdp/files.html), 3) 1000 genomes Data (2,504 individuals) http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/ If we can integrate all three data into a single volume of data, we should be able to conduct a more detailed analysis of human genome variations for a total number of 4,861 individuals (= 1,417+940+2,504 individuals). In fact, we successfully integrated these three data sets by use of information on the reference human genome sequence, and we conducted the big data analysis. In particular, we constructed a phylogenetic tree of about 5,000 human individuals at the genome level. As a result, we were able to identify clusters of ethnic groups, with detectable admixture, that were not possible by an analysis of each of the three data sets. Here, we report the outcome of this kind of big data analyses and discuss evolutionary significance of human genomic variations. Note that the present study was conducted in collaboration with Katsuhiko Mineta and Kosuke Goto at KAUST.

  12. The complete mitochondrial genome of Gossypium hirsutum and evolutionary analysis of higher plant mitochondrial genomes.

    Science.gov (United States)

    Liu, Guozheng; Cao, Dandan; Li, Shuangshuang; Su, Aiguo; Geng, Jianing; Grover, Corrinne E; Hu, Songnian; Hua, Jinping

    2013-01-01

    Mitochondria are the main manufacturers of cellular ATP in eukaryotes. The plant mitochondrial genome contains large number of foreign DNA and repeated sequences undergone frequently intramolecular recombination. Upland Cotton (Gossypium hirsutum L.) is one of the main natural fiber crops and also an important oil-producing plant in the world. Sequencing of the cotton mitochondrial (mt) genome could be helpful for the evolution research of plant mt genomes. We utilized 454 technology for sequencing and combined with Fosmid library of the Gossypium hirsutum mt genome screening and positive clones sequencing and conducted a series of evolutionary analysis on Cycas taitungensis and 24 angiosperms mt genomes. After data assembling and contigs joining, the complete mitochondrial genome sequence of G. hirsutum was obtained. The completed G.hirsutum mt genome is 621,884 bp in length, and contained 68 genes, including 35 protein genes, four rRNA genes and 29 tRNA genes. Five gene clusters are found conserved in all plant mt genomes; one and four clusters are specifically conserved in monocots and dicots, respectively. Homologous sequences are distributed along the plant mt genomes and species closely related share the most homologous sequences. For species that have both mt and chloroplast genome sequences available, we checked the location of cp-like migration and found several fragments closely linked with mitochondrial genes. The G. hirsutum mt genome possesses most of the common characters of higher plant mt genomes. The existence of syntenic gene clusters, as well as the conservation of some intergenic sequences and genic content among the plant mt genomes suggest that evolution of mt genomes is consistent with plant taxonomy but independent among different species.

  13. Zygotic Genome Activation in Vertebrates.

    Science.gov (United States)

    Jukam, David; Shariati, S Ali M; Skotheim, Jan M

    2017-08-21

    The first major developmental transition in vertebrate embryos is the maternal-to-zygotic transition (MZT) when maternal mRNAs are degraded and zygotic transcription begins. During the MZT, the embryo takes charge of gene expression to control cell differentiation and further development. This spectacular organismal transition requires nuclear reprogramming and the initiation of RNAPII at thousands of promoters. Zygotic genome activation (ZGA) is mechanistically coordinated with other embryonic events, including changes in the cell cycle, chromatin state, and nuclear-to-cytoplasmic component ratios. Here, we review progress in understanding vertebrate ZGA dynamics in frogs, fish, mice, and humans to explore differences and emphasize common features. Copyright © 2017 Elsevier Inc. All rights reserved.

  14. The year of encroaching genomics

    Directory of Open Access Journals (Sweden)

    MG Manfredi-Romanini

    2009-12-01

    Full Text Available The year 2000 has been called “The year of the genome”. Perhaps better would have been to call it “The year of encroaching genomics” (Einarson and Golemis, 2000 intending by this vivid term the torrents of information pouring out of multiorganizational genome sequence projects, large research groups, consortia and industries into the field of active biological research. This is bound to have a strong impact on how research strategies will be designed from now on and it is our expectation that the year 2001 will be the year in which the small academic laboratories now at the forefront of research will have to cope with this new situation if they are to remain at the cutting edge. They will have to show granting institutions how their new research project proposals involve collaboration with large scientific groups and industries such as those mentioned above in order to win financial backing.

  15. The evolution of genome size in ants

    Directory of Open Access Journals (Sweden)

    Spagna Joseph C

    2008-02-01

    Full Text Available Abstract Background Despite the economic and ecological importance of ants, genomic tools for this family (Formicidae remain woefully scarce. Knowledge of genome size, for example, is a useful and necessary prerequisite for the development of many genomic resources, yet it has been reported for only one ant species (Solenopsis invicta, and the two published estimates for this species differ by 146.7 Mb (0.15 pg. Results Here, we report the genome size for 40 species of ants distributed across 10 of the 20 currently recognized subfamilies, thus making Formicidae the 4th most surveyed insect family and elevating the Hymenoptera to the 5th most surveyed insect order. Our analysis spans much of the ant phylogeny, from the less derived Amblyoponinae and Ponerinae to the more derived Myrmicinae, Formicinae and Dolichoderinae. We include a number of interesting and important taxa, including the invasive Argentine ant (Linepithema humile, Neotropical army ants (genera Eciton and Labidus, trapjaw ants (Odontomachus, fungus-growing ants (Apterostigma, Atta and Sericomyrmex, harvester ants (Messor, Pheidole and Pogonomyrmex, carpenter ants (Camponotus, a fire ant (Solenopsis, and a bulldog ant (Myrmecia. Our results show that ants possess small genomes relative to most other insects, yet genome size varies three-fold across this insect family. Moreover, our data suggest that two whole-genome duplications may have occurred in the ancestors of the modern Ectatomma and Apterostigma. Although some previous studies of other taxa have revealed a relationship between genome size and body size, our phylogenetically-controlled analysis of this correlation did not reveal a significant relationship. Conclusion This is the first analysis of genome size in ants (Formicidae and the first across multiple species of social insects. We show that genome size is a variable trait that can evolve gradually over long time spans, as well as rapidly, through processes that may

  16. Origins of the Human Genome Project.

    Science.gov (United States)

    Watson, J D; Cook-Deegan, R M

    1991-01-01

    The Human Genome Project has become a reality. Building on a debate that dates back to 1985, several genome projects are now in full stride around the world, and more are likely to form in the next several years. Italy began its genome program in 1987, and the United Kingdom and U.S.S.R. in 1988. The European communities mounted several genome projects on yeast, bacteria, Drosophila, and Arabidospis thaliana (a rapidly growing plant with a small genome) in 1988, and in 1990 commenced a new 2-year program on the human genome. In the United States, we have completed the first year of operation of the National Center for Human Genome Research at the National Institutes of Health (NIH), now the largest single funding source for genome research in the world. There have been dedicated budgets focused on genome-scale research at NIH, the U.S. Department of Energy, and the Howard Hughes Medical Institute for several years, and results are beginning to accumulate. There were three annual meetings on genome mapping and sequencing at Cold Spring Harbor, New York, in the spring of 1988, 1989, and 1990; the talks have shifted from a discussion about how to approach problems to presenting results from experiments already performed. We have finally begun to work rather than merely talk. The purpose of genome projects is to assemble data on the structure of DNA in human chromosomes and those of other organisms. A second goal is to develop new technologies to perform mapping and sequencing. There have been impressive technical advances in the past 5 years since the debate about the human genome project began. We are on the verge of beginning pilot projects to test several approaches to sequencing long stretches of DNA, using both automation and manual methods. Ordered sets of yeast artificial chromosome and cosmid clones have been assembled to span more than 2 million base pairs of several human chromosomes, and a region of 10 million base pairs has been assembled for

  17. Genomics Research: World Survey of Public Funding

    Directory of Open Access Journals (Sweden)

    Cook-Deegan Robert M

    2008-10-01

    Full Text Available Abstract Background Over the past two decades, genomics has evolved as a scientific research discipline. Genomics research was fueled initially by government and nonprofit funding sources, later augmented by private research and development (R&D funding. Citizens and taxpayers of many countries have funded much of the research, and have expectations about access to the resulting information and knowledge. While access to knowledge gained from all publicly funded research is desired, access is especially important for fields that have broad social impact and stimulate public dialogue. Genomics is one such field, where public concerns are raised for reasons such as health care and insurance implications, as well as personal and ancestral identification. Thus, genomics has grown rapidly as a field, and attracts considerable interest. Results One way to study the growth of a field of research is to examine its funding. This study focuses on public funding of genomics research, identifying and collecting data from major government and nonprofit organizations around the world, and updating previous estimates of world genomics research funding, including information about geographical origins. We initially identified 89 publicly funded organizations; we requested information about each organization's funding of genomics research. Of these organizations, 48 responded and 34 reported genomics research expenditures (of those that responded but did not supply information, some did not fund such research, others could not quantify it. The figures reported here include all the largest funders and we estimate that we have accounted for most of the genomics research funding from government and nonprofit sources. Conclusion Aggregate spending on genomics research from 34 funding sources averaged around $2.9 billion in 2003 – 2006. The United States spent more than any other country on genomics research, corresponding to 35% of the overall worldwide public

  18. GAAP: Genome-organization-framework-Assisted Assembly Pipeline for prokaryotic genomes.

    Science.gov (United States)

    Yuan, Lina; Yu, Yang; Zhu, Yanmin; Li, Yulai; Li, Changqing; Li, Rujiao; Ma, Qin; Siu, Gilman Kit-Hang; Yu, Jun; Jiang, Taijiao; Xiao, Jingfa; Kang, Yu

    2017-01-25

    Next-generation sequencing (NGS) technologies have greatly promoted the genomic study of prokaryotes. However, highly fragmented assemblies due to short reads from NGS are still a limiting factor in gaining insights into the genome biology. Reference-assisted tools are promising in genome assembly, but tend to result in false assembly when the assigned reference has extensive rearrangements. Herein, we present GAAP, a genome assembly pipeline for scaffolding based on core-gene-defined Genome Organizational Framework (cGOF) described in our previous study. Instead of assigning references, we use the multiple-reference-derived cGOFs as indexes to assist in order and orientation of the scaffolds and build a skeleton structure, and then use read pairs to extend scaffolds, called local scaffolding, and distinguish between true and chimeric adjacencies in the scaffolds. In our performance tests using both empirical and simulated data of 15 genomes in six species with diverse genome size, complexity, and all three categories of cGOFs, GAAP outcompetes or achieves comparable results when compared to three other reference-assisted programs, AlignGraph, Ragout and MeDuSa. GAAP uses both cGOF and pair-end reads to create assemblies in genomic scale, and performs better than the currently available reference-assisted assembly tools as it recovers more assemblies and makes fewer false locations, especially for species with extensive rearranged genomes. Our method is a promising solution for reconstruction of genome sequence from short reads of NGS.

  19. Enhanced annotations and features for comparing thousands of Pseudomonas genomes in the Pseudomonas genome database.

    Science.gov (United States)

    Winsor, Geoffrey L; Griffiths, Emma J; Lo, Raymond; Dhillon, Bhavjinder K; Shay, Julie A; Brinkman, Fiona S L

    2016-01-04

    The Pseudomonas Genome Database (http://www.pseudomonas.com) is well known for the application of community-based annotation approaches for producing a high-quality Pseudomonas aeruginosa PAO1 genome annotation, and facilitating whole-genome comparative analyses with other Pseudomonas strains. To aid analysis of potentially thousands of complete and draft genome assemblies, this database and analysis platform was upgraded to integrate curated genome annotations and isolate metadata with enhanced tools for larger scale comparative analysis and visualization. Manually curated gene annotations are supplemented with improved computational analyses that help identify putative drug targets and vaccine candidates or assist with evolutionary studies by identifying orthologs, pathogen-associated genes and genomic islands. The database schema has been updated to integrate isolate metadata that will facilitate more powerful analysis of genomes across datasets in the future. We continue to place an emphasis on providing high-quality updates to gene annotations through regular review of the scientific literature and using community-based approaches including a major new Pseudomonas community initiative for the assignment of high-quality gene ontology terms to genes. As we further expand from thousands of genomes, we plan to provide enhancements that will aid data visualization and analysis arising from whole-genome comparative studies including more pan-genome and population-based approaches. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  20. i-Genome: A database to summarize oligonucleotide data in genomes

    Directory of Open Access Journals (Sweden)

    Chang Yu-Chung

    2004-10-01

    Full Text Available Abstract Background Information on the occurrence of sequence features in genomes is crucial to comparative genomics, evolutionary analysis, the analyses of regulatory sequences and the quantitative evaluation of sequences. Computing the frequencies and the occurrences of a pattern in complete genomes is time-consuming. Results The proposed database provides information about sequence features generated by exhaustively computing the sequences of the complete genome. The repetitive elements in the eukaryotic genomes, such as LINEs, SINEs, Alu and LTR, are obtained from Repbase. The database supports various complete genomes including human, yeast, worm, and 128 microbial genomes. Conclusions This investigation presents and implements an efficiently computational approach to accumulate the occurrences of the oligonucleotides or patterns in complete genomes. A database is established to maintain the information of the sequence features, including the distributions of oligonucleotide, the gene distribution, the distribution of repetitive elements in genomes and the occurrences of the oligonucleotides. The database can provide more effective and efficient way to access the repetitive features in genomes.

  1. Genomic Resource and Genome Guided Comparison of Twenty Type Strains of the Genus Methylobacterium

    Directory of Open Access Journals (Sweden)

    Vasvi Chaudhry

    2017-12-01

    Full Text Available Bacteria of the genus Methylobacterium are widespread in diverse habitats ranging from soil, water and plant (phyllosphere, rhizosphere and endosphere. In the present study, we in house generated genomic data resource of six type strains along with fourteen database genomes of the Methylobacterium genus to carry out phylogenomic, taxonomic, comparative and ecological studies of this genus. Overall, the genus shows high diversity and genetic variation primarily due to its ability to acquire genetic material from diverse sources through horizontal gene transfer. As majority of species identified in this study are plant associated with their genomes equipped with methylotrophy and photosynthesis related gene along with genes for plant probiotic traits. Most of the species genomes are equipped with genes for adaptation and defense for UV radiation, oxidative stress and desiccation. The genus has an open pan-genome and we predicted the role of gain/loss of prophages and CRISPR elements in diversity and evolution. Our genomic resource with annotation and analysis provides a platform for interspecies genomic comparisons in the genus Methylobacterium, and to unravel their natural genome diversity and to study how natural selection shapes their genome with the adaptive mechanisms which allow them to acquire diverse habitat lifestyles. This type strains genomic data display power of Next Generation Sequencing in rapidly creating resource paving the way for studies on phylogeny and taxonomy as well as for basic and applied research for this important genus.

  2. Using Partial Genomic Fosmid Libraries for Sequencing CompleteOrganellar Genomes

    Energy Technology Data Exchange (ETDEWEB)

    McNeal, Joel R.; Leebens-Mack, James H.; Arumuganathan, K.; Kuehl, Jennifer V.; Boore, Jeffrey L.; dePamphilis, Claude W.

    2005-08-26

    Organellar genome sequences provide numerous phylogenetic markers and yield insight into organellar function and molecular evolution. These genomes are much smaller in size than their nuclear counterparts; thus, their complete sequencing is much less expensive than total nuclear genome sequencing, making broader phylogenetic sampling feasible. However, for some organisms it is challenging to isolate plastid DNA for sequencing using standard methods. To overcome these difficulties, we constructed partial genomic libraries from total DNA preparations of two heterotrophic and two autotrophic angiosperm species using fosmid vectors. We then used macroarray screening to isolate clones containing large fragments of plastid DNA. A minimum tiling path of clones comprising the entire genome sequence of each plastid was selected, and these clones were shotgun-sequenced and assembled into complete genomes. Although this method worked well for both heterotrophic and autotrophic plants, nuclear genome size had a dramatic effect on the proportion of screened clones containing plastid DNA and, consequently, the overall number of clones that must be screened to ensure full plastid genome coverage. This technique makes it possible to determine complete plastid genome sequences for organisms that defy other available organellar genome sequencing methods, especially those for which limited amounts of tissue are available.

  3. Comparative Genome Analysis of Enterobacter cloacae

    Science.gov (United States)

    Liu, Wing-Yee; Wong, Chi-Fat; Chung, Karl Ming-Kar; Jiang, Jing-Wei; Leung, Frederick Chi-Ching

    2013-01-01

    The Enterobacter cloacae species includes an extremely diverse group of bacteria that are associated with plants, soil and humans. Publication of the complete genome sequence of the plant growth-promoting endophytic E. cloacae subsp. cloacae ENHKU01 provided an opportunity to perform the first comparative genome analysis between strains of this dynamic species. Examination of the pan-genome of E. cloacae showed that the conserved core genome retains the general physiological and survival genes of the species, while genomic factors in plasmids and variable regions determine the virulence of the human pathogenic E. cloacae strain; additionally, the diversity of fimbriae contributes to variation in colonization and host determination of different E. cloacae strains. Comparative genome analysis further illustrated that E. cloacae strains possess multiple mechanisms for antagonistic action against other microorganisms, which involve the production of siderophores and various antimicrobial compounds, such as bacteriocins, chitinases and antibiotic resistance proteins. The presence of Type VI secretion systems is expected to provide further fitness advantages for E. cloacae in microbial competition, thus allowing it to survive in different environments. Competition assays were performed to support our observations in genomic analysis, where E. cloacae subsp. cloacae ENHKU01 demonstrated antagonistic activities against a wide range of plant pathogenic fungal and bacterial species. PMID:24069314

  4. The Switchgrass Genome: Tools and Strategies

    Directory of Open Access Journals (Sweden)

    Michael D. Casler

    2011-11-01

    Full Text Available Switchgrass ( L. is a perennial grass species receiving significant focus as a potential bioenergy crop. In the last 5 yr the switchgrass research community has produced a genetic linkage map, an expressed sequence tag (EST database, a set of single nucleotide polymorphism (SNP markers that are distributed across the 18 linkage groups, 4x sampling of the AP13 genome in 400-bp reads, and bacterial artificial chromosome (BAC libraries containing over 200,000 clones. These studies have revealed close collinearity of the switchgrass genome with those of sorghum [ (L. Moench], rice ( L., and (L. P. Beauv. Switchgrass researchers have also developed several microarray technologies for gene expression studies. Switchgrass genomic resources will accelerate the ability of plant breeders to enhance productivity, pest resistance, and nutritional quality. Because switchgrass is a relative newcomer to the genomics world, many secrets of the switchgrass genome have yet to be revealed. To continue to efficiently explore basic and applied topics in switchgrass, it will be critical to capture and exploit the knowledge of plant geneticists and breeders on the next logical steps in the development and utilization of genomic resources for this species. To this end, the community has established a switchgrass genomics executive committee and work group ( [verified 28 Oct. 2011].

  5. Microbial genome analysis: the COG approach.

    Science.gov (United States)

    Galperin, Michael Y; Kristensen, David M; Makarova, Kira S; Wolf, Yuri I; Koonin, Eugene V

    2017-09-14

    For the past 20 years, the Clusters of Orthologous Genes (COG) database had been a popular tool for microbial genome annotation and comparative genomics. Initially created for the purpose of evolutionary classification of protein families, the COG have been used, apart from straightforward functional annotation of sequenced genomes, for such tasks as (i) unification of genome annotation in groups of related organisms; (ii) identification of missing and/or undetected genes in complete microbial genomes; (iii) analysis of genomic neighborhoods, in many cases allowing prediction of novel functional systems; (iv) analysis of metabolic pathways and prediction of alternative forms of enzymes; (v) comparison of organisms by COG functional categories; and (vi) prioritization of targets for structural and functional characterization. Here we review the principles of the COG approach and discuss its key advantages and drawbacks in microbial genome analysis. Published by Oxford University Press 2017. This work is written by US Government employees and is in the public domain in the US.

  6. Exploratory analysis of genomic segmentations with Segtools

    Directory of Open Access Journals (Sweden)

    Buske Orion J

    2011-10-01

    Full Text Available Abstract Background As genome-wide experiments and annotations become more prevalent, researchers increasingly require tools to help interpret data at this scale. Many functional genomics experiments involve partitioning the genome into labeled segments, such that segments sharing the same label exhibit one or more biochemical or functional traits. For example, a collection of ChlP-seq experiments yields a compendium of peaks, each labeled with one or more associated DNA-binding proteins. Similarly, manually or automatically generated annotations of functional genomic elements, including cis-regulatory modules and protein-coding or RNA genes, can also be summarized as genomic segmentations. Results We present a software toolkit called Segtools that simplifies and automates the exploration of genomic segmentations. The software operates as a series of interacting tools, each of which provides one mode of summarization. These various tools can be pipelined and summarized in a single HTML page. We describe the Segtools toolkit and demonstrate its use in interpreting a collection of human histone modification data sets and Plasmodium falciparum local chromatin structure data sets. Conclusions Segtools provides a convenient, powerful means of interpreting a genomic segmentation.

  7. Genome Sequence Databases (Overview): Sequencing and Assembly

    Energy Technology Data Exchange (ETDEWEB)

    Lapidus, Alla L.

    2009-01-01

    From the date its role in heredity was discovered, DNA has been generating interest among scientists from different fields of knowledge: physicists have studied the three dimensional structure of the DNA molecule, biologists tried to decode the secrets of life hidden within these long molecules, and technologists invent and improve methods of DNA analysis. The analysis of the nucleotide sequence of DNA occupies a special place among the methods developed. Thanks to the variety of sequencing technologies available, the process of decoding the sequence of genomic DNA (or whole genome sequencing) has become robust and inexpensive. Meanwhile the assembly of whole genome sequences remains a challenging task. In addition to the need to assemble millions of DNA fragments of different length (from 35 bp (Solexa) to 800 bp (Sanger)), great interest in analysis of microbial communities (metagenomes) of different complexities raises new problems and pushes some new requirements for sequence assembly tools to the forefront. The genome assembly process can be divided into two steps: draft assembly and assembly improvement (finishing). Despite the fact that automatically performed assembly (or draft assembly) is capable of covering up to 98% of the genome, in most cases, it still contains incorrectly assembled reads. The error rate of the consensus sequence produced at this stage is about 1/2000 bp. A finished genome represents the genome assembly of much higher accuracy (with no gaps or incorrectly assembled areas) and quality ({approx}1 error/10,000 bp), validated through a number of computer and laboratory experiments.

  8. Genome Sequence of the Palaeopolyploid soybean

    Energy Technology Data Exchange (ETDEWEB)

    Schmutz, Jeremy; Cannon, Steven B.; Schlueter, Jessica; Ma, Jianxin; Mitros, Therese; Nelson, William; Hyten, David L.; Song, Qijian; Thelen, Jay J.; Cheng, Jianlin; Xu, Dong; Hellsten, Uffe; May, Gregory D.; Yu, Yeisoo; Sakura, Tetsuya; Umezawa, Taishi; Bhattacharyya, Madan K.; Sandhu, Devinder; Valliyodan, Babu; Lindquist, Erika; Peto, Myron; Grant, David; Shu, Shengqiang; Goodstein, David; Barry, Kerrie; Futrell-Griggs, Montona; Abernathy, Brian; Du, Jianchang; Tian, Zhixi; Zhu, Liucun; Gill, Navdeep; Joshi, Trupti; Libault, Marc; Sethuraman, Anand; Zhang, Xue-Cheng; Shinozaki, Kazuo; Nguyen, Henry T.; Wing, Rod A.; Cregan, Perry; Specht, James; Grimwood, Jane; Rokhsar, Dan; Stacey, Gary; Shoemaker, Randy C.; Jackson, Scott A.

    2009-08-03

    Soybean (Glycine max) is one of the most important crop plants for seed protein and oil content, and for its capacity to fix atmospheric nitrogen through symbioses with soil-borne microorganisms. We sequenced the 1.1-gigabase genome by a whole-genome shotgun approach and integrated it with physical and high-density genetic maps to create a chromosome-scale draft sequence assembly. We predict 46,430 protein-coding genes, 70percent more than Arabidopsis and similar to the poplar genome which, like soybean, is an ancient polyploid (palaeopolyploid). About 78percent of the predicted genes occur in chromosome ends, which comprise less than one-half of the genome but account for nearly all of the genetic recombination. Genome duplications occurred at approximately 59 and 13 million years ago, resulting in a highly duplicated genome with nearly 75percent of the genes present in multiple copies. The two duplication events were followed by gene diversification and loss, and numerous chromosome rearrangements. An accurate soybean genome sequence will facilitate the identification of the genetic basis of many soybean traits, and accelerate the creation of improved soybean varieties.

  9. SINEs as driving forces in genome evolution.

    Science.gov (United States)

    Schmitz, J

    2012-01-01

    SINEs are short interspersed elements derived from cellular RNAs that repetitively retropose via RNA intermediates and integrate more or less randomly back into the genome. SINEs propagate almost entirely vertically within their host cells and, once established in the germline, are passed on from generation to generation. As non-autonomous elements, their reverse transcription (from RNA to cDNA) and genomic integration depends on the activity of the enzymatic machinery of autonomous retrotransposons, such as long interspersed elements (LINEs). SINEs are widely distributed in eukaryotes, but are especially effectively propagated in mammalian species. For example, more than a million Alu-SINE copies populate the human genome (approximately 13% of genomic space), and few master copies of them are still active. In the organisms where they occur, SINEs are a challenge to genomic integrity, but in the long term also can serve as beneficial building blocks for evolution, contributing to phenotypic heterogeneity and modifying gene regulatory networks. They substantially expand the genomic space and introduce structural variation to the genome. SINEs have the potential to mutate genes, to alter gene expression, and to generate new parts of genes. A balanced distribution and controlled activity of such properties is crucial to maintaining the organism's dynamic and thriving evolution. Copyright © 2012 S. Karger AG, Basel.

  10. Genome organization, instabilities, stem cells, and cancer

    Directory of Open Access Journals (Sweden)

    Senthil Kumar Pazhanisamy

    2009-01-01

    Full Text Available It is now widely recognized that advances in exploring genome organization provide remarkable insights on the induction and progression of chromosome abnormalities. Much of what we know about how mutations evolve and consequently transform into genome instabilities has been characterized in the spatial organization context of chromatin. Nevertheless, many underlying concepts of impact of the chromatin organization on perpetuation of multiple mutations and on propagation of chromosomal aberrations remain to be investigated in detail. Genesis of genome instabilities from accumulation of multiple mutations that drive tumorigenesis is increasingly becoming a focal theme in cancer studies. This review focuses on structural alterations evolve to raise a variety of genome instabilities that are manifested at the nucleotide, gene or sub-chromosomal, and whole chromosome level of genome. Here we explore an underlying connection between genome instability and cancer in the light of genome architecture. This review is limited to studies directed towards spatial organizational aspects of origin and propagation of aberrations into genetically unstable tumors.

  11. Pseudo Boolean Programming for Partially Ordered Genomes

    Science.gov (United States)

    Angibaud, Sébastien; Fertin, Guillaume; Thévenin, Annelyse; Vialette, Stéphane

    Comparing genomes of different species is a crucial problem in comparative genomics. Different measures have been proposed to compare two genomes: number of common intervals, number of adjacencies, number of reversals, etc. These measures are classically used between two totally ordered genomes. However, genetic mapping techniques often give rise to different maps with some unordered genes. Starting from a partial order between genes of a genome, one method to find a total order consists in optimizing a given measure between a linear extension of this partial order and a given total order of a close and well-known genome. However, for most common measures, the problem turns out to be NP-hard. In this paper, we propose a (0,1)-linear programming approach to compute a linear extension of one genome that maximizes the number of common intervals (resp. the number of adjacencies) between this linear extension and a given total order. Next, we propose an algorithm to find linear extensions of two partial orders that maximize the number of adjacencies.

  12. Dengue Virus Genome Uncoating Requires Ubiquitination

    Directory of Open Access Journals (Sweden)

    Laura A. Byk

    2016-06-01

    Full Text Available The process of genome release or uncoating after viral entry is one of the least-studied steps in the flavivirus life cycle. Flaviviruses are mainly arthropod-borne viruses, including emerging and reemerging pathogens such as dengue, Zika, and West Nile viruses. Currently, dengue virus is one of the most significant human viral pathogens transmitted by mosquitoes and is responsible for about 390 million infections every year around the world. Here, we examined for the first time molecular aspects of dengue virus genome uncoating. We followed the fate of the capsid protein and RNA genome early during infection and found that capsid is degraded after viral internalization by the host ubiquitin-proteasome system. However, proteasome activity and capsid degradation were not necessary to free the genome for initial viral translation. Unexpectedly, genome uncoating was blocked by inhibiting ubiquitination. Using different assays to bypass entry and evaluate the first rounds of viral translation, a narrow window of time during infection that requires ubiquitination but not proteasome activity was identified. In this regard, ubiquitin E1-activating enzyme inhibition was sufficient to stabilize the incoming viral genome in the cytoplasm of infected cells, causing its retention in either endosomes or nucleocapsids. Our data support a model in which dengue virus genome uncoating requires a nondegradative ubiquitination step, providing new insights into this crucial but understudied viral process.

  13. A Distance Measure for Genome Phylogenetic Analysis

    Science.gov (United States)

    Cao, Minh Duc; Allison, Lloyd; Dix, Trevor

    Phylogenetic analyses of species based on single genes or parts of the genomes are often inconsistent because of factors such as variable rates of evolution and horizontal gene transfer. The availability of more and more sequenced genomes allows phylogeny construction from complete genomes that is less sensitive to such inconsistency. For such long sequences, construction methods like maximum parsimony and maximum likelihood are often not possible due to their intensive computational requirement. Another class of tree construction methods, namely distance-based methods, require a measure of distances between any two genomes. Some measures such as evolutionary edit distance of gene order and gene content are computational expensive or do not perform well when the gene content of the organisms are similar. This study presents an information theoretic measure of genetic distances between genomes based on the biological compression algorithm expert model. We demonstrate that our distance measure can be applied to reconstruct the consensus phylogenetic tree of a number of Plasmodium parasites from their genomes, the statistical bias of which would mislead conventional analysis methods. Our approach is also used to successfully construct a plausible evolutionary tree for the γ-Proteobacteria group whose genomes are known to contain many horizontally transferred genes.

  14. The value of new genome references.

    Science.gov (United States)

    Worley, Kim C; Richards, Stephen; Rogers, Jeffrey

    2017-09-15

    Genomic information has become a ubiquitous and almost essential aspect of biological research. Over the last 10-15 years, the cost of generating sequence data from DNA or RNA samples has dramatically declined and our ability to interpret those data increased just as remarkably. Although it is still possible for biologists to conduct interesting and valuable research on species for which genomic data are not available, the impact of having access to a high quality whole genome reference assembly for a given species is nothing short of transformational. Research on a species for which we have no DNA or RNA sequence data is restricted in fundamental ways. In contrast, even access to an initial draft quality genome (see below for definitions) opens a wide range of opportunities that are simply not available without that reference genome assembly. Although a complete discussion of the impact of genome sequencing and assembly is beyond the scope of this short paper, the goal of this review is to summarize the most common and highest impact contributions that whole genome sequencing and assembly has had on comparative and evolutionary biology. Copyright © 2016. Published by Elsevier Inc.

  15. Genome edited animals: Learning from GM crops?

    Science.gov (United States)

    Bruce, Ann

    2017-06-01

    Genome editing of livestock is poised to become commercial reality, yet questions remain as to appropriate regulation, potential impact on the industry sector and public acceptability of products. This paper looks at how genome editing of livestock has attempted to learn some of the lessons from commercialisation of GM crops, and takes a systemic approach to explore some of the complexity and ambiguity in incorporating genome edited animals in a food production system. Current applications of genome editing are considered, viewed from the perspective of past technological applications. The question of what is genome editing, and can it be considered natural is examined. The implications of regulation on development of different sectors of livestock production systems are studied, with a particular focus on the veterinary sector. From an EU perspective, regulation of genome edited animals, although not necessarily the same as for GM crops, is advocated from a number of different perspectives. This paper aims to open up new avenues of research on genome edited animals, extending from the current primary focus on science and regulation, to engage with a wider-range of food system actors.

  16. Application of genomic tools in plant breeding.

    Science.gov (United States)

    Pérez-de-Castro, A M; Vilanova, S; Cañizares, J; Pascual, L; Blanca, J M; Díez, M J; Prohens, J; Picó, B

    2012-05-01

    Plant breeding has been very successful in developing improved varieties using conventional tools and methodologies. Nowadays, the availability of genomic tools and resources is leading to a new revolution of plant breeding, as they facilitate the study of the genotype and its relationship with the phenotype, in particular for complex traits. Next Generation Sequencing (NGS) technologies are allowing the mass sequencing of genomes and transcriptomes, which is producing a vast array of genomic information. The analysis of NGS data by means of bioinformatics developments allows discovering new genes and regulatory sequences and their positions, and makes available large collections of molecular markers. Genome-wide expression studies provide breeders with an understanding of the molecular basis of complex traits. Genomic approaches include TILLING and EcoTILLING, which make possible to screen mutant and germplasm collections for allelic variants in target genes. Re-sequencing of genomes is very useful for the genome-wide discovery of markers amenable for high-throughput genotyping platforms, like SSRs and SNPs, or the construction of high density genetic maps. All these tools and resources facilitate studying the genetic diversity, which is important for germplasm management, enhancement and use. Also, they allow the identification of markers linked to genes and QTLs, using a diversity of techniques like bulked segregant analysis (BSA), fine genetic mapping, or association mapping. These new markers are used for marker assisted selection, including marker assisted backcross selection, 'breeding by design', or new strategies, like genomic selection. In conclusion, advances in genomics are providing breeders with new tools and methodologies that allow a great leap forward in plant breeding, including the 'superdomestication' of crops and the genetic dissection and breeding for complex traits.

  17. Genomic stability during cellular reprogramming: Mission impossible?

    Energy Technology Data Exchange (ETDEWEB)

    Joest, Mathieu von; Búa Aguín, Sabela; Li, Han, E-mail: han.li@pasteur.fr

    2016-06-15

    The generation of induced pluripotent stem cells (iPSCs) from adult somatic cells is one of the most exciting discoveries in recent biomedical research. It holds tremendous potential in drug discovery and regenerative medicine. However, a series of reports highlighting genomic instability in iPSCs raises concerns about their clinical application. Although the mechanisms cause genomic instability during cellular reprogramming are largely unknown, several potential sources have been suggested. This review summarizes current knowledge on this active research field and discusses the latest efforts to alleviate the genomic insults during cellular reprogramming to generate iPSCs with enhanced quality and safety.

  18. Annotation of the Clostridium Acetobutylicum Genome

    Energy Technology Data Exchange (ETDEWEB)

    Daly, M. J.

    2004-06-09

    The genome sequence of the solvent producing bacterium Clostridium acetobutylicum ATCC824, has been determined by the shotgun approach. The genome consists of a 3.94 Mb chromosome and a 192 kb megaplasmid that contains the majority of genes responsible for solvent production. Comparison of C. acetobutylicum to Bacillus subtilis reveals significant local conservation of gene order, which has not been seen in comparisons of other genomes with similar, or, in some cases, closer, phylogenetic proximity. This conservation allows the prediction of many previously undetected operons in both bacteria.

  19. Genome instability: Linking ageing and brain degeneration.

    Science.gov (United States)

    Barzilai, Ari; Schumacher, Björn; Shiloh, Yosef

    2017-01-01

    Ageing is a multifactorial process affected by cumulative physiological changes resulting from stochastic processes combined with genetic factors, which together alter metabolic homeostasis. Genetic variation in maintenance of genome stability is emerging as an important determinant of ageing pace. Genome instability is also closely associated with a broad spectrum of conditions involving brain degeneration. Similarities and differences can be found between ageing-associated decline of brain functionality and the detrimental effect of genome instability on brain functionality and development. This review discusses these similarities and differences and highlights cell classes whose role in these processes might have been underestimated-glia and microglia. Copyright © 2016. Published by Elsevier B.V.

  20. Gene conversion in the rice genome

    DEFF Research Database (Denmark)

    Xu, Shuqing; Clark, Terry; Zheng, Hongkun

    2008-01-01

    -chromosomal conversions distributed between chromosome 1 and 5, 2 and 6, and 3 and 5 are more frequent than genome average (Z-test, P ... is not tightly linked to natural selection in the rice genome. To assess the contribution of segmental duplication on gene conversion statistics, we determined locations of conversion partners with respect to inter-chromosomal segment duplication. The number of conversions associated with segmentation is less...... involved in conversion events. CONCLUSION: The evolution of gene families in the rice genome may have been accelerated by conversion with pseudogenes. Our analysis suggests a possible role for gene conversion in the evolution of pathogen-response genes....

  1. The genome draft of coconut (Cocos nucifera).

    Science.gov (United States)

    Xiao, Yong; Xu, Pengwei; Fan, Haikuo; Baudouin, Luc; Xia, Wei; Bocs, Stéphanie; Xu, Junyang; Li, Qiong; Guo, Anping; Zhou, Lixia; Li, Jing; Wu, Yi; Ma, Zilong; Armero, Alix; Issali, Auguste Emmanuel; Liu, Na; Peng, Ming; Yang, Yaodong

    2017-11-01

    Coconut palm (Cocos nucifera,2n = 32), a member of genus Cocos and family Arecaceae (Palmaceae), is an important tropical fruit and oil crop. Currently, coconut palm is cultivated in 93 countries, including Central and South America, East and West Africa, Southeast Asia and the Pacific Islands, with a total growth area of more than 12 million hectares [1]. Coconut palm is generally classified into 2 main categories: "Tall" (flowering 8-10 years after planting) and "Dwarf" (flowering 4-6 years after planting), based on morphological characteristics and breeding habits. This Palmae species has a long growth period before reproductive years, which hinders conventional breeding progress. In spite of initial successes, improvements made by conventional breeding have been very slow. In the present study, we obtained de novo sequences of the Cocos nucifera genome: a major genomic resource that could be used to facilitate molecular breeding in Cocos nucifera and accelerate the breeding process in this important crop. A total of 419.67 gigabases (Gb) of raw reads were generated by the Illumina HiSeq 2000 platform using a series of paired-end and mate-pair libraries, covering the predicted Cocos nucifera genome length (2.42 Gb, variety "Hainan Tall") to an estimated ×173.32 read depth. A total scaffold length of 2.20 Gb was generated (N50 = 418 Kb), representing 90.91% of the genome. The coconut genome was predicted to harbor 28 039 protein-coding genes, which is less than in Phoenix dactylifera (PDK30: 28 889), Phoenix dactylifera (DPV01: 41 660), and Elaeis guineensis (EG5: 34 802). BUSCO evaluation demonstrated that the obtained scaffold sequences covered 90.8% of the coconut genome and that the genome annotation was 74.1% complete. Genome annotation results revealed that 72.75% of the coconut genome consisted of transposable elements, of which long-terminal repeat retrotransposons elements (LTRs) accounted for the largest proportion (92.23%). Comparative analysis of the

  2. On Sorting Genomes with DCJ and Indels

    Science.gov (United States)

    Braga, Marília D. V.

    A previous work of Braga, Willing and Stoye compared two genomes with unequal content, but without duplications, and presented a new linear time algorithm to compute the genomic distance, considering double cut and join (DCJ) operations, insertions and deletions. Here we derive from this approach an algorithm to sort one genome into another one also using DCJ, insertions and deletions. The optimal sorting scenarios can have different compositions and we compare two types of sorting scenarios: one that maximizes and one that minimizes the number of DCJ operations with respect to the number of insertions and deletions.

  3. A post-assembly genome-improvement toolkit (PAGIT) to obtain annotated genomes from contigs.

    Science.gov (United States)

    Swain, Martin T; Tsai, Isheng J; Assefa, Samual A; Newbold, Chris; Berriman, Matthew; Otto, Thomas D

    2012-06-07

    Genome projects now produce draft assemblies within weeks owing to advanced high-throughput sequencing technologies. For milestone projects such as Escherichia coli or Homo sapiens, teams of scientists were employed to manually curate and finish these genomes to a high standard. Nowadays, this is not feasible for most projects, and the quality of genomes is generally of a much lower standard. This protocol describes software (PAGIT) that is used to improve the quality of draft genomes. It offers flexible functionality to close gaps in scaffolds, correct base errors in the consensus sequence and exploit reference genomes (if available) in order to improve scaffolding and generating annotations. The protocol is most accessible for bacterial and small eukaryotic genomes (up to 300 Mb), such as pathogenic bacteria, malaria and parasitic worms. Applying PAGIT to an E. coli assembly takes ∼24 h: it doubles the average contig size and annotates over 4,300 gene models.

  4. Genomic treasure troves: complete genome sequencing of herbarium and insect museum specimens.

    Science.gov (United States)

    Staats, Martijn; Erkens, Roy H J; van de Vossenberg, Bart; Wieringa, Jan J; Kraaijeveld, Ken; Stielow, Benjamin; Geml, József; Richardson, James E; Bakker, Freek T

    2013-01-01

    Unlocking the vast genomic diversity stored in natural history collections would create unprecedented opportunities for genome-scale evolutionary, phylogenetic, domestication and population genomic studies. Many researchers have been discouraged from using historical specimens in molecular studies because of both generally limited success of DNA extraction and the challenges associated with PCR-amplifying highly degraded DNA. In today's next-generation sequencing (NGS) world, opportunities and prospects for historical DNA have changed dramatically, as most NGS methods are actually designed for taking short fragmented DNA molecules as templates. Here we show that using a standard multiplex and paired-end Illumina sequencing approach, genome-scale sequence data can be generated reliably from dry-preserved plant, fungal and insect specimens collected up to 115 years ago, and with minimal destructive sampling. Using a reference-based assembly approach, we were able to produce the entire nuclear genome of a 43-year-old Arabidopsis thaliana (Brassicaceae) herbarium specimen with high and uniform sequence coverage. Nuclear genome sequences of three fungal specimens of 22-82 years of age (Agaricus bisporus, Laccaria bicolor, Pleurotus ostreatus) were generated with 81.4-97.9% exome coverage. Complete organellar genome sequences were assembled for all specimens. Using de novo assembly we retrieved between 16.2-71.0% of coding sequence regions, and hence remain somewhat cautious about prospects for de novo genome assembly from historical specimens. Non-target sequence contaminations were observed in 2 of our insect museum specimens. We anticipate that future museum genomics projects will perhaps not generate entire genome sequences in all cases (our specimens contained relatively small and low-complexity genomes), but at least generating vital comparative genomic data for testing (phylo)genetic, demographic and genetic hypotheses, that become increasingly more horizontal

  5. Mining genome sequencing data to identify the genomic features linked to breast cancer histopathology

    Science.gov (United States)

    Ping, Zheng; Siegal, Gene P.; Almeida, Jonas S.; Schnitt, Stuart J.; Shen, Dejun

    2014-01-01

    Background: Genetics and genomics have radically altered our understanding of breast cancer progression. However, the genomic basis of various histopathologic features of breast cancer is not yet well-defined. Materials and Methods: The Cancer Genome Atlas (TCGA) is an international database containing a large collection of human cancer genome sequencing data. cBioPortal is a web tool developed for mining these sequencing data. We performed mining of TCGA sequencing data in an attempt to characterize the genomic features correlated with breast cancer histopathology. We first assessed the quality of the TCGA data using a group of genes with known alterations in various cancers. Both genome-wide gene mutation and copy number changes as well as a group of genes with a high frequency of genetic changes were then correlated with various histopathologic features of invasive breast cancer. Results: Validation of TCGA data using a group of genes with known alterations in breast cancer suggests that the TCGA has accurately documented the genomic abnormalities of multiple malignancies. Further analysis of TCGA breast cancer sequencing data shows that accumulation of specific genomic defects is associated with higher tumor grade, larger tumor size and receptor negativity. Distinct groups of genomic changes were found to be associated with the different grades of invasive ductal carcinoma. The mutator role of the TP53 gene was validated by genomic sequencing data of invasive breast cancer and TP53 mutation was found to play a critical role in defining high tumor grade. Conclusions: Data mining of the TCGA genome sequencing data is an innovative and reliable method to help characterize the genomic abnormalities associated with histopathologic features of invasive breast cancer. PMID:24672738

  6. Mining genome sequencing data to identify the genomic features linked to breast cancer histopathology

    Directory of Open Access Journals (Sweden)

    Zheng Ping

    2014-01-01

    Full Text Available Background: Genetics and genomics have radically altered our understanding of breast cancer progression. However, the genomic basis of various histopathologic features of breast cancer is not yet well-defined. Materials and Methods: The Cancer Genome Atlas (TCGA is an international database containing a large collection of human cancer genome sequencing data. cBioPortal is a web tool developed for mining these sequencing data. We performed mining of TCGA sequencing data in an attempt to characterize the genomic features correlated with breast cancer histopathology. We first assessed the quality of the TCGA data using a group of genes with known alterations in various cancers. Both genome-wide gene mutation and copy number changes as well as a group of genes with a high frequency of genetic changes were then correlated with various histopathologic features of invasive breast cancer. Results: Validation of TCGA data using a group of genes with known alterations in breast cancer suggests that the TCGA has accurately documented the genomic abnormalities of multiple malignancies. Further analysis of TCGA breast cancer sequencing data shows that accumulation of specific genomic defects is associated with higher tumor grade, larger tumor size and receptor negativity. Distinct groups of genomic changes were found to be associated with the different grades of invasive ductal carcinoma. The mutator role of the TP53 gene was validated by genomic sequencing data of invasive breast cancer and TP53 mutation was found to play a critical role in defining high tumor grade. Conclusions: Data mining of the TCGA genome sequencing data is an innovative and reliable method to help characterize the genomic abnormalities associated with histopathologic features of invasive breast cancer.

  7. PopGenome: An Efficient Swiss Army Knife for Population Genomic Analyses in R

    OpenAIRE

    Pfeifer, Bastian; Wittelsbürger, Ulrich; Ramos-Onsins, Sebastian E.; Lercher, Martin J.

    2014-01-01

    Although many computer programs can perform population genetics calculations, they are typically limited in the analyses and data input formats they offer; few applications can process the large data sets produced by whole-genome resequencing projects. Furthermore, there is no coherent framework for the easy integration of new statistics into existing pipelines, hindering the development and application of new population genetics and genomics approaches. Here, we present PopGenome, a populati...

  8. Next-Generation Genomics Facility at C-CAMP: Accelerating Genomic Research in India

    Science.gov (United States)

    S, Chandana; Russiachand, Heikham; H, Pradeep; S, Shilpa; M, Ashwini; S, Sahana; B, Jayanth; Atla, Goutham; Jain, Smita; Arunkumar, Nandini; Gowda, Malali

    2014-01-01

    Next-Generation Sequencing (NGS; http://www.genome.gov/12513162) is a recent life-sciences technological revolution that allows scientists to decode genomes or transcriptomes at a much faster rate with a lower cost. Genomic-based studies are in a relatively slow pace in India due to the non-availability of genomics experts, trained personnel and dedicated service providers. Using NGS there is a lot of potential to study India's national diversity (of all kinds). We at the Centre for Cellular and Molecular Platforms (C-CAMP) have launched the Next Generation Genomics Facility (NGGF) to provide genomics service to scientists, to train researchers and also work on national and international genomic projects. We have HiSeq1000 from Illumina and GS-FLX Plus from Roche454. The long reads from GS FLX Plus, and high sequence depth from HiSeq1000, are the best and ideal hybrid approaches for de novo and re-sequencing of genomes and transcriptomes. At our facility, we have sequenced around 70 different organisms comprising of more than 388 genomes and 615 transcriptomes – prokaryotes and eukaryotes (fungi, plants and animals). In addition we have optimized other unique applications such as small RNA (miRNA, siRNA etc), long Mate-pair sequencing (2 to 20 Kb), Coding sequences (Exome), Methylome (ChIP-Seq), Restriction Mapping (RAD-Seq), Human Leukocyte Antigen (HLA) typing, mixed genomes (metagenomes) and target amplicons, etc. Translating DNA sequence data from NGS sequencer into meaningful information is an important exercise. Under NGGF, we have bioinformatics experts and high-end computing resources to dissect NGS data such as genome assembly and annotation, gene expression, target enrichment, variant calling (SSR or SNP), comparative analysis etc. Our services (sequencing and bioinformatics) have been utilized by more than 45 organizations (academia and industry) both within India and outside, resulting several publications in peer-reviewed journals and several genomic

  9. Comparative genomic hybridizations reveal absence of large Streptomyces coelicolor genomic islands in Streptomyces lividans

    OpenAIRE

    Jayapal, Karthik P; Lian, Wei; Glod, Frank; Sherman, David H; Hu, Wei-Shou

    2007-01-01

    Abstract Background The genomes of Streptomyces coelicolor and Streptomyces lividans bear a considerable degree of synteny. While S. coelicolor is the model streptomycete for studying antibiotic synthesis and differentiation, S. lividans is almost exclusively considered as the preferred host, among actinomycetes, for cloning and expression of exogenous DNA. We used whole genome microarrays as a comparative genomics tool for identifying the subtle differences between these two chromosomes. Res...

  10. The Human Genome Diversity Project

    Energy Technology Data Exchange (ETDEWEB)

    Cavalli-Sforza, L. [Stanford Univ., CA (United States)

    1994-12-31

    The Human Genome Diversity Project (HGD Project) is an international anthropology project that seeks to study the genetic richness of the entire human species. This kind of genetic information can add a unique thread to the tapestry knowledge of humanity. Culture, environment, history, and other factors are often more important, but humanity`s genetic heritage, when analyzed with recent technology, brings another type of evidence for understanding species` past and present. The Project will deepen the understanding of this genetic richness and show both humanity`s diversity and its deep and underlying unity. The HGD Project is still largely in its planning stages, seeking the best ways to reach its goals. The continuing discussions of the Project, throughout the world, should improve the plans for the Project and their implementation. The Project is as global as humanity itself; its implementation will require the kinds of partnerships among different nations and cultures that make the involvement of UNESCO and other international organizations particularly appropriate. The author will briefly discuss the Project`s history, describe the Project, set out the core principles of the Project, and demonstrate how the Project will help combat the scourge of racism.

  11. Genomes of foodborne and waterborne pathogens

    National Research Council Canada - National Science Library

    Fratamico, Pina M; Liu, Yanhong; Kathariou, Sophia

    2011-01-01

    ... of Pathogenic Vibrio cholerae * 85 Salvador Almagro-Moreno, Ronan A. Murphy, and E. Fidelma Boyd 8. Genomics of the Enteropathogenic Yersiniae * 101 Alan McNally, Nicholas R. Thomson, and Brendan W. ...

  12. Whole Genome Epidemiological Typing of Salmonella

    DEFF Research Database (Denmark)

    Leekitcharoenphon, Pimlapas

    available Salmonella enterica genomes (accessed in April 2011). A consensus tree based on variation of the core genes gives better resolution than 16S rRNA and MLST that rarely provide separation between closely related strains. The performance of the pan-genome tree which is based on the presence....../absence of all genes across genomes, is similar to the consensus tree but with higher branching confidence value. The core genes can be divided into two categories: a few highly variable genes and a larger set of conserved core genes, with low variance. These core genes are useful for investigating molecular...... evolution and remain useful as candidate genes for bacterial genome typing-even if they cannot be expected to differentiate highly clonal isolates e.g. outbreak cases of Salmonella [I]. To achieve successful ‘real-time’ monitoring and identification of outbreaks, rapid and reliable sub-typing is essential...

  13. 10KP: A phylodiverse genome sequencing plan

    Science.gov (United States)

    Cheng, Shifeng; Melkonian, Michael; Brockington, Samuel; Archibald, John M; Delaux, Pierre-Marc; Melkonian, Barbara; Mavrodiev, Evgeny V; Sun, Wenjing; Fu, Yuan; Yang, Huanming; Soltis, Douglas E; Graham, Sean W; Soltis, Pamela S; Liu, Xin; Xu, Xun

    2018-01-01

    Abstract Understanding plant evolution and diversity in a phylogenomic context is an enormous challenge due, in part, to limited availability of genome-scale data across phylodiverse species. The 10KP (10,000 Plants) Genome Sequencing Project will sequence and characterize representative genomes from every major clade of embryophytes, green algae, and protists (excluding fungi) within the next 5 years. By implementing and continuously improving leading-edge sequencing technologies and bioinformatics tools, 10KP will catalogue the genome content of plant and protist diversity and make these data freely available as an enduring foundation for future scientific discoveries and applications. 10KP is structured as an international consortium, open to the global community, including botanical gardens, plant research institutes, universities, and private industry. Our immediate goal is to establish a policy framework for this endeavor, the principles of which are outlined here. PMID:29618049

  14. Initial genomics of the human nucleolus.

    Directory of Open Access Journals (Sweden)

    Attila Németh

    2010-03-01

    Full Text Available We report for the first time the genomics of a nuclear compartment of the eukaryotic cell. 454 sequencing and microarray analysis revealed the pattern of nucleolus-associated chromatin domains (NADs in the linear human genome and identified different gene families and certain satellite repeats as the major building blocks of NADs, which constitute about 4% of the genome. Bioinformatic evaluation showed that NAD-localized genes take part in specific biological processes, like the response to other organisms, odor perception, and tissue development. 3D FISH and immunofluorescence experiments illustrated the spatial distribution of NAD-specific chromatin within interphase nuclei and its alteration upon transcriptional changes. Altogether, our findings describe the nature of DNA sequences associated with the human nucleolus and provide insights into the function of the nucleolus in genome organization and establishment of nuclear architecture.

  15. Initial Genomics of the Human Nucleolus

    Science.gov (United States)

    Németh, Attila; Conesa, Ana; Santoyo-Lopez, Javier; Medina, Ignacio; Montaner, David; Péterfia, Bálint; Solovei, Irina; Cremer, Thomas; Dopazo, Joaquin; Längst, Gernot

    2010-01-01

    We report for the first time the genomics of a nuclear compartment of the eukaryotic cell. 454 sequencing and microarray analysis revealed the pattern of nucleolus-associated chromatin domains (NADs) in the linear human genome and identified different gene families and certain satellite repeats as the major building blocks of NADs, which constitute about 4% of the genome. Bioinformatic evaluation showed that NAD–localized genes take part in specific biological processes, like the response to other organisms, odor perception, and tissue development. 3D FISH and immunofluorescence experiments illustrated the spatial distribution of NAD–specific chromatin within interphase nuclei and its alteration upon transcriptional changes. Altogether, our findings describe the nature of DNA sequences associated with the human nucleolus and provide insights into the function of the nucleolus in genome organization and establishment of nuclear architecture. PMID:20361057

  16. Cloud Technology May Widen Genomic Bottleneck - TCGA

    Science.gov (United States)

    Computational biologist Dr. Ilya Shmulevich suggests that renting cloud computing power might widen the bottleneck for analyzing genomic data. Learn more about his experience with the Cloud in this TCGA in Action Case Study.

  17. Creating a platform for collaborative genomic research

    Directory of Open Access Journals (Sweden)

    Mark Smithson

    2017-04-01

    The developed genomics informatics platform provides a step-change in this type of genetic research, accelerating reproducible collaborative research across multiple disparate organisations and data sources, of varying type and complexity.

  18. Genomics Data for Cowpea Pests in Africa

    Data.gov (United States)

    US Agency for International Development — This dataset contains the complete mitochondrial genome of Anoplocnemis curvipes F. (Coreinea, Coreidae, Heteroptera), a pest of fresh cowpea pods. To get to the...

  19. CRISPR-Cas: Revolutionising genome engineering

    African Journals Online (AJOL)

    Within the repeating As, Cs, Gs and Ts of the human genome is the .... Medicine, South African Medical Research Council Extramural Unit for Stem Cell Research and Therapy, ... Second, should we allow embryonic/germline engineering, or.

  20. Genomic definition of species. Revision 1

    Energy Technology Data Exchange (ETDEWEB)

    Crkvenjakov, R.; Dramanac, R.

    1992-06-01

    A genome is the sum total of the DNA sequences in the cells of an individual organism. The common usage that species possess genomes comes naturally to biochemists, who have shown that all protein and nucleic acid molecules are at the same time species and individual-specific, with minor individual variations being superimposed on a consensus sequence that is constant for a species. By extension, this property is attributed to the common features of DNA in the chromosomes of members of a given species and is called (species) genome. The definition of species based on chromosomes, genes, or genome common to its member organisms has been implied or mentioned in passing numerous times. Some population biologists think that members of species have similar ``homeostatic genotypes,`` which are to a degree resistant to mutation or environmental change in the production of a basic phenotype.