WorldWideScience

Sample records for genome program contractor-grantee

  1. DOE Human Genome Program contractor-grantee workshop

    Energy Technology Data Exchange (ETDEWEB)

    1996-01-01

    This volume contains the proceedings for the DOE Human Genome Program's Contractor-Grantee Workshop V held in Santa Fe, New Mexico, January 28 to February 1, 1996. Presentations were divided into sessions entitled Sequencing; Mapping; Informatics; Ethical, Legal, and Social Issues; and Infrastructure. Reports of individual projects described herein are separately indexed and abstracted for the database.

  2. DOE Human Genome Program: Contractor-Grantee Workshop IV, November 13--17, 1994, Santa Fe, New Mexico

    Energy Technology Data Exchange (ETDEWEB)

    1994-10-01

    This volume contains the proceedings of the fourth Contractor-Grantee Workshop for the Department of Energy (DOE) Human Genome Program. Of the 204 abstracts in this book, some 200 describe the genome research of DOE-funded grantees and contractors located at the multidisciplinary centers at Lawrence Berkeley Laboratory, Lawrence Livermore National Laboratory, and Los Alamos National Laboratory; other DOE-supported laboratories; and more than 54 universities, research organizations, and companies in the United States and abroad. Included are 16 abstracts from ongoing projects in the Ethical, Legal, and Social Issues (ELSI) component, an area that continues to attract considerable attention from a wide variety of interested parties. Three abstracts summarize work in the new Microbial Genome Initiative launched this year by the Office of Health and Environmental Research (OHER) to provide genome sequence and mapping data on industrially important microorganisms and those that live under extreme conditions. Many of the projects will be discussed at plenary sessions held throughout the workshop, and all are represented in the poster sessions.

  3. Genomics:GTL Contractor-Grantee Workshop IV and Metabolic Engineering Working Group Inter-Agency Conference on Metabolic Engineering 2006

    Energy Technology Data Exchange (ETDEWEB)

    Mansfield, Betty Kay [ORNL]; Martin, Sheryl A [ORNL]

    2006-02-01

    Welcome to the 2006 joint meeting of the fourth Genomics:GTL Contractor-Grantee Workshop and the sixth Metabolic Engineering Working Group Inter-Agency Conference. The vision and scope of the Genomics:GTL program continue to expand and encompass research and technology issues from diverse scientific disciplines, attracting broad interest and support from researchers at universities, DOE national laboratories, and industry. Metabolic engineering's vision is the targeted and purposeful alteration of metabolic pathways to improve the understanding and use of cellular pathways for chemical transformation, energy transduction, and supramolecular assembly. These two programs have much complementarity in both vision and technological approaches, as reflected in this joint workshop. GTL's challenge to the scientific community remains the further development and use of a broad array of innovative technologies and computational tools to systematically leverage the knowledge and capabilities brought to us by DNA sequencing projects. The goal is to seek a broad and predictive understanding of the functioning and control of complex systems: individual microbes, microbial communities, and plants. GTL's prominent position at the interface of the physical, computational, and biological sciences is both a strength and a challenge. Microbes remain GTL's principal biological focus. In the complex 'simplicity' of microbes, we find capabilities needed by DOE and the nation for clean and secure energy, cleanup of environmental contamination, and sequestration of atmospheric carbon dioxide that contributes to global warming. An ongoing challenge for the entire GTL community is to demonstrate that the fundamental science conducted in each of your research projects brings us a step closer to biology-based solutions for these important national energy and environmental needs.

  4. JGI Fungal Genomics Program

    Energy Technology Data Exchange (ETDEWEB)

    Grigoriev, Igor V.

    2011-03-14

    Genomes of fungi relevant to energy and the environment are the focus of the Fungal Genomics Program at the US Department of Energy Joint Genome Institute (JGI). Its key project, the Genomics Encyclopedia of Fungi, targets fungi related to plant health (symbionts, pathogens, and biocontrol agents) and biorefinery processes (cellulose degradation, sugar fermentation, industrial hosts), and explores fungal diversity by means of genome sequencing and analysis. Over 50 fungal genomes have been sequenced by JGI to date and released through MycoCosm (www.jgi.doe.gov/fungi), a fungal web portal that integrates sequence and functional data with genome analysis tools for the user community. Sequence analysis supported by functional genomics leads to the development of a parts list for complex systems ranging from ecosystems of biofuel crops to biorefineries. Recent examples of such 'parts' suggested by comparative genomics and functional analysis in these areas are presented here.

  5. Epidemiology & Genomics Research Program

    Science.gov (United States)

    The Epidemiology and Genomics Research Program, in the National Cancer Institute's Division of Cancer Control and Population Sciences, funds research in human populations to understand the determinants of cancer occurrence and outcomes.

  6. Fungal Genomics Program

    Energy Technology Data Exchange (ETDEWEB)

    Grigoriev, Igor

    2012-03-12

    The JGI Fungal Genomics Program aims to scale up sequencing and analysis of fungal genomes to explore the diversity of fungi important for energy and the environment, and to promote functional studies on a system level. Combining new sequencing technologies and comparative genomics tools, JGI is now leading the world in fungal genome sequencing and analysis. Over 120 sequenced fungal genomes with analytical tools are available via MycoCosm (www.jgi.doe.gov/fungi), a web-portal for fungal biologists. Our model of interacting with user communities, unique among other sequencing centers, helps organize these communities, improves genome annotation and analysis work, and facilitates new larger-scale genomic projects. This resulted in 20 high-profile papers published in 2011 alone and contributing to the Genomics Encyclopedia of Fungi, which targets fungi related to plant health (symbionts, pathogens, and biocontrol agents) and biorefinery processes (cellulose degradation, sugar fermentation, industrial hosts). Our next grand challenges include larger scale exploration of fungal diversity (1000 fungal genomes), developing molecular tools for DOE-relevant model organisms, and analysis of complex systems and metagenomes.

  7. Human Genome Program

    Energy Technology Data Exchange (ETDEWEB)

    1993-01-01

    The DOE Human Genome program has grown tremendously, as shown by the marked increase in the number of genome-funded projects since the last workshop held in 1991. The abstracts in this book describe the genome research of DOE-funded grantees and contractors and invited guests, and all projects are represented at the workshop by posters. The 3-day meeting includes plenary sessions on ethical, legal, and social issues pertaining to the availability of genetic data; sequencing techniques, informatics support; and chromosome and cDNA mapping and sequencing.

  9. Programs | Office of Cancer Genomics

    Science.gov (United States)

    OCG facilitates cancer genomics research through a series of highly-focused programs. These programs generate and disseminate genomic data for use by the cancer research community. OCG programs also promote advances in technology-based infrastructure and create valuable experimental reagents and tools. OCG programs encourage collaboration by interconnecting with other genomics and cancer projects in order to accelerate translation of findings into the clinic.

  10. Human genome. 1993 Program report

    Energy Technology Data Exchange (ETDEWEB)

    1994-03-01

    The purpose of this report is to update the Human Genome 1991-92 Program Report and provide new information on the DOE genome program to researchers, program managers, other government agencies, and the interested public. This FY 1993 supplement includes abstracts of 60 new or renewed projects and listings of 112 continuing and 28 completed projects. These two reports, taken together, present the most complete published view of the DOE Human Genome Program through FY 1993. Research is progressing rapidly toward 15-year goals of mapping and sequencing the DNA of each of the 24 different human chromosomes.

  11. Genomic Signal Search by Dynamic Programming

    Institute of Scientific and Technical Information of China (English)

    ZHENG Wei-Mou

    2003-01-01

    A general and flexible multi-motif model is proposed based on dynamic programming. By extending the Gibbs sampler to dynamic programming and introducing temperature, an efficient algorithm is developed. Branch point signal sequences and translation initiation sequences extracted from the rice genome are then examined.

  12. [Mapping and human genome sequence program].

    Science.gov (United States)

    Weissenbach, J

    1997-03-01

    Until recently, human genome programs focused primarily on establishing maps that would provide signposts to researchers seeking to identify genes responsible for inherited diseases, as well as a basis for genome sequencing studies. Preestablished gene mapping goals have been reached. The more than 7,000 microsatellite markers identified to date provide a map of sufficient density to allow localization of the gene of a monogenic disease with a precision of 1 to 2 million base pairs. The physical map, based on systematically arranged overlapping sets of yeast artificial chromosomes (YACs), has also made considerable headway during the last few years. The most recently published map covers more than 90% of the genome. However, currently available physical maps cannot be used for sequencing studies because multiple rearrangements occur in YACs. The recently developed sets of radioinduced hybrids are extremely useful for incorporating genes into existing maps. A network of American and European laboratories has successfully used these radioinduced hybrids to map 15,000 gene tags from large-scale cDNA library sequencing programs. There are increasingly pressing reasons for initiating large-scale human genome sequencing studies.

  13. VIGOR, an annotation program for small viral genomes

    Directory of Open Access Journals (Sweden)

    Wang Shiliang

    2010-09-01

    Background: The decreasing cost of sequencing and improvements in technology have made re-sequencing of large genomes, as well as parallel sequencing of small genomes, easier and more common. It is possible to completely sequence a small genome within days, and this increases the number of publicly available genomes. Among the genomes being rapidly sequenced are those of microbes and viruses responsible for infectious diseases. However, accurate gene prediction remains a challenge in decoding a newly sequenced genome. Therefore, accurate and efficient gene prediction programs are highly desired for rapid and cost-effective surveillance of RNA viruses through full genome sequencing. Results: We have developed VIGOR (Viral Genome ORF Reader), a web application tool for gene prediction in influenza virus, rotavirus, rhinovirus and coronavirus subtypes. VIGOR detects protein coding regions based on sequence similarity searches, can accurately detect genome-specific features such as frame shifts, overlapping genes and embedded genes, and can predict mature peptides within the context of a single polypeptide open reading frame. Genotyping capability for influenza and rotavirus is built into the program. We compared VIGOR to previously described gene prediction programs, ZCURVE_V, GeneMarkS and FLAN. The specificity and sensitivity of VIGOR are greater than 99% for the RNA viral genomes tested. Conclusions: VIGOR is a user-friendly web-based genome annotation program for five different viral agents: influenza, rotavirus, rhinovirus, coronavirus and SARS coronavirus. This is the first publicly accessible gene prediction program for rotavirus and rhinovirus. VIGOR is able to accurately predict protein coding genes for the above five viral types and has the capability to assign function to the predicted open reading frames and genotype influenza virus. The prediction software was designed for performing high

  14. Genomic prediction in a breeding program of perennial ryegrass

    DEFF Research Database (Denmark)

    Fé, Dario; Ashraf, Bilal; Greve-Pedersen, Morten;

    2015-01-01

    We present a genomic selection study performed on 1918 rye grass families (Lolium perenne L.), which were derived from a commercial breeding program at DLF-Trifolium, Denmark. Phenotypes were recorded on standard plots, across 13 years and in 6 different countries. Variants were identified...... in utilizing genomic selection in rye grass....

  15. GenomePixelizer--a visualization program for comparative genomics within and between species.

    Science.gov (United States)

    Kozik, A; Kochetkova, E; Michelmore, R

    2002-02-01

    GenomePixelizer is a visualization tool that generates custom images of the physical or genetic positions of specified sets of genes in whole genomes or parts of genomes. Multiple sets of genes can be shown simultaneously with user-defined characteristics displayed. It allows the analysis of duplication events within and between species based on sequence similarities. The program is written in Tcl/Tk and works on any platform that supports the Tcl/Tk toolkit. GenomePixelizer generates HTML ImageMap tags for each gene in the image allowing links to databases. Images can be saved and presented on web pages.

  16. BSMAP: whole genome bisulfite sequence MAPping program

    Directory of Open Access Journals (Sweden)

    Li Wei

    2009-07-01

    Background: Bisulfite sequencing is a powerful technique for studying DNA cytosine methylation. Bisulfite treatment followed by PCR amplification specifically converts unmethylated cytosines to thymine. Coupled with next-generation sequencing technology, it is able to detect the methylation status of every cytosine in the genome. However, mapping high-throughput bisulfite reads to the reference genome remains a great challenge due to the increased search space, reduced complexity of bisulfite sequence, asymmetric cytosine to thymine alignments, and multiple CpG heterogeneous methylation. Results: We developed an efficient bisulfite read mapping algorithm, BSMAP, to address the above issues. BSMAP combines genome hashing and bitwise masking to achieve fast and accurate bisulfite mapping. Compared with existing bisulfite mapping approaches, BSMAP is faster, more sensitive and more flexible. Conclusion: BSMAP is the first general-purpose bisulfite mapping software. It is able to map high-throughput bisulfite reads at the whole-genome level with feasible memory and CPU usage. It is freely available under the GPL v3 license at http://code.google.com/p/bsmap/.
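
    The asymmetric cytosine-to-thymine matching mentioned in this abstract can be illustrated with a small Python sketch. This is not BSMAP's algorithm (the abstract says BSMAP uses genome hashing and bitwise masking); it is a brute-force illustration under simplifying assumptions: forward strand only, no mismatches or gaps, and hypothetical function names chosen for this example.

```python
def bisulfite_convert(seq, methylated=frozenset()):
    """Simulate bisulfite treatment plus PCR on a forward-strand sequence.

    Unmethylated C is read as T; positions listed in `methylated` stay C.
    (Hypothetical helper, for illustration only.)
    """
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq)
    )


def bs_compatible(read, ref):
    """Asymmetric match: a read T may align to a reference C, never the reverse."""
    if len(read) != len(ref):
        return False
    return all(r == g or (r == "T" and g == "C") for r, g in zip(read, ref))


def map_read(read, genome):
    """Return every forward-strand start position where the read is compatible."""
    n = len(read)
    return [
        i for i in range(len(genome) - n + 1)
        if bs_compatible(read, genome[i:i + n])
    ]
```

    Because a converted read can match wherever the reference has either T or C, the number of candidate positions grows, which is the enlarged search space the abstract refers to.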

  17. Human Genome Program Report. Part 1, Overview and Progress

    Science.gov (United States)

    1997-11-01

    This report contains Part 1 of a two-part report to reflect research and progress in the U.S. Department of Energy Human Genome Program from 1994 through 1996, with specified updates made just before publication. Part 1 consists of the program overview and report on progress.

  18. Human genome program report. Part 1, overview and progress

    Energy Technology Data Exchange (ETDEWEB)

    1997-11-01

    This report contains Part 1 of a two-part report to reflect research and progress in the U.S. Department of Energy Human Genome Program from 1994 through 1996, with specified updates made just before publication. Part 1 consists of the program overview and report on progress.

  19. Primer on Molecular Genetics; DOE Human Genome Program

    Science.gov (United States)

    1992-04-01

    This report is taken from the April 1992 draft of the DOE Human Genome 1991--1992 Program Report, which is expected to be published in May 1992. The primer is intended to be an introduction to basic principles of molecular genetics pertaining to the genome project. The material contained herein is not final and may be incomplete. Techniques of genetic mapping and DNA sequencing are described.

  20. Primer on molecular genetics. DOE Human Genome Program

    Energy Technology Data Exchange (ETDEWEB)

    1992-04-01

    This report is taken from the April 1992 draft of the DOE Human Genome 1991--1992 Program Report, which is expected to be published in May 1992. The primer is intended to be an introduction to basic principles of molecular genetics pertaining to the genome project. The material contained herein is not final and may be incomplete. Techniques of genetic mapping and DNA sequencing are described.

  1. Human genome program report. Part 2, 1996 research abstracts

    Energy Technology Data Exchange (ETDEWEB)

    1997-11-01

    This report contains Part 2 of a two-part report to reflect research and progress in the US Department of Energy Human Genome Program from 1994 through 1996, with specified updates made just before publication. Part 2 consists of 1996 research abstracts. Attention is focused on the following: sequencing; mapping; informatics; ethical, legal, and social issues; infrastructure; and small business innovation research.

  2. Human Genome Program Report. Part 2, 1996 Research Abstracts

    Science.gov (United States)

    1997-11-01

    This report contains Part 2 of a two-part report to reflect research and progress in the US Department of Energy Human Genome Program from 1994 through 1996, with specified updates made just before publication. Part 2 consists of 1996 research abstracts. Attention is focused on the following: sequencing; mapping; informatics; ethical, legal, and social issues; infrastructure; and small business innovation research.

  3. Assembling networks of microbial genomes using linear programming

    Directory of Open Access Journals (Sweden)

    Holloway Catherine

    2010-11-01

    Background: Microbial genomes exhibit complex sets of genetic affinities due to lateral genetic transfer. Assessing the relative contributions of parent-to-offspring inheritance and gene sharing is a vital step in understanding the evolutionary origins and modern-day function of an organism, but recovering and showing these relationships is a challenging problem. Results: We have developed a new approach that uses linear programming to find between-genome relationships, by treating tables of genetic affinities (here, represented by transformed BLAST e-values) as an optimization problem. Validation trials on simulated data demonstrate the effectiveness of the approach in recovering and representing vertical and lateral relationships among genomes. Application of the technique to a set comprising Aquifex aeolicus and 75 other thermophiles showed an important role for large genomes as 'hubs' in the gene sharing network, and suggested that genes are preferentially shared between organisms with similar optimal growth temperatures. We were also able to discover distinct and common genetic contributors to each sequenced representative of genus Pseudomonas. Conclusions: The linear programming approach we have developed can serve as an effective inference tool in its own right, and can be an efficient first step in a more intensive phylogenomic analysis.

  4. Genomic resources in mungbean for future breeding programs

    Directory of Open Access Journals (Sweden)

    Sue K Kim

    2015-08-01

    Among the legume family, mungbean (Vigna radiata) has become one of the important crops in Asia, showing a steady increase in global production. It provides a good source of protein and contains, most notably, folate and iron. Beyond the nutritional value of mungbean, certain features make it a well-suited model organism among legume plants: its small genome size, short life cycle, self-pollination, and close genetic relationship to other legumes. In the past, there have been several efforts to develop molecular markers and linkage maps associated with agronomic traits for the genetic improvement of mungbean and, ultimately, breeding for cultivar development to increase the average yields of mungbean. The recent release of a reference genome of the cultivated mungbean (V. radiata var. radiata VC1973A) and an additional de novo sequencing of a wild relative mungbean (V. radiata var. sublobata) has provided a framework for mungbean genetic and genome research that can further be used for genome-wide association and functional studies to identify genes related to specific agronomic traits. Moreover, the diverse gene pool of wild mungbean comprises valuable genetic resources of beneficial genes that may be helpful in widening the genetic diversity of cultivated mungbean. This review paper covers the research progress on molecular and genomics approaches and the current status of breeding programs that have developed to move toward the ultimate goal of mungbean improvement.

  5. [Strategies of the study on herb genome program].

    Science.gov (United States)

    Chen, Shi-lin; Sun, Yong-zhen; Xu, Jiang; Luo, Hong-mei; Sun, Chao; He, Liu; Cheng, Xiang-lin; Zhang, Bo-li; Xiao, Pei-gen

    2010-07-01

    The Herb Genome Program (HerbGP) includes a series of projects on whole genome sequencing (WGS) and post-genomics research of medicinal plants with unique secondary metabolism pathways and/or those of great medical and pharmaceutical importance. In this paper, we systematically discuss the strategy of HerbGP, from species selection, whole-genome sequencing, assembly and bioinformatics analysis, to post-genomics research. HerbGP will push the study of traditional Chinese medicines to the forefront of life science by selecting a series of plants with unique secondary metabolism pathways as models and introducing "omics" methods into the research of these medicinal plants. HerbGP will provide great opportunities for China to be the leader in the basic research field of traditional Chinese medicine. HerbGP should also have significant impacts on the R&D of natural medicines and the development of medicinal farming through analysis of secondary metabolic pathways and selection of cultivars with good agricultural traits.

  6. 78 FR 18680 - Genomic Medicine Program Advisory Committee, Notice of Meeting

    Science.gov (United States)

    2013-03-27

    ... Medicine Program Advisory Committee will meet on April 11, 2013, in Suite 1000 at the United States Access... AFFAIRS Genomic Medicine Program Advisory Committee, Notice of Meeting The Department of Veterans Affairs... Million Veteran Program, as well as the clinical Genomic Medicine Service. The emerging implications...

  7. Genomic prediction in a breeding program of perennial ryegrass

    DEFF Research Database (Denmark)

    Fé, Dario; Ashraf, Bilal; Greve-Pedersen, Morten

    2015-01-01

    We present a genomic selection study performed on 1918 rye grass families (Lolium perenne L.), which were derived from a commercial breeding program at DLF-Trifolium, Denmark. Phenotypes were recorded on standard plots, across 13 years and in 6 different countries. Variants were identified...... this set. Estimated Breeding Value and prediction accuracies were calculated through two different cross-validation schemes: (i) k-fold (k=10); (ii) leaving out one parent combination at a time, in order to test for accuracy of predicting new families. Accuracies ranged between 0.56 and 0.97 for scheme (i....... A larger set of 1791 F2s was used as a training set to predict EBVs of 127 synthetic families (originated from poly-crosses between 5-11 single plants) for heading date and crown rust resistance. Prediction accuracies were 0.93 and 0.57 respectively. Results clearly demonstrate considerable potential...
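
    Cross-validation scheme (i) in this abstract, k-fold with k=10, can be sketched in Python as follows. This is a generic illustration of k-fold partitioning, not the authors' pipeline: the function name, the fixed random seed, and the use of integer indices to stand in for the 1918 families are all assumptions made for the example.

```python
import random


def k_fold_splits(n_items, k=10, seed=0):
    """Yield (train, test) index lists; each item lands in exactly one test fold.

    Hypothetical helper: indices 0..n_items-1 stand in for families.
    """
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Example: 1918 families split into 10 folds.
splits = list(k_fold_splits(1918, k=10))
```

    In each of the ten rounds a model would be trained on the `train` families and its predictions compared with the observed phenotypes of the held-out `test` families.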

  8. Genomic tools in cowpea breeding programs: status and perspectives

    Directory of Open Access Journals (Sweden)

    Ousmane eBoukar

    2016-06-01

    Cowpea is one of the most important grain legumes in sub-Saharan Africa (SSA). It provides strong support to the livelihood of small-scale farmers through its contributions to their nutritional security, income generation and soil fertility enhancement. Worldwide about 6.5 million metric tons of cowpea are produced annually on about 14.5 million hectares. The low productivity of cowpea is attributable to numerous abiotic and biotic constraints. The abiotic stress factors comprise drought, low soil fertility, and heat while biotic constraints include insects, diseases, parasitic weeds and nematodes. Cowpea farmers also have limited access to quality seeds of improved varieties for planting. Some progress has been made through conventional breeding at international and national research institutions in the last three decades. Cowpea improvement could also benefit from modern breeding methods based on molecular genetic tools. A number of advances in cowpea genetic linkage maps, and quantitative trait loci associated with some desirable traits such as resistance to Striga, Macrophomina, Fusarium wilt, bacterial blight, root-knot nematodes, aphids and foliar thrips have been reported. An improved consensus genetic linkage map has been developed and used to identify QTLs of additional traits. In order to take advantage of these developments single nucleotide polymorphism (SNP) genotyping is being streamlined to establish an efficient workflow supported by genotyping support service (GSS)-client interactions. About 1100 SNPs mapped on the cowpea genome were converted by LGC Genomics to KASP assays. Several cowpea breeding programs have been exploiting these resources to implement molecular breeding, especially for MARS and MABC, to accelerate cowpea variety improvement. The combination of conventional breeding and molecular breeding strategies, with workflow managed through the CGIAR breeding management system (BMS), promises an increase in the number of

  9. Genomic Tools in Cowpea Breeding Programs: Status and Perspectives.

    Science.gov (United States)

    Boukar, Ousmane; Fatokun, Christian A; Huynh, Bao-Lam; Roberts, Philip A; Close, Timothy J

    2016-01-01

    Cowpea is one of the most important grain legumes in sub-Saharan Africa (SSA). It provides strong support to the livelihood of small-scale farmers through its contributions to their nutritional security, income generation and soil fertility enhancement. Worldwide about 6.5 million metric tons of cowpea are produced annually on about 14.5 million hectares. The low productivity of cowpea is attributable to numerous abiotic and biotic constraints. The abiotic stress factors comprise drought, low soil fertility, and heat while biotic constraints include insects, diseases, parasitic weeds, and nematodes. Cowpea farmers also have limited access to quality seeds of improved varieties for planting. Some progress has been made through conventional breeding at international and national research institutions in the last three decades. Cowpea improvement could also benefit from modern breeding methods based on molecular genetic tools. A number of advances in cowpea genetic linkage maps, and quantitative trait loci associated with some desirable traits such as resistance to Striga, Macrophomina, Fusarium wilt, bacterial blight, root-knot nematodes, aphids, and foliar thrips have been reported. An improved consensus genetic linkage map has been developed and used to identify QTLs of additional traits. In order to take advantage of these developments single nucleotide polymorphism (SNP) genotyping is being streamlined to establish an efficient workflow supported by genotyping support service (GSS)-client interactions. About 1100 SNPs mapped on the cowpea genome were converted by LGC Genomics to KASP assays. Several cowpea breeding programs have been exploiting these resources to implement molecular breeding, especially for MARS and MABC, to accelerate cowpea variety improvement. The combination of conventional breeding and molecular breeding strategies, with workflow managed through the CGIAR breeding management system (BMS), promises an increase in the number of improved

  10. GIANT API: an application programming interface for functional genomics

    Science.gov (United States)

    Roberts, Andrew M.; Wong, Aaron K.; Fisk, Ian; Troyanskaya, Olga G.

    2016-01-01

    GIANT API provides biomedical researchers programmatic access to tissue-specific and global networks in humans and model organisms, and associated tools, which includes functional re-prioritization of existing genome-wide association study (GWAS) data. Using tissue-specific interaction networks, researchers are able to predict relationships between genes specific to a tissue or cell lineage, identify the changing roles of genes across tissues and uncover disease-gene associations. Additionally, GIANT API enables computational tools like NetWAS, which leverages tissue-specific networks for re-prioritization of GWAS results. The web services covered by the API include 144 tissue-specific functional gene networks in human, global functional networks for human and six common model organisms and the NetWAS method. GIANT API conforms to the REST architecture, which makes it stateless, cacheable and highly scalable. It can be used by a diverse range of clients including web browsers, command terminals, programming languages and standalone apps for data analysis and visualization. The API is freely available for use at http://giant-api.princeton.edu. PMID:27098035

  11. GIANT API: an application programming interface for functional genomics.

    Science.gov (United States)

    Roberts, Andrew M; Wong, Aaron K; Fisk, Ian; Troyanskaya, Olga G

    2016-07-08

    GIANT API provides biomedical researchers programmatic access to tissue-specific and global networks in humans and model organisms, and associated tools, which includes functional re-prioritization of existing genome-wide association study (GWAS) data. Using tissue-specific interaction networks, researchers are able to predict relationships between genes specific to a tissue or cell lineage, identify the changing roles of genes across tissues and uncover disease-gene associations. Additionally, GIANT API enables computational tools like NetWAS, which leverages tissue-specific networks for re-prioritization of GWAS results. The web services covered by the API include 144 tissue-specific functional gene networks in human, global functional networks for human and six common model organisms and the NetWAS method. GIANT API conforms to the REST architecture, which makes it stateless, cacheable and highly scalable. It can be used by a diverse range of clients including web browsers, command terminals, programming languages and standalone apps for data analysis and visualization. The API is freely available for use at http://giant-api.princeton.edu.

  12. Programming biological operating systems: genome design, assembly and activation.

    Science.gov (United States)

    Gibson, Daniel G

    2014-05-01

    The DNA technologies developed over the past 20 years for reading and writing the genetic code converged when the first synthetic cell was created 4 years ago. An outcome of this work has been an extraordinary set of tools for synthesizing, assembling, engineering and transplanting whole bacterial genomes. Technical progress, options and applications for bacterial genome design, assembly and activation are discussed.

  13. 76 FR 65563 - Genomic Medicine Program Advisory Committee; Notice of Meeting

    Science.gov (United States)

    2011-10-21

    ... (VA) gives notice under Public Law 92-463 (Federal Advisory Committee Act) that the Genomic Medicine... incorporate genomic information into its health care program while applying appropriate ethical oversight and... the public. The purpose of the Committee is to provide advice and make recommendations to the...

  14. Genomic selection needs to be carefully assessed to meet specific requirements in livestock breeding programs

    Directory of Open Access Journals (Sweden)

    Elisabeth Jonas

    2015-02-01

    Genomic selection is a promising development in agriculture, aiming at improved production by exploiting molecular genetic markers to design novel breeding programs and to develop new marker-based models for genetic evaluation. It opens opportunities for research, as novel algorithms and lab methodologies are developed. Genomic selection can be applied in many breeds and species. Further research on the implementation of genomic selection in breeding programs is highly desirable not only for the common good, but also for the private sector (breeding companies). It has been projected that this approach will improve selection routines, especially in species with long reproduction cycles, late or sex-limited or expensive trait recording and for complex traits. The task of integrating genomic selection into existing breeding programs is, however, not straightforward. Despite successful integration into breeding programs for dairy cattle, it has yet to be shown how much emphasis can be given to the genomic information and how much additional phenotypic information is needed from new selection candidates. Genomic selection is already part of future planning in many breeding companies of pigs and beef cattle among others, but further research is needed to fully estimate how effective the use of genomic information will be for the prediction of the performance of future breeding stock. Genomic prediction of production in crossbreeding and across-breed schemes, costs and choice of individuals for genotyping are reasons for a reluctance to fully rely on genomic information for selection decisions. Breeding objectives are highly dependent on the industry and the additional gain when using genomic information has to be considered carefully. This review synthesizes some of the suggested approaches in selected livestock species including cattle, pig, chicken and fish. It outlines tasks to help understand possible consequences when applying genomic information in

  15. SIS: a program to generate draft genome sequence scaffolds for prokaryotes

    Directory of Open Access Journals (Sweden)

    Dias Zanoni

    2012-05-01

    Background: Decreasing costs of DNA sequencing have made prokaryotic draft genome sequences increasingly common. A contig scaffold is an ordering of contigs in the correct orientation. A scaffold can help genome comparisons and guide gap closure efforts. One popular technique for obtaining contig scaffolds is to map contigs onto a reference genome. However, rearrangements that may exist between the query and reference genomes may result in incorrect scaffolds, if these rearrangements are not taken into account. Large-scale inversions are common rearrangement events in prokaryotic genomes. Even in draft genomes it is possible to detect the presence of inversions given sufficient sequencing coverage and a sufficiently close reference genome. Results: We present a linear-time algorithm that can generate a set of contig scaffolds for a draft genome sequence represented in contigs given a reference genome. The algorithm is aimed at prokaryotic genomes and relies on the presence of matching sequence patterns between the query and reference genomes that can be interpreted as the result of large-scale inversions; we call these patterns inversion signatures. Our algorithm is capable of correctly generating a scaffold if at least one member of every inversion signature pair is present in contigs and no inversion signatures have been overwritten in evolution. The algorithm is also capable of generating scaffolds in the presence of any kind of inversion, even though in this general case there is no guarantee that all scaffolds in the scaffold set will be correct. We compare the performance of sis, the program that implements the algorithm, to seven other scaffold-generating programs. The results of our tests show that sis has overall better performance. Conclusions: sis is a new easy-to-use tool to generate contig scaffolds, available both as stand-alone and as a web server. The good performance of sis in our tests adds evidence that large

  16. Genomic tools in pea breeding programs: status and perspectives

    Directory of Open Access Journals (Sweden)

    Nadim Tayeh

    2015-11-01

    Pea (Pisum sativum L.) is an annual cool-season legume and one of the oldest domesticated crops. Dry pea seeds contain 22-25 percent protein, complex starch and fibre constituents and a rich array of vitamins, minerals, and phytochemicals which make them a valuable source for human consumption and livestock feed. Dry pea ranks third to common bean and chickpea as the most widely grown pulse in the world with more than 11 million tonnes produced in 2013. Pea breeding has achieved great success since the time of Mendel’s experiments in the mid-1800s. However, several traits still require significant improvement for better yield stability in a larger growing area. Key breeding objectives in pea include improving biotic and abiotic stress resistance and enhancing yield components and seed quality. Taking advantage of the diversity present in the pea genepool, many mapping populations have been constructed in the last decades and efforts have been deployed to identify loci involved in the control of target traits and further introgress them into elite breeding materials. Pea now benefits from next-generation sequencing and high-throughput genotyping technologies that are paving the way for genome-wide association studies and genomic selection approaches. This review covers the significant development and deployment of genomic tools for pea breeding in recent years. Future prospects are discussed especially in light of current progress towards deciphering the pea genome.

  17. Flexible approaches for teaching computational genomics in a health information management program.

    Science.gov (United States)

    Zhou, Leming; Watzlaf, Valerie; Abdelhak, Mervat

    2013-01-01

    The astonishing improvement of high-throughput biotechnologies in recent years makes it possible to access a huge amount of genomic data. The association between genomic data and genetic disease has already been and will continue to be applied to personalized healthcare. Health information management (HIM) professionals are the ones who will handle personal genetic information and provide solid evidence to support physicians' diagnoses and personalized treatment strategies, and therefore they will need to have the knowledge and skills to process genomic data. In this paper, we describe flexible approaches for teaching a computational genomics course in the HIM program at the University of Pittsburgh. HIM programs at other universities may choose an appropriate approach to fit into their own curriculum.

  18. Identification of functional, endogenous programmed −1 ribosomal frameshift signals in the genome of Saccharomyces cerevisiae

    OpenAIRE

    2006-01-01

    In viruses, programmed −1 ribosomal frameshifting (−1 PRF) signals direct the translation of alternative proteins from a single mRNA. Given that many basic regulatory mechanisms were first discovered in viral systems, the current study endeavored to: (i) identify −1 PRF signals in genomic databases, (ii) apply the protocol to the yeast genome and (iii) test selected candidates at the bench. Computational analyses revealed the presence of 10 340 consensus −1 PRF signals in the yeast genome. Of...

  19. Genome-wide alterations of the DNA replication program during tumor progression

    Science.gov (United States)

    Arneodo, A.; Goldar, A.; Argoul, F.; Hyrien, O.; Audit, B.

    2016-08-01

    Oncogenic stress is a major driving force in the early stages of cancer development. Recent experimental findings reveal that, in precancerous lesions and cancers, activated oncogenes may induce stalling and dissociation of DNA replication forks resulting in DNA damage. Replication timing is emerging as an important epigenetic feature that recapitulates several genomic, epigenetic and functional specificities of even closely related cell types. There is increasing evidence that chromosome rearrangements, the hallmark of many cancer genomes, are intimately associated with the DNA replication program and that epigenetic replication timing changes often precede chromosomic rearrangements. The recent development of a novel methodology to map replication fork polarity using deep sequencing of Okazaki fragments has provided new and complementary genome-wide replication profiling data. We review the results of a wavelet-based multi-scale analysis of genomic and epigenetic data including replication profiles along human chromosomes. These results provide new insight into the spatio-temporal replication program and its dynamics during differentiation. Here our goal is to bring to cancer research, the experimental protocols and computational methodologies for replication program profiling, and also the modeling of the spatio-temporal replication program. To illustrate our purpose, we report very preliminary results obtained for the chronic myelogeneous leukemia, the archetype model of cancer. Finally, we discuss promising perspectives on using genome-wide DNA replication profiling as a novel efficient tool for cancer diagnosis, prognosis and personalized treatment.

  20. A Genome Sequencing Program for Novel Undiagnosed Diseases

    Science.gov (United States)

    Bloss, Cinnamon S.; Scott-Van Zeeland, Ashley A.; Topol, Sarah E.; Darst, Burcu F.; Boeldt, Debra L.; Erikson, Galina A.; Bethel, Kelly J.; Bjork, Robert L.; Friedman, Jennifer R.; Hwynn, Nelson; Patay, Bradley A.; Pockros, Paul J.; Scott, Erick R.; Simon, Ronald A.; Williams, Gary W.; Schork, Nicholas J.; Topol, Eric J.; Torkamani, Ali

    2015-01-01

    Purpose: The Scripps Idiopathic Diseases of huMan (IDIOM) study aims to discover novel gene-disease relationships and provide molecular genetic diagnosis and treatment guidance for individuals with novel diseases using genome sequencing integrated with clinical assessment and multidisciplinary case review. Methods: Here we describe the IDIOM study operational protocol and initial results. Results: 121 cases underwent first tier review by the principal investigators to determine if the primary inclusion criteria were satisfied, 59 (48.8%) underwent second tier review by our clinician-scientist review panel, and 17 (14.0%) patients and their family members were enrolled. 60% of cases resulted in a plausible molecular diagnosis. 18% of cases resulted in a confirmed molecular diagnosis. 2 of 3 confirmed cases led to the identification of novel gene-disease relationships. In the third confirmed case, a previously described but unrecognized disease was revealed. In all three confirmed cases, a new clinical management strategy was initiated based on the genetic findings. Conclusions: Genome sequencing provides tangible clinical benefit for individuals with idiopathic genetic disease, not only in the context of molecular genetic diagnosis of known rare conditions, but also in cases where prior clinical information regarding a new genetic disorder is lacking. PMID:25790160
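
    The review-funnel percentages quoted in this abstract can be reproduced with a few lines of arithmetic. The sketch below is only an illustration: the variable names are hypothetical, and it assumes (the abstract does not say so explicitly) that the 18% confirmed-diagnosis rate is measured against the 17 enrolled cases.

```python
def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)

first_tier = 121   # cases given first tier review (from the abstract)
second_tier = 59   # cases given second tier review (from the abstract)
enrolled = 17      # patients enrolled (from the abstract)
confirmed = 3      # confirmed molecular diagnoses (from the abstract)

second_tier_pct = percent(second_tier, first_tier)
enrolled_pct = percent(enrolled, first_tier)
# Assumption: the confirmed rate uses the enrolled cases as its denominator.
confirmed_pct = percent(confirmed, enrolled)

print(second_tier_pct, enrolled_pct, confirmed_pct)
```

    Under that assumption, 3 of 17 comes to 17.6%, which matches the reported 18% after rounding to a whole percent.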

  1. 77 FR 16898 - Genomic Medicine Program Advisory Committee, Notice of Meeting

    Science.gov (United States)

    2012-03-22

    ... (VA) gives notice under Public Law 92-463 (Federal Advisory Committee Act) that the Genomic Medicine... health care program while applying appropriate ethical oversight and protecting the privacy of Veterans... Highway, Arlington, Virginia, from 9 a.m. to 5 p.m. The meeting is open to the public. The purpose of the...

  2. Potential benefits of genomic selection on genetic gain of small ruminant breeding programs.

    Science.gov (United States)

    Shumbusho, F; Raoul, J; Astruc, J M; Palhiere, I; Elsen, J M

    2013-08-01

    In conventional small ruminant breeding programs, only pedigree and phenotype records are used to make selection decisions but prospects of including genomic information are now under consideration. The objective of this study was to assess the potential benefits of genomic selection on the genetic gain in French sheep and goat breeding designs of today. Traditional and genomic scenarios were modeled with deterministic methods for 3 breeding programs. The models included decisional variables related to male selection candidates, progeny testing capacity, and economic weights that were optimized to maximize annual genetic gain (AGG) of i) a meat sheep breeding program that improved a meat trait of heritability (h²) = 0.30 and a maternal trait of h² = 0.09 and ii) dairy sheep and goat breeding programs that improved a milk trait of h² = 0.30. Values of ±0.20 of genetic correlation between meat and maternal traits were considered to study their effects on AGG. The Bulmer effect was accounted for and the results presented here are the averages of AGG after 10 generations of selection. Results showed that current traditional breeding programs provide an AGG of 0.095 genetic standard deviation (σa) for meat and 0.061 σa for maternal trait in meat breed and 0.147 σa and 0.120 σa in sheep and goat dairy breeds, respectively. By optimizing decisional variables, the AGG with traditional selection methods increased to 0.139 σa for meat and 0.096 σa for maternal traits in meat breeding programs and to 0.174 σa and 0.183 σa in dairy sheep and goat breeding programs, respectively. With a medium-sized reference population (nref) of 2,000 individuals, the best genomic scenarios gave an AGG that was 17.9% greater than with traditional selection methods with optimized values of decisional variables for combined meat and maternal traits in meat sheep, 51.7% in dairy sheep, and 26.2% in dairy goats. The superiority of genomic schemes increased with the size of the

  3. The Ethical, Legal, and Social Implications Program of the National Human Genome Research Institute: reflections on an ongoing experiment.

    Science.gov (United States)

    McEwen, Jean E; Boyer, Joy T; Sun, Kathie Y; Rothenberg, Karen H; Lockhart, Nicole C; Guyer, Mark S

    2014-01-01

    For more than 20 years, the Ethical, Legal, and Social Implications (ELSI) Program of the National Human Genome Research Institute has supported empirical and conceptual research to anticipate and address the ethical, legal, and social implications of genomics. As a component of the agency that funds much of the underlying science, the program has always been an experiment. The ever-expanding number of issues the program addresses and the relatively low level of commitment on the part of other funding agencies to support such research make setting priorities especially challenging. Program-supported studies have had a significant impact on the conduct of genomics research, the implementation of genomic medicine, and broader public policies. The program's influence is likely to grow as ELSI research, genomics research, and policy development activities become increasingly integrated. Achieving the benefits of increased integration while preserving the autonomy, objectivity, and intellectual independence of ELSI investigators presents ongoing challenges and new opportunities.

  4. Data Standards for the Genomes to Life Program

    Energy Technology Data Exchange (ETDEWEB)

    Arkin, Adam; Ambrosiano, John; Babnigg, Gyorgy; Frank, Ed; Geist, Al; Giometti, Carol; Jacobsen, Janet; Samatova, Nagiza; Slater, Nancy; Taylor, Ron

    2004-01-31

    Existing GTL Projects already have produced volumes of data and, over the course of the next five years, will produce an estimated hundreds, or possibly thousands, of terabytes of data from hundreds of experiments conducted at dozens of laboratories in National Labs and universities across the nation. These data will be the basis for publications by individual researchers, research groups, and multi-institutional collaborations, and the basis for future DOE decisions on funding further research in bioremediation. The short-term and long-term value of the data to project participants, to the DOE, and to the nation depends, however, on being able to access the data and on how, or whether, the data are archived. The ability to access data is the starting point for data analysis and interpretation, data integration, data mining, and development of data-driven models. Limited or inefficient data access means that less data are analyzed in a cost-effective and timely manner. Data production in the GTL Program will likely outstrip, or may have already outstripped, the ability to analyze the data. Being able to access data depends on two key factors: data standards and implementation of the data standards. For the purpose of this proposal, a data standard is defined as a standard, documented way in which data and information about the data are described. The attributes of the experiment in which the data were collected need to be known and the measurements corresponding to the data collected need to be described. In general terms, a data standard could be a form (electronic or paper) that is completed by a researcher or a document that prescribes how a protocol or experiment should be described in writing. Data standards are critical to data access because they provide a framework for organizing and managing data. Researchers spend significant amounts of time managing data and information about experiments using lab notebooks, computer files, Excel spreadsheets, etc. In addition, data output format

  5. Efficient and exact maximum likelihood quantisation of genomic features using dynamic programming.

    Science.gov (United States)

    Song, Mingzhou; Haralick, Robert M; Boissinot, Stéphane

    2010-01-01

    An efficient and exact dynamic programming algorithm is introduced to quantise a continuous random variable into a discrete random variable that maximises the likelihood of the quantised probability distribution for the original continuous random variable. Quantisation is often useful before statistical analysis and modelling of large discrete network models from observations of multiple continuous random variables. The quantisation algorithm is applied to genomic features including the recombination rate distribution across the chromosomes and the non-coding transposable element LINE-1 in the human genome. The association pattern is studied between the recombination rate, obtained by quantisation at genomic locations around LINE-1 elements, and the length groups of LINE-1 elements, also obtained by quantisation on LINE-1 length. The exact and density-preserving quantisation approach provides an alternative superior to the inexact and distance-based univariate iterative k-means clustering algorithm for discretisation.
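
    To make the idea of likelihood-maximising quantisation by dynamic programming concrete, here is a minimal sketch. It is not the authors' algorithm: it assumes a fixed number of bins `k`, distinct one-dimensional values, a piecewise-constant (histogram) density, and candidate cut points at midpoints between neighbouring sorted values. The function name `ml_quantise` and all of these choices are illustrative assumptions.

```python
import math

def ml_quantise(values, k):
    """Split values into k contiguous bins that maximise the log-likelihood
    of a histogram density. Returns (cut_points, log_likelihood)."""
    x = sorted(values)
    n = len(x)
    if n < 2 or len(set(x)) != n:
        raise ValueError("this sketch assumes at least two distinct values")
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of values")

    # Candidate bin edges: data minimum, midpoints between neighbours, maximum.
    edges = [x[0]] + [(x[t - 1] + x[t]) / 2 for t in range(1, n)] + [x[-1]]

    def score(i, j):
        # Log-likelihood contribution of one bin holding sorted points i..j-1.
        m = j - i
        width = edges[j] - edges[i]
        return m * math.log(m / (n * width))

    neg_inf = float("-inf")
    # best[b][j]: best log-likelihood for the first j points using b bins.
    best = [[neg_inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for b in range(1, k + 1):
        for j in range(b, n + 1):
            for i in range(b - 1, j):
                cand = best[b - 1][i] + score(i, j)
                if cand > best[b][j]:
                    best[b][j] = cand
                    back[b][j] = i

    # Walk the back-pointers to recover the interior cut points.
    cuts, j = [], n
    for b in range(k, 1, -1):
        j = back[b][j]
        cuts.append(edges[j])
    cuts.reverse()
    return cuts, best[k][n]

data = [0.0, 0.1, 0.2, 0.3, 10.0, 10.1, 10.2, 10.3]
cuts_one, loglik_one = ml_quantise(data, 1)
cuts_two, loglik_two = ml_quantise(data, 2)
```

    The triple loop costs O(k n²) time, so this is only practical for small inputs. Because the search covers every contiguous partition into `k` bins, the result is exact for this simplified objective, unlike an iterative heuristic such as k-means.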

  6. A genome resource to address mechanisms of developmental programming: determination of the fetal sheep heart transcriptome.

    Science.gov (United States)

    Cox, Laura A; Glenn, Jeremy P; Spradling, Kimberly D; Nijland, Mark J; Garcia, Roy; Nathanielsz, Peter W; Ford, Stephen P

    2012-06-15

    The pregnant sheep has provided seminal insights into reproduction related to animal and human development (ovarian function, fertility, implantation, fetal growth, parturition and lactation). Fetal sheep physiology has been extensively studied since 1950, contributing significantly to the basis for our understanding of many aspects of fetal development and behaviour that remain in use in clinical practice today. Understanding mechanisms requires the combination of systems approaches uniquely available in fetal sheep with the power of genomic studies. Absence of the full range of sheep genomic resources has limited the full realization of the power of this model, impeding progress in emerging areas of pregnancy biology such as developmental programming. We have examined the expressed fetal sheep heart transcriptome using high-throughput sequencing technologies. In so doing we identified 36,737 novel transcripts and describe genes, gene variants and pathways relevant to fundamental developmental mechanisms. Genes with the highest expression levels and with novel exons in the fetal heart transcriptome are known to play central roles in muscle development. We show that high-throughput sequencing methods can generate extensive transcriptome information in the absence of an assembled and annotated genome for that species. The gene sequence data obtained provide a unique genomic resource for sheep specific genetic technology development and, combined with the polymorphism data, augment annotation and assembly of the sheep genome. In addition, identification and pathway analysis of novel fetal sheep heart transcriptome splice variants is a first step towards revealing mechanisms of genetic variation and gene environment interactions during fetal heart development.

  7. Strategies for use of reproductive technologies in genomic dairy cattle breeding programs

    DEFF Research Database (Denmark)

    Thomasen, Jørn Rind; Sørensen, Anders Christian

    A simulation study was performed for testing the effect of using reproductive technologies in a genomic dairy cattle young bull breeding scheme. The breeding scheme parameters: 1) number of donors, 2) number of progeny per donor, 3) age of the donor, 4) number of sires, and 5) reliability...... of genomic breeding values. The breeding schemes were evaluated according to genetic gain and rate of inbreeding. The relative gain by use of reproductive technologies is 11 to 84 percent points depending on the choice of other breeding scheme parameters. A large donor program with high selection intensity...... of sires provides the highest genetic gain. A relatively higher genetic gain is obtained for higher reliability of GEBV. Extending the donor program and number of selected bulls has a major effect of reducing the rate of inbreeding without compromising genetic gain....

  8. Economic evaluation of genomic selection in small ruminants: a sheep meat breeding program.

    Science.gov (United States)

    Shumbusho, F; Raoul, J; Astruc, J M; Palhiere, I; Lemarié, S; Fugeray-Scarbel, A; Elsen, J M

    2016-06-01

    Recent genomic evaluation studies using real data and predicting genetic gain by modeling breeding programs have reported moderate expected benefits from the replacement of classic selection schemes by genomic selection (GS) in small ruminants. The objectives of this study were to compare the cost, monetary genetic gain and economic efficiency of classic selection and GS schemes in the meat sheep industry. Deterministic methods were used to model selection based on multi-trait indices from a sheep meat breeding program. Decisional variables related to male selection candidates and progeny testing were optimized to maximize the annual monetary genetic gain (AMGG), that is, a weighted sum of meat and maternal traits annual genetic gains. For GS, a reference population of 2000 individuals was assumed and genomic information was available for evaluation of male candidates only. In the classic selection scheme, males breeding values were estimated from own and offspring phenotypes. In GS, different scenarios were considered, differing by the information used to select males (genomic only, genomic+own performance, genomic+offspring phenotypes). The results showed that all GS scenarios were associated with higher total variable costs than classic selection (if the cost of genotyping was 123 euros/animal). In terms of AMGG and economic returns, GS scenarios were found to be superior to classic selection only if genomic information was combined with their own meat phenotypes (GS-Pheno) or with their progeny test information. The predicted economic efficiency, defined as returns (proportional to number of expressions of AMGG in the nucleus and commercial flocks) minus total variable costs, showed that the best GS scenario (GS-Pheno) was up to 15% more efficient than classic selection. For all selection scenarios, optimization increased the overall AMGG, returns and economic efficiency. As a conclusion, our study shows that some forms of GS strategies are more advantageous

  9. A survey of application: genomics and genetic programming, a new frontier.

    Science.gov (United States)

    Khan, Mohammad Wahab; Alam, Mansaf

    2012-08-01

    The aim of this paper is to provide an introduction to the rapidly developing field of genetic programming (GP). Particular emphasis is placed on the application of GP to genomics. First, the basic methodology of GP is introduced. This is followed by a review of applications in the areas of gene network inference, gene expression data analysis, SNP analysis, epistasis analysis and gene annotation. Finally, the paper concludes by suggesting avenues for future research on genetic programming, opportunities to extend the technique, and areas for possible practical applications.

  10. Design and implementation of a genomics field trip program aimed at secondary school students.

    Directory of Open Access Journals (Sweden)

    Jennifer McQueen

    With the rapid pace of advancements in biological research brought about by the application of computer science and information technology, we believe the time is right for introducing genomics and bioinformatics tools and concepts to secondary school students. Our approach has been to offer a full-day field trip in our research facility where secondary school students carry out experiments at the laboratory bench and on a laptop computer. This experience offers benefits for students, teachers, and field trip instructors. In delivering a wide variety of science outreach and education programs, we have learned that a number of factors contribute to designing a successful experience for secondary school students. First, it is important to engage students with authentic and fun activities that are linked to real-world applications and/or research questions. Second, connecting with a local high school teacher to pilot programs and linking to curricula taught in secondary schools will enrich the field trip experience. Whether or not programs are linked directly to local teachers, it is important to be flexible and build in mechanisms for collecting feedback in field trip programs. Finally, graduate students can be very powerful mentors for students and should be encouraged to share their enthusiasm for science and to talk about career paths. Our experiences suggest a real need for effective science outreach programs at the secondary school level and that genomics and bioinformatics are ideal areas to explore.

  11. Design and implementation of a genomics field trip program aimed at secondary school students.

    Science.gov (United States)

    McQueen, Jennifer; Wright, Jody J; Fox, Joanne A

    2012-01-01

    With the rapid pace of advancements in biological research brought about by the application of computer science and information technology, we believe the time is right for introducing genomics and bioinformatics tools and concepts to secondary school students. Our approach has been to offer a full-day field trip in our research facility where secondary school students carry out experiments at the laboratory bench and on a laptop computer. This experience offers benefits for students, teachers, and field trip instructors. In delivering a wide variety of science outreach and education programs, we have learned that a number of factors contribute to designing a successful experience for secondary school students. First, it is important to engage students with authentic and fun activities that are linked to real-world applications and/or research questions. Second, connecting with a local high school teacher to pilot programs and linking to curricula taught in secondary schools will enrich the field trip experience. Whether or not programs are linked directly to local teachers, it is important to be flexible and build in mechanisms for collecting feedback in field trip programs. Finally, graduate students can be very powerful mentors for students and should be encouraged to share their enthusiasm for science and to talk about career paths. Our experiences suggest a real need for effective science outreach programs at the secondary school level and that genomics and bioinformatics are ideal areas to explore.

  12. Potential of gene drives with genome editing to increase genetic gain in livestock breeding programs.

    Science.gov (United States)

    Gonen, Serap; Jenko, Janez; Gorjanc, Gregor; Mileham, Alan J; Whitelaw, C Bruce A; Hickey, John M

    2017-01-04

    This paper uses simulation to explore how gene drives can increase genetic gain in livestock breeding programs. Gene drives are naturally occurring phenomena that cause a mutation on one chromosome to copy itself onto its homologous chromosome. We simulated nine different breeding and editing scenarios with a common overall structure. Each scenario began with 21 generations of selection, followed by 20 generations of selection based on true breeding values where the breeder used selection alone, selection in combination with genome editing, or selection with genome editing and gene drives. In the scenarios that used gene drives, we varied the probability of successfully incorporating the gene drive. For each scenario, we evaluated genetic gain, genetic variance, rate of change in inbreeding, number of distinct quantitative trait nucleotides (QTN) edited, rate of increase in favourable allele frequencies of edited QTN and the time to fix favourable alleles. Gene drives enhanced the benefits of genome editing in seven ways: (1) they amplified the increase in genetic gain brought about by genome editing; (2) they amplified the rate of increase in the frequency of favourable alleles and reduced the time it took to fix them; (3) they enabled more rapid targeting of QTN with lesser effect for genome editing; (4) they distributed fixed editing resources across a larger number of distinct QTN across generations; (5) they focussed editing on a smaller number of QTN within a given generation; (6) they reduced the level of inbreeding when editing a subset of the sires; and (7) they increased the efficiency of converting genetic variation into genetic gain. Genome editing in livestock breeding results in short-, medium- and long-term increases in genetic gain. The increase in genetic gain occurs because editing increases the frequency of favourable alleles in the population. Gene drives accelerate the increase in allele frequency

  13. Report on the Imaging Workshop for the Genomes to Life Program, April 16-18, 2002

    Energy Technology Data Exchange (ETDEWEB)

    Colson, Steven

    2003-08-04

    This report is a result of the Imaging Workshop for the Genomes to Life (GTL) program held April 16-19, 2002, in Charlotte, North Carolina. The meeting was sponsored by the Office of Biological and Environmental Research and the Office of Advanced Scientific Computing Research of the U.S. Department of Energy's (DOE) Office of Science. The purpose of the workshop was to project a broad vision for future needs and determine the value of imaging to GTL program research. The workshop included four technical sessions with plenary lectures on biology and technology perspectives and technical presentations on needs and approaches as they related to the following areas of the GTL program: (1) Molecular machines (protein complexes); (2) Intracellular and cellular structure, function, and processes; (3) Multicellular: Monoclonal and heterogeneous multicellular systems, cell-cell signaling, and model systems; and (4) Cells in situ and in vivo: Bacteria in the natural environment, microenvironment, and in vivo systems.

  14. The Human Genome Project and Mental Retardation: An Educational Program. Final Progress Report

    Energy Technology Data Exchange (ETDEWEB)

    Davis, Sharon

    1999-05-03

    The Arc, a national organization on mental retardation, conducted an educational program for members, many of whom have a family member with a genetic condition causing mental retardation. The project informed members about the Human Genome scientific efforts, conducted training regarding ethical, legal and social implications (ELSI), and involved members in issue discussions. Short reports and fact sheets on genetic and ELSI topics were disseminated to 2,200 of the Arc's leaders across the country and to other interested individuals. Materials produced by the project can be found on the Arc's web site, TheArc.org.

  15. READSCAN: A fast and scalable pathogen discovery program with accurate genome relative abundance estimation

    KAUST Repository

    Naeem, Raeece

    2012-11-28

    Summary: READSCAN is a highly scalable parallel program to identify non-host sequences (of potential pathogen origin) and estimate their genome relative abundance in high-throughput sequence datasets. READSCAN accurately classified human and viral sequences in a simulated dataset of 20.1 million reads in <27 min using a small Beowulf compute cluster with 16 nodes (Supplementary Material). Availability: http://cbrc.kaust.edu.sa/readscan Contact: raeece.naeem@gmail.com Supplementary information: Supplementary data are available at Bioinformatics online.

  16. Test Data Sets and Evaluation of Gene Prediction Programs on the Rice Genome

    Institute of Scientific and Technical Information of China (English)

    Heng Li; Tao Liu; Hai-Hong Li; Yan Li; Li-Jun Fang; Hui-Min Xie; Wei-Mou Zheng; Bai-Lin Hao; Jin-Song Liu; Zhao Xu; Jiao Jin; Lin Fang; Lei Gao; Yu-Dong Li; Zi-Xing Xing; Shao-Gen Gao

    2005-01-01

    With several rice genome projects approaching completion, gene prediction/finding by computer algorithms has become an urgent task. Two test sets were constructed by mapping the newly published 28,469 full-length KOME rice cDNAs to the RGP BAC clone sequences of Oryza sativa ssp. japonica: a single-gene set of 550 sequences and a multi-gene set of 62 sequences with 271 genes. These data sets were used to evaluate five ab initio gene prediction programs: RiceHMM, GlimmerR, GeneMark, FGENSH and BGF. The predictions were compared at the nucleotide, exon and whole-gene-structure levels using commonly accepted measures and several new measures. The test results show that performance has improved in chronological order. At the same time, the complementarity of the programs hints at the possibility of further improvement and at the feasibility of reaching better performance by combining several gene-finders.
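
    The abstract does not spell out its "commonly accepted measures". A minimal sketch of one widely used pair, nucleotide-level sensitivity and specificity, is given below, assuming the usual gene-finding convention in which sensitivity is TP / (TP + FN) and specificity is TP / (TP + FP). The function names and the inclusive (start, end) exon representation are illustrative assumptions, not taken from the paper.

```python
def covered_positions(exons):
    """Expand inclusive (start, end) exon intervals into a set of positions."""
    positions = set()
    for start, end in exons:
        positions.update(range(start, end + 1))
    return positions


def nucleotide_accuracy(annotated, predicted):
    """Return (sensitivity, specificity) at the nucleotide level.

    sensitivity = coding bases correctly predicted / annotated coding bases
    specificity = coding bases correctly predicted / predicted coding bases
    """
    truth = covered_positions(annotated)
    pred = covered_positions(predicted)
    true_positives = len(truth & pred)
    sensitivity = true_positives / len(truth) if truth else 0.0
    specificity = true_positives / len(pred) if pred else 0.0
    return sensitivity, specificity


if __name__ == "__main__":
    annotated = [(1, 100), (201, 300)]   # hypothetical cDNA-derived exons
    predicted = [(1, 100)]               # hypothetical program output
    print(nucleotide_accuracy(annotated, predicted))
```

    Exon-level and whole-gene measures follow the same pattern but count exact interval or gene-structure matches rather than individual bases.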

  17. Highly precise and developmentally programmed genome assembly in Paramecium requires ligase IV-dependent end joining.

    Directory of Open Access Journals (Sweden)

    Aurélie Kapusta

    2011-04-01

    During the sexual cycle of the ciliate Paramecium, assembly of the somatic genome includes the precise excision of tens of thousands of short, non-coding germline sequences (Internal Eliminated Sequences or IESs), each one flanked by two TA dinucleotides. It has been reported previously that these genome rearrangements are initiated by the introduction of developmentally programmed DNA double-strand breaks (DSBs), which depend on the domesticated transposase PiggyMac. These DSBs all exhibit a characteristic geometry, with 4-base 5' overhangs centered on the conserved TA, and may readily align and undergo ligation with minimal processing. However, the molecular steps and actors involved in the final and precise assembly of somatic genes have remained unknown. We demonstrate here that Ligase IV and Xrcc4p, core components of the non-homologous end-joining pathway (NHEJ), are required both for the repair of IES excision sites and for the circularization of excised IESs. The transcription of LIG4 and XRCC4 is induced early during the sexual cycle and a Lig4p-GFP fusion protein accumulates in the developing somatic nucleus by the time IES excision takes place. RNAi-mediated silencing of either gene results in the persistence of free broken DNA ends, apparently protected against extensive resection. At the nucleotide level, controlled removal of the 5'-terminal nucleotide occurs normally in LIG4-silenced cells, while nucleotide addition to the 3' ends of the breaks is blocked, together with the final joining step, indicative of a coupling between NHEJ polymerase and ligase activities. Taken together, our data indicate that IES excision is a "cut-and-close" mechanism, which involves the introduction of initiating double-strand cleavages at both ends of each IES, followed by DSB repair via highly precise end joining. This work broadens our current view on how the cellular NHEJ pathway has cooperated with domesticated transposases for the emergence of new

  18. The mitochondrial genome of the hexactinellid sponge Aphrocallistes vastus: Evidence for programmed translational frameshifting

    Directory of Open Access Journals (Sweden)

    Leys Sally P

    2008-01-01

    Background: Mitochondrial genomes (mtDNA) of numerous sponges have been sequenced as part of an ongoing effort to resolve the class-level phylogeny of the Porifera, as well as to place the various lower metazoan groups on the animal-kingdom tree. Most recently, the partial mtDNAs of two glass sponges, class Hexactinellida, were reported. While previous phylogenetic estimations based on these data remain uncertain due to insufficient taxon sampling and accelerated rates of evolution, the mtDNA molecules themselves reveal interesting traits that may be unique to hexactinellids. Here we determined the first complete mitochondrial genome of a hexactinellid sponge, Aphrocallistes vastus, and compared it to published poriferan mtDNAs to further describe characteristics specific to hexactinellid and other sponge mitochondrial genomes. Results: The A. vastus mtDNA consisted of a 17,427 base pair circular molecule containing thirteen protein-coding genes, divergent large and small subunit ribosomal RNAs, and a reduced set of 18 tRNAs. The A. vastus mtDNA showed a typical hexactinellid nucleotide composition and shared a large synteny with the other sequenced glass sponge mtDNAs. It also contained an unidentified open reading frame and large intergenic space region. Two frameshifts, in the cox3 and nad6 genes, were not corrected by RNA editing, but rather possessed identical shift sites marked by the extremely rare tryptophan codon (UGG) followed by the common glycine codon (GGA) in the +1 frame. Conclusion: Hexactinellid mtDNAs have shown similar trends in gene content, nucleotide composition, and codon usage, and have retained a large gene synteny. Analysis of the mtDNA of A. vastus has provided evidence diagnostic for +1 programmed translational frameshifting, a phenomenon disparately reported throughout the animal kingdom, but present in the hexactinellid mtDNAs that have been sequenced to date.

  19. A mixed-integer linear programming approach to the reduction of genome-scale metabolic networks.

    Science.gov (United States)

    Röhl, Annika; Bockmayr, Alexander

    2017-01-03

    Constraint-based analysis has become a widely used method to study metabolic networks. While some of the associated algorithms can be applied to genome-scale network reconstructions with several thousands of reactions, others are limited to small or medium-sized models. In 2015, Erdrich et al. introduced a method called NetworkReducer, which reduces large metabolic networks to smaller subnetworks, while preserving a set of biological requirements that can be specified by the user. As early as 2001, Burgard et al. developed a mixed-integer linear programming (MILP) approach for computing minimal reaction sets under a given growth requirement. Here we present an MILP approach for computing minimum subnetworks with the given properties. Minimality (with respect to the number of active reactions) is not guaranteed by NetworkReducer, while the method by Burgard et al. does not allow the different biological requirements to be specified. Our procedure is about 5-10 times faster than NetworkReducer and can enumerate all minimum subnetworks when several exist. This allows identifying common reactions that are present in all subnetworks, and reactions appearing in alternative pathways. Applying complex analysis methods to genome-scale metabolic networks is often not possible in practice. Thus it may become necessary to reduce the size of the network while keeping important functionalities. We propose an MILP solution to this problem. Compared to previous work, our approach is more efficient and allows computing not only one, but all minimum subnetworks satisfying the required properties.

  20. Tool for rapid annotation of microbial SNPs (TRAMS): a simple program for rapid annotation of genomic variation in prokaryotes.

    Science.gov (United States)

    Reumerman, Richard A; Tucker, Nicholas P; Herron, Paul R; Hoskisson, Paul A; Sangal, Vartul

    2013-09-01

    Next generation sequencing (NGS) has been widely used to study genomic variation in a variety of prokaryotes. Single nucleotide polymorphisms (SNPs) resulting from genomic comparisons need to be annotated for their functional impact on the coding sequences. We have developed a program, TRAMS, for functional annotation of genomic SNPs which is available to download as a single file executable for WINDOWS users with limited computational experience and as a Python script for Mac OS and Linux users. TRAMS needs a tab delimited text file containing SNP locations, reference nucleotide and SNPs in variant strains along with a reference genome sequence in GenBank or EMBL format. SNPs are annotated as synonymous, nonsynonymous or nonsense. Nonsynonymous SNPs in start and stop codons are separated as non-start and non-stop SNPs, respectively. SNPs in multiple overlapping features are annotated separately for each feature and multiple nucleotide polymorphisms within a codon are combined before annotation. We have also developed a workflow for Galaxy, a highly used tool for analysing NGS data, to map short reads to a reference genome and extract and annotate the SNPs. TRAMS is a simple program for rapid and accurate annotation of SNPs that will be very useful for microbiologists in analysing genomic diversity in microbial populations.
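
    The core classification step described above (synonymous, nonsynonymous or nonsense) can be sketched in a few lines. This is an illustrative sketch only, not the TRAMS source code: the function name, the 0-based coordinate convention and the forward-strand coding sequence input are assumptions, and the separate non-start and non-stop categories, overlapping features and multi-SNP codons mentioned in the abstract are not handled.

```python
from itertools import product

# Standard genetic code in the conventional T, C, A, G ordering
# ("*" marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def annotate_snp(cds: str, pos: int, alt: str) -> str:
    """Classify a single-base substitution in a coding sequence.

    cds : coding sequence on the coding strand, length a multiple of 3
    pos : 0-based position of the SNP within `cds` (assumed convention)
    alt : variant base at that position
    """
    offset = pos % 3
    codon_start = pos - offset
    ref_codon = cds[codon_start:codon_start + 3].upper()
    alt_codon = ref_codon[:offset] + alt.upper() + ref_codon[offset + 1:]
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"
    return "nonsynonymous"


if __name__ == "__main__":
    cds = "ATGGCTTGGTAA"  # hypothetical gene: Met-Ala-Trp-stop
    for pos, alt in [(5, "C"), (8, "A"), (3, "A")]:
        print(pos, alt, annotate_snp(cds, pos, alt))
```

    A fuller implementation would read SNP positions from the tab-delimited input and map them onto features parsed from the GenBank or EMBL reference, including reverse-strand genes.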

  1. 77 FR 58913 - Genomic Medicine Program Advisory Committee, Notice of Meeting

    Science.gov (United States)

    2012-09-24

    ... applying appropriate ethical oversight and protecting the privacy of Veterans. The meeting focus will be on... challenges, and a continued discussion of the incorporation of genomic data, particularly whole genome...

  2. Genetic testing and Alzheimer disease: recommendations of the Stanford Program in Genomics, Ethics, and Society.

    Science.gov (United States)

    McConnell, L M; Koenig, B A; Greely, H T; Raffin, T A

    1999-01-01

    Several genes associated with Alzheimer disease (AD) have been localized and cloned; two genetic tests are already commercially available, and new tests are being developed. Genetic testing for AD--either for disease prediction or for diagnosis--raises critical ethical concerns. The multidisciplinary Alzheimer Disease Working Group of the Stanford Program in Genomics, Ethics, and Society (PGES) presents comprehensive recommendations on genetic testing for AD. The Group concludes that under current conditions, genetic testing for AD prediction or diagnosis is only rarely appropriate. Criteria for judging the readiness of a test for introduction into routine clinical practice typically rely heavily on evaluation of technical efficacy. PGES recommends a broader and more comprehensive approach, considering: 1) the unique social and historical meanings of AD; 2) the availability of procedures to promote good surrogate decision making for incompetent patients and to safeguard confidentiality; 3) access to sophisticated genetic counselors able to communicate complex risk information and effectively convey the social costs and psychological burdens of testing, such as unintentional disclosure of predictive genetic information to family members; 4) protection from inappropriate advertising and marketing of genetic tests; and 5) recognition of the need for public education about the meaning and usefulness of predictive and diagnostic tests for AD. In this special issue of Genetic Testing, the PGES recommendations are published along with comprehensive background papers authored by Working Group members.

  3. MaxSSmap: a GPU program for mapping divergent short reads to genomes with the maximum scoring subsequence.

    Science.gov (United States)

    Turki, Turki; Roshan, Usman

    2014-11-15

    Programs based on hash tables and the Burrows-Wheeler transform are very fast for mapping short reads to genomes but have low accuracy in the presence of mismatches and gaps. Such reads can be aligned accurately with the Smith-Waterman algorithm, but it can take hours or days to map millions of reads even for bacterial genomes. We introduce a GPU program called MaxSSmap with the aim of achieving comparable accuracy to Smith-Waterman but with faster runtimes. Like most programs, MaxSSmap identifies a local region of the genome and then performs exact alignment. Instead of using hash tables or Burrows-Wheeler in the first part, MaxSSmap calculates the maximum scoring subsequence score between the read and disjoint fragments of the genome in parallel on a GPU and selects the highest scoring fragment for exact alignment. We evaluate MaxSSmap's accuracy and runtime when mapping simulated Illumina E. coli and human chromosome one reads of different lengths and 10% to 30% mismatches with gaps to the E. coli genome and human chromosome one. We also demonstrate applications on real data by mapping ancient horse DNA reads to modern genomes and unmapped paired reads from NA12878 in 1000 Genomes. We show that MaxSSmap attains comparably high accuracy and low error to fast Smith-Waterman programs yet has much lower runtimes. We show that MaxSSmap can map reads rejected by BWA and NextGenMap with high accuracy and low error much faster than if Smith-Waterman were used. At short read lengths of 36 and 51, both MaxSSmap and Smith-Waterman have lower accuracy than at longer read lengths. On real data MaxSSmap produces many alignments with high score and mapping quality that are not given by NextGenMap and BWA. The MaxSSmap source code in CUDA and OpenCL is freely available from http://www.cs.njit.edu/usman/MaxSSmap.
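
    The fragment-selection idea can be sketched on a CPU as follows. This is a simplified illustration, not the MaxSSmap CUDA or OpenCL code: it assumes fragments of the same length as the read, a position-by-position comparison without gaps, and hypothetical scores of +1 per match and -1 per mismatch. The maximum scoring subsequence is found with the standard linear-time running-sum scan.

```python
def max_scoring_subsequence(read: str, fragment: str,
                            match: int = 1, mismatch: int = -1) -> int:
    """Best contiguous run score when read and fragment are compared
    position by position (no gaps)."""
    best = current = 0
    for a, b in zip(read, fragment):
        # Reset to zero whenever the running score would go negative.
        current = max(0, current + (match if a == b else mismatch))
        best = max(best, current)
    return best


def best_fragment(read: str, genome: str):
    """Score the read against disjoint genome fragments of read length and
    return (fragment start, score) for the highest-scoring fragment."""
    size = len(read)
    starts = range(0, len(genome) - size + 1, size)
    if size == 0 or len(starts) == 0:
        raise ValueError("genome must be at least as long as a non-empty read")
    scored = [(max_scoring_subsequence(read, genome[s:s + size]), s)
              for s in starts]
    # Highest score wins; ties go to the leftmost fragment.
    score, start = max(scored, key=lambda item: (item[0], -item[1]))
    return start, score


if __name__ == "__main__":
    print(best_fragment("ACGTAC", "TTTTTTACGTACGGGGGG"))
```

    In the GPU setting described in the abstract, each fragment would be scored by a separate thread, and the selected fragment would then be passed to an exact aligner.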

  4. 75 FR 61861 - Genomic Medicine Program Advisory Committee; Notice of Meeting

    Science.gov (United States)

    2010-10-06

    ... ethical oversight and protecting the privacy of Veterans; and receive updates on genomics initiatives within Patient Care Services; efforts to increase genetics education among VA Nursing Staff; and...

  5. An Innovative Plant Genomics and Gene Annotation Program for High School, Community College, and University Faculty

    Science.gov (United States)

    Hacisalihoglu, Gokhan; Hilgert, Uwe; Nash, E. Bruce; Micklos, David A.

    2008-01-01

    Today's biology educators face the challenge of training their students in modern molecular biology techniques including genomics and bioinformatics. The Dolan DNA Learning Center (DNALC) of Cold Spring Harbor Laboratory has developed and disseminated a bench- and computer-based plant genomics curriculum for biology faculty. In 2007, a five-day…

  6. Assessment of the scientific-technological production in molecular biology in Brazil (1996-2007): the contribution of genomics programs.

    Science.gov (United States)

    Meneghini, Rogério; Gamba, Estêvão C

    2011-06-01

    Several genome sequencing programs were launched in Brazil in the late 1990s and early 2000s. The most important initiatives were supported by the ONSA program (http://watson.fapesp.br/onsa/Genoma3.htm) and aimed at mastering genomic technology and bringing molecular biology to the state of the art. Two main sets of data were collected for the 1996-2007 period to evaluate the results of these genome programs: scientific production (Scopus and Web of Science databases) and patent registrations (US Patent and Trademark Office), both related to the progress of molecular biology over this period. In scientific production, Brazil took a great leap in comparison with 17 other developed and developing countries, being surpassed only by China. In patent registrations in the area of molecular biology, Brazil's performance lags far behind most of the countries considered in the present study, confirming Brazil's long-standing tendency toward poor achievement in technological innovation when compared with scientific production. Possible solutions for overcoming this imbalance are discussed.

  7. Extension of type 2 diabetes genome-wide association scan results in the diabetes prevention program.

    Science.gov (United States)

    Moore, Allan F; Jablonski, Kathleen A; McAteer, Jarred B; Saxena, Richa; Pollin, Toni I; Franks, Paul W; Hanson, Robert L; Shuldiner, Alan R; Knowler, William C; Altshuler, David; Florez, Jose C

    2008-09-01

    Genome-wide association scans (GWASs) have identified novel diabetes-associated genes. We evaluated how these variants impact diabetes incidence, quantitative glycemic traits, and response to preventive interventions in 3,548 subjects at high risk of type 2 diabetes enrolled in the Diabetes Prevention Program (DPP), which examined the effects of lifestyle intervention, metformin, and troglitazone versus placebo. We genotyped selected single nucleotide polymorphisms (SNPs) in or near diabetes-associated loci, including EXT2, CDKAL1, CDKN2A/B, IGF2BP2, HHEX, LOC387761, and SLC30A8 in DPP participants and performed Cox regression analyses using genotype, intervention, and their interactions as predictors of diabetes incidence. We evaluated their effect on insulin resistance and secretion at 1 year. None of the selected SNPs were associated with increased diabetes incidence in this population. After adjustments for ethnicity, baseline insulin secretion was lower in subjects with the risk genotype at HHEX rs1111875 (P = 0.01); there were no significant differences in baseline insulin sensitivity. Both at baseline and at 1 year, subjects with the risk genotype at LOC387761 had paradoxically increased insulin secretion; adjustment for self-reported ethnicity abolished these differences. In ethnicity-adjusted analyses, we noted a nominal differential improvement in beta-cell function for carriers of the protective genotype at CDKN2A/B after 1 year of troglitazone treatment (P = 0.01) and possibly lifestyle modification (P = 0.05). We were unable to replicate the GWAS findings regarding diabetes risk in the DPP. We did observe genotype associations with differences in baseline insulin secretion at the HHEX locus and a possible pharmacogenetic interaction at CDKN2A/B.

  8. 75 FR 26846 - Genomic Medicine Program Advisory Committee; Notice of Meeting

    Science.gov (United States)

    2010-05-12

    ... electronic health record; an overview of an upcoming large scale study on the genetics of functional disability of mental illness; and a roll-out of the national genomics initiative, the Million Veteran...

  9. Sequential computation of elementary modes and minimal cut sets in genome-scale metabolic networks using alternate integer linear programming

    Energy Technology Data Exchange (ETDEWEB)

    Song, Hyun-Seob; Goldberg, Noam; Mahajan, Ashutosh; Ramkrishna, Doraiswami

    2017-03-27

    Elementary (flux) modes (EMs) have served as a valuable tool for investigating structural and functional properties of metabolic networks. Identification of the full set of EMs in genome-scale networks remains challenging due to the combinatorial explosion of EMs in complex networks. Often, however, only a small subset of relevant EMs needs to be known, for which optimization-based sequential computation is a useful alternative. Most of the currently available methods along this line are based on the iterative use of mixed integer linear programming (MILP), the effectiveness of which deteriorates significantly as the number of iterations builds up. To alleviate the computational burden associated with the MILP implementation, we here present a novel optimization algorithm termed alternate integer linear programming (AILP). Results: Our algorithm was designed to iteratively solve a pair of integer programming (IP) and linear programming (LP) problems to compute EMs in a sequential manner. In each step, the IP identifies a minimal subset of reactions, the deletion of which disables all previously identified EMs. Thus, a subsequent LP solution subject to this reaction deletion constraint becomes a distinct EM. In cases where no feasible LP solution is available, IP-derived reaction deletion sets represent minimal cut sets (MCSs). Despite the additional computation of MCSs, AILP reduced the time needed to compute EMs by orders of magnitude. The proposed AILP algorithm not only offers a computational advantage in the EM analysis of genome-scale networks, but also improves the understanding of the linkage between EMs and MCSs.

  10. Whole Genome Sequence Analysis of Salmonella Typhi Isolated in Thailand before and after the Introduction of a National Immunization Program.

    Directory of Open Access Journals (Sweden)

    Zoe A Dyson

    2017-01-01

    Vaccines against Salmonella Typhi, the causative agent of typhoid fever, are commonly used by travellers; however, there are few examples of national immunization programs in endemic areas. There is therefore a paucity of data on the impact of typhoid immunization programs on localised populations of S. Typhi. Here we have used whole genome sequencing (WGS) to characterise 44 historical bacterial isolates collected before and after a national typhoid immunization program that was implemented in Thailand in 1977 in response to a large outbreak; the program was highly effective in reducing typhoid case numbers. Thai isolates were highly diverse, including 10 distinct phylogenetic lineages or genotypes. Novel prophage and plasmids were also detected, including examples that were previously only reported in Shigella sonnei and Escherichia coli. The majority of S. Typhi genotypes observed prior to the immunization program were not observed following it. Post-vaccine era isolates were more closely related to S. Typhi isolated from neighbouring countries than to earlier Thai isolates, providing no evidence for the local persistence of endemic S. Typhi following the national immunization program. Rather, later cases of typhoid appeared to be caused by the occasional importation of common genotypes from neighbouring Vietnam, Laos, and Cambodia. These data show the value of WGS in understanding the impacts of vaccination on pathogen populations and provide support for the proposal that large-scale typhoid immunization programs in endemic areas could result in lasting local disease elimination, although larger prospective studies are needed to test this directly.

  11. Whole Genome Sequence Analysis of Salmonella Typhi Isolated in Thailand before and after the Introduction of a National Immunization Program

    Science.gov (United States)

    Thanh, Duy Pham; Bodhidatta, Ladaporn; Mason, Carl Jeffries; Srijan, Apichai; Rabaa, Maia A.; Vinh, Phat Voong; Thanh, Tuyen Ha; Thwaites, Guy E.; Baker, Stephen; Holt, Kathryn E.

    2017-01-01

    Vaccines against Salmonella Typhi, the causative agent of typhoid fever, are commonly used by travellers; however, there are few examples of national immunization programs in endemic areas. There is therefore a paucity of data on the impact of typhoid immunization programs on localised populations of S. Typhi. Here we have used whole genome sequencing (WGS) to characterise 44 historical bacterial isolates collected before and after a national typhoid immunization program that was implemented in Thailand in 1977 in response to a large outbreak; the program was highly effective in reducing typhoid case numbers. Thai isolates were highly diverse, including 10 distinct phylogenetic lineages or genotypes. Novel prophage and plasmids were also detected, including examples that were previously only reported in Shigella sonnei and Escherichia coli. The majority of S. Typhi genotypes observed prior to the immunization program were not observed following it. Post-vaccine era isolates were more closely related to S. Typhi isolated from neighbouring countries than to earlier Thai isolates, providing no evidence for the local persistence of endemic S. Typhi following the national immunization program. Rather, later cases of typhoid appeared to be caused by the occasional importation of common genotypes from neighbouring Vietnam, Laos, and Cambodia. These data show the value of WGS in understanding the impacts of vaccination on pathogen populations and provide support for the proposal that large-scale typhoid immunization programs in endemic areas could result in lasting local disease elimination, although larger prospective studies are needed to test this directly. PMID:28060810

  12. Whole Genome Sequence Analysis of Salmonella Typhi Isolated in Thailand before and after the Introduction of a National Immunization Program.

    Science.gov (United States)

    Dyson, Zoe A; Thanh, Duy Pham; Bodhidatta, Ladaporn; Mason, Carl Jeffries; Srijan, Apichai; Rabaa, Maia A; Vinh, Phat Voong; Thanh, Tuyen Ha; Thwaites, Guy E; Baker, Stephen; Holt, Kathryn E

    2017-01-01

    Vaccines against Salmonella Typhi, the causative agent of typhoid fever, are commonly used by travellers; however, there are few examples of national immunization programs in endemic areas. There is therefore a paucity of data on the impact of typhoid immunization programs on localised populations of S. Typhi. Here we have used whole genome sequencing (WGS) to characterise 44 historical bacterial isolates collected before and after a national typhoid immunization program that was implemented in Thailand in 1977 in response to a large outbreak; the program was highly effective in reducing typhoid case numbers. Thai isolates were highly diverse, including 10 distinct phylogenetic lineages or genotypes. Novel prophage and plasmids were also detected, including examples that were previously only reported in Shigella sonnei and Escherichia coli. The majority of S. Typhi genotypes observed prior to the immunization program were not observed following it. Post-vaccine era isolates were more closely related to S. Typhi isolated from neighbouring countries than to earlier Thai isolates, providing no evidence for the local persistence of endemic S. Typhi following the national immunization program. Rather, later cases of typhoid appeared to be caused by the occasional importation of common genotypes from neighbouring Vietnam, Laos, and Cambodia. These data show the value of WGS in understanding the impacts of vaccination on pathogen populations and provide support for the proposal that large-scale typhoid immunization programs in endemic areas could result in lasting local disease elimination, although larger prospective studies are needed to test this directly.

  13. Controlling inbreeding and maximizing genetic gain using semi-definite programming with pedigree-based and genomic relationships.

    Science.gov (United States)

    Schierenbeck, S; Pimentel, E C G; Tietze, M; Körte, J; Reents, R; Reinhardt, F; Simianer, H; König, S

    2011-12-01

    Because of the relatively high levels of genetic relationships among potential bull sires and bull dams, innovative selection tools should consider both genetic gain and genetic relationships in a long-term perspective. Optimum genetic contribution theory using official estimated breeding values for a moderately heritable trait (production index, Index-PROD), and a lowly heritable functional trait (index for somatic cell score, Index-SCS) was applied to find optimal allocations of bull dams and bull sires. In contrast to previous practical applications using optimizations based on Lagrange multipliers, we focused on semi-definite programming (SDP). The SDP methodology was combined with either pedigree (a(ij)) or genomic relationships (f(ij)) among selection candidates. Selection candidates were 484 genotyped bulls, and 499 preselected genotyped bull dams completing a central test on station. In different scenarios separately for PROD and SCS, constraints on the average pedigree relationships among future progeny were varied from a(ij)=0.08 to a(ij)=0.20 in increments of 0.01. Corresponding constraints for single nucleotide polymorphism-based kinship coefficients were derived from regression analysis. Applying the coefficient of 0.52 with an intercept of 0.14 estimated for the regression pedigree relationship on genomic relationship, the corresponding range to alter genomic relationships varied from f(ij) = 0.18 to f(ij) = 0.24. Despite differences for some bulls in genomic and pedigree relationships, the same trends were observed for constraints on pedigree and corresponding genomic relationships regarding results in genetic gain and achieved coefficients of relationships. Generally, allowing higher values for relationships resulted in an increase of genetic gain for Index-PROD and Index-SCS and in a reduction in the number of selected sires. 
Interestingly, more sires were selected for all scenarios when restricting genomic relationships compared with restricting
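
    The conversion from pedigree-based to genomic relationship constraints quoted above can be checked with a short calculation. The sketch assumes the linear form f(ij) = 0.14 + 0.52 * a(ij), which is the reading of the reported intercept and coefficient that reproduces the stated range; the function name is hypothetical.

```python
INTERCEPT = 0.14  # intercept reported in the abstract
SLOPE = 0.52      # regression coefficient reported in the abstract


def genomic_constraint(a_ij: float) -> float:
    """Map a pedigree relationship constraint a(ij) to the corresponding
    genomic relationship constraint f(ij), assuming f = 0.14 + 0.52 * a."""
    return INTERCEPT + SLOPE * a_ij


if __name__ == "__main__":
    # Pedigree constraints from 0.08 to 0.20 in increments of 0.01.
    for step in range(8, 21):
        a = step / 100
        print(f"a(ij) = {a:.2f} -> f(ij) = {genomic_constraint(a):.3f}")
```

    Under this assumption the endpoints a(ij) = 0.08 and a(ij) = 0.20 map to roughly 0.18 and 0.24, matching the range given in the abstract.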

  14. Genomic prediction unifies animal and plant breeding programs to form platforms for biological discovery

    DEFF Research Database (Denmark)

    Hickey, John M.; Chiurugwi, Tinashe; Mackay, Ian

    2017-01-01

    The rate of annual yield increases for major staple crops must more than double relative to current levels in order to feed a predicted global population of 9 billion by 2050. Controlled hybridization and selective breeding have been used for centuries to adapt plant and animal species for human use. However, achieving higher, sustainable rates of improvement in yields in various species will require renewed genetic interventions and dramatic improvement of agricultural practices. Genomic prediction of breeding values has the potential to improve selection, reduce costs and provide a platform that unifies breeding approaches, biological discovery, and tools and methods. Here we compare and contrast some animal and plant breeding approaches to make a case for bringing the two together through the application of genomic selection. We propose a strategy for the use of genomic selection as a unifying...

  15. Allele frequency changes due to hitch-hiking in genomic selection programs

    DEFF Research Database (Denmark)

    Liu, Huiming; Sørensen, Anders Christian; Meuwissen, Theo H E

    2014-01-01

    Background: Genomic selection makes it possible to reduce pedigree-based inbreeding over best linear unbiased prediction (BLUP) by increasing emphasis on own rather than family information. However, pedigree inbreeding might not accurately reflect the loss of genetic variation and the true level of inbreeding due to changes in allele frequencies and hitch-hiking. This study aimed at understanding the impact of using long-term genomic selection on changes in allele frequencies, genetic variation and the level of inbreeding. Methods: Selection was performed in simulated scenarios with a population of 400 ... -BLUP, Genomic BLUP and Bayesian Lasso. Changes in allele frequencies at QTL, markers and linked neutral loci were investigated for the different selection criteria and different scenarios, along with the loss of favourable alleles and the rate of inbreeding measured by pedigree and runs of homozygosity. Results...

  16. Genomic prediction unifies animal and plant breeding programs to form platforms for biological discovery.

    Science.gov (United States)

    Hickey, John M; Chiurugwi, Tinashe; Mackay, Ian; Powell, Wayne

    2017-08-30

    The rate of annual yield increases for major staple crops must more than double relative to current levels in order to feed a predicted global population of 9 billion by 2050. Controlled hybridization and selective breeding have been used for centuries to adapt plant and animal species for human use. However, achieving higher, sustainable rates of improvement in yields in various species will require renewed genetic interventions and dramatic improvement of agricultural practices. Genomic prediction of breeding values has the potential to improve selection, reduce costs and provide a platform that unifies breeding approaches, biological discovery, and tools and methods. Here we compare and contrast some animal and plant breeding approaches to make a case for bringing the two together through the application of genomic selection. We propose a strategy for the use of genomic selection as a unifying approach to deliver innovative 'step changes' in the rate of genetic gain at scale.

  17. Democratizing Human Genome Project Information: A Model Program for Education, Information and Debate in Public Libraries.

    Science.gov (United States)

    Pollack, Miriam

    The "Mapping the Human Genome" project demonstrated that librarians can help whomever they serve in accessing information resources in the areas of biological and health information, whether it is the scientists who are developing the information or a member of the public who is using the information. Public libraries can guide library…

  18. Conserved Transcription Factors Steer Growth-Related Genomic Programs in Daphnia

    Science.gov (United States)

    Spanier, Katina I.; Jansen, Mieke; Decaestecker, Ellen; Hulselmans, Gert; Becker, Dörthe; Colbourne, John K.; Orsini, Luisa

    2017-01-01

    Ecological genomics aims to understand the functional association between environmental gradients and the genes underlying adaptive traits. Many genes that are identified by genome-wide screening in ecologically relevant species lack functional annotations. Although gene functions can be inferred from sequence homology, such approaches have limited power. Here, we introduce ecological regulatory genomics by presenting an ontology-free gene prioritization method. Specifically, our method combines transcriptome profiling with high-throughput cis-regulatory sequence analysis in the water fleas Daphnia pulex and Daphnia magna. It screens coexpressed genes for overrepresented DNA motifs that serve as transcription factor binding sites, thereby providing insight into conserved transcription factors and gene regulatory networks shaping the expression profile. We first validated our method, called Daphnia-cisTarget, on a D. pulex heat shock data set, which revealed a network driven by the heat shock factor. Next, we performed RNA-Seq in D. magna exposed to the cyanobacterium Microcystis aeruginosa. Daphnia-cisTarget identified coregulated gene networks that associate with the moulting cycle and potentially regulate life history changes in growth rate and age at maturity. These networks are predicted to be regulated by evolutionarily conserved transcription factors such as the homologues of Drosophila Shavenbaby and Grainyhead, nuclear receptors, and a GATA family member. In conclusion, our approach allows prioritising candidate genes in Daphnia without bias towards prior knowledge about functional gene annotation and represents an important step towards exploring the molecular mechanisms of ecological responses in organisms with poorly annotated genomes. PMID: 28854641

  19. Replication of the Wellcome Trust genome-wide association study of essential hypertension: the Family Blood Pressure Program.

    Science.gov (United States)

    Ehret, Georg B; Morrison, Alanna C; O'Connor, Ashley A; Grove, Megan L; Baird, Lisa; Schwander, Karen; Weder, Alan; Cooper, Richard S; Rao, D C; Hunt, Steven C; Boerwinkle, Eric; Chakravarti, Aravinda

    2008-12-01

    Essential hypertension is a principal cardiovascular risk factor whose origin remains unknown. Classical genetic studies have shown that blood pressure is at least partially heritable, opening a window to understanding the pathophysiology of essential hypertension in the human using modern genetic tools. The Wellcome Trust Case Control Consortium has recently published the results of screening the genomes of 2000 essential hypertension cases and 3000 controls using 500 000 genome-wide single nucleotide polymorphisms (SNPs). None of the variants proved to be genome-wide significant after correction for multiple tests but the most significantly associated SNPs (P...) ... Family Blood Pressure Program comprising 11 433 individuals recruited by hypertensive families. The results suggest that only one of the six SNPs might be associated with essential hypertension in Americans of European origin. This SNP shows a significant but opposite effect in Americans of Hispanic origin and no association in African Americans. The significance of the opposing effect estimates is unclear. No replication could be shown for hypertension status, but there are differences in study design. This attempted replication highlights that essential hypertension studies will require more comprehensive and larger genetic screens.

  20. An Introduction to China Rice Functional Genomics Program

    Institute of Scientific and Technical Information of China (English)

    Yongbiao XUE; Zhihong XU; Jingliu ZHANG; Da LUO; Jiayang LI; Yaoguang LIU; Hongwei XUE; Kang CHONG; Hai HUANG; Guohua LIANG

    2002-01-01

    To discover genes essential for the agronomic performance of crops, we initiated a program on rice functional genomics of important agronomic traits in 1999. The program was funded by the Ministry of Science and Technology of China through the National Basic Research Initiative and is expected to last for five years.

  1. Ku-mediated coupling of DNA cleavage and repair during programmed genome rearrangements in the ciliate Paramecium tetraurelia.

    Directory of Open Access Journals (Sweden)

    Antoine Marmignon

    2014-08-01

    During somatic differentiation, physiological DNA double-strand breaks (DSB) can drive programmed genome rearrangements (PGR), during which DSB repair pathways are mobilized to safeguard genome integrity. Because of their unique nuclear dimorphism, ciliates are powerful unicellular eukaryotic models to study the mechanisms involved in PGR. At each sexual cycle, the germline nucleus is transmitted to the progeny, but the somatic nucleus, essential for gene expression, is destroyed and a new somatic nucleus differentiates from a copy of the germline nucleus. In Paramecium tetraurelia, the development of the somatic nucleus involves massive PGR, including the precise elimination of at least 45,000 germline sequences (Internal Eliminated Sequences, IES). IES excision proceeds through a cut-and-close mechanism: a domesticated transposase, PiggyMac, is essential for DNA cleavage, and DSB repair at excision sites involves the Ligase IV, a specific component of the non-homologous end-joining (NHEJ) pathway. At the genome-wide level, a huge number of programmed DSBs must be repaired during this process to allow the assembly of functional somatic chromosomes. To understand how DNA cleavage and DSB repair are coordinated during PGR, we have focused on Ku, the earliest actor of NHEJ-mediated repair. Two Ku70 and three Ku80 paralogs are encoded in the genome of P. tetraurelia: Ku70a and Ku80c are produced during sexual processes and localize specifically in the developing new somatic nucleus. Using RNA interference, we show that the development-specific Ku70/Ku80c heterodimer is essential for the recovery of a functional somatic nucleus. Strikingly, at the molecular level, PiggyMac-dependent DNA cleavage is abolished at IES boundaries in cells depleted for Ku80c, resulting in IES retention in the somatic genome. PiggyMac and Ku70a/Ku80c co-purify as a complex when overproduced in a heterologous system. We conclude that Ku has been integrated in the Paramecium

  2. Ku-mediated coupling of DNA cleavage and repair during programmed genome rearrangements in the ciliate Paramecium tetraurelia.

    Science.gov (United States)

    Marmignon, Antoine; Bischerour, Julien; Silve, Aude; Fojcik, Clémentine; Dubois, Emeline; Arnaiz, Olivier; Kapusta, Aurélie; Malinsky, Sophie; Bétermier, Mireille

    2014-08-01

    During somatic differentiation, physiological DNA double-strand breaks (DSB) can drive programmed genome rearrangements (PGR), during which DSB repair pathways are mobilized to safeguard genome integrity. Because of their unique nuclear dimorphism, ciliates are powerful unicellular eukaryotic models to study the mechanisms involved in PGR. At each sexual cycle, the germline nucleus is transmitted to the progeny, but the somatic nucleus, essential for gene expression, is destroyed and a new somatic nucleus differentiates from a copy of the germline nucleus. In Paramecium tetraurelia, the development of the somatic nucleus involves massive PGR, including the precise elimination of at least 45,000 germline sequences (Internal Eliminated Sequences, IES). IES excision proceeds through a cut-and-close mechanism: a domesticated transposase, PiggyMac, is essential for DNA cleavage, and DSB repair at excision sites involves the Ligase IV, a specific component of the non-homologous end-joining (NHEJ) pathway. At the genome-wide level, a huge number of programmed DSBs must be repaired during this process to allow the assembly of functional somatic chromosomes. To understand how DNA cleavage and DSB repair are coordinated during PGR, we have focused on Ku, the earliest actor of NHEJ-mediated repair. Two Ku70 and three Ku80 paralogs are encoded in the genome of P. tetraurelia: Ku70a and Ku80c are produced during sexual processes and localize specifically in the developing new somatic nucleus. Using RNA interference, we show that the development-specific Ku70/Ku80c heterodimer is essential for the recovery of a functional somatic nucleus. Strikingly, at the molecular level, PiggyMac-dependent DNA cleavage is abolished at IES boundaries in cells depleted for Ku80c, resulting in IES retention in the somatic genome. PiggyMac and Ku70a/Ku80c co-purify as a complex when overproduced in a heterologous system. We conclude that Ku has been integrated in the Paramecium DNA cleavage

  3. Sequential computation of elementary modes and minimal cut sets in genome-scale metabolic networks using alternate integer linear programming.

    Science.gov (United States)

    Song, Hyun-Seob; Goldberg, Noam; Mahajan, Ashutosh; Ramkrishna, Doraiswami

    2017-08-01

    Elementary (flux) modes (EMs) have served as a valuable tool for investigating structural and functional properties of metabolic networks. Identification of the full set of EMs in genome-scale networks remains challenging due to combinatorial explosion of EMs in complex networks. Often, however, only a small subset of relevant EMs needs to be known, for which optimization-based sequential computation is a useful alternative. Most of the currently available methods along this line are based on the iterative use of mixed integer linear programming (MILP), the effectiveness of which significantly deteriorates as the number of iterations builds up. To alleviate the computational burden associated with the MILP implementation, we here present a novel optimization algorithm termed alternate integer linear programming (AILP). Our algorithm was designed to iteratively solve a pair of integer programming (IP) and linear programming (LP) problems to compute EMs in a sequential manner. In each step, the IP identifies a minimal subset of reactions, the deletion of which disables all previously identified EMs. Thus, a subsequent LP solution subject to this reaction deletion constraint becomes a distinct EM. In cases where no feasible LP solution is available, IP-derived reaction deletion sets represent minimal cut sets (MCSs). Despite the additional computation of MCSs, AILP achieved significant time reduction in computing EMs by orders of magnitude. The proposed AILP algorithm not only offers a computational advantage in the EM analysis of genome-scale networks, but also improves the understanding of the linkage between EMs and MCSs. The software is implemented in Matlab, and is provided as supplementary information. Contact: hyunseob.song@pnnl.gov. Supplementary data are available at Bioinformatics online.
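The IP step described in this abstract, finding a minimal set of reactions whose deletion disables every previously identified EM, can be illustrated on a toy network. The sketch below is not the authors' Matlab implementation: it replaces the integer program with a brute-force search over reaction subsets, omits the LP step entirely, and uses hypothetical reaction names and EM supports.

```python
from itertools import combinations


def minimal_blocking_set(em_supports, reactions):
    """Smallest set of reactions that intersects every known EM support.

    Brute-force stand-in for the IP step: deleting the returned reactions
    disables all previously identified EMs, because each EM loses at least
    one of the reactions it uses.
    """
    for size in range(len(reactions) + 1):
        for candidate in combinations(reactions, size):
            chosen = set(candidate)
            if all(chosen & support for support in em_supports):
                return chosen
    return None  # reached only if some EM uses no reaction from `reactions`


# Hypothetical toy data: four reactions and three already-known EMs,
# each EM given as the set of reactions it uses (its support).
reactions = ["R1", "R2", "R3", "R4"]
known_ems = [{"R1", "R2"}, {"R2", "R3"}, {"R4"}]

if __name__ == "__main__":
    print(sorted(minimal_blocking_set(known_ems, reactions)))
```

In the scheme the abstract describes, an LP would then be solved with the returned reactions deleted: a feasible solution gives a new EM, and an infeasible one marks the deletion set as a minimal cut set. Exhaustive search is only workable for toy inputs, which is why the abstract formulates this step as an IP.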

  4. Proceedings of the relevance of mass spectrometry to DNA sequence determination: Research needs for the Human Genome Program

    Energy Technology Data Exchange (ETDEWEB)

    Edmonds, C.G.; Smith, R.D. (Pacific Northwest Lab., Richland, WA (USA)); Smith, L.M. (Wisconsin Univ., Madison, WI (USA))

    1990-11-01

    A workshop was sponsored for the US Department of Energy (DOE), Office of Health and Environmental Research by Pacific Northwest Laboratory, April 4--5, 1990, in Seattle, Washington, to examine the potential role of mass spectrometry in the joint DOE/National Institutes of Health (NIH) Human Genome Program. The workshop was occasioned by recent developments in mass spectrometry that are providing new levels for selectivity, sensitivity, and, in particular, new methods of ionization appropriate for large biopolymers such as DNA. During discussions, three general mass spectrometric approaches to the determination of DNA sequence were considered: (1) the mass spectrometric detection of isotopic labels from DNA sequencing mixtures separated using gel electrophoresis, (2) the direct mass spectrometric analysis from direct ionization of unfractionated sequencing mixtures where the measured mass of the constituents functions to identify and order the base sequence (replacing separation by gel electrophoresis), and (3) an approach in which a single highly charged molecular ion of a large DNA segment produced is rapidly sequenced in an ion cyclotron resonance ion trap. The consensus of the workshop was that, on the basis of the new developments, mass spectrometry has the potential to provide the substantial increases in sequencing speed required for the Human Genome Program. 66 refs., 3 tabs.

  6. Educational Gaps in Molecular Diagnostics, Genomics, and Personalized Medicine in Dermatopathology Training: A Survey of US Dermatopathology Fellowship Program Directors.

    Science.gov (United States)

    Torre, Kristin; Russomanno, Kristen; Ferringer, Tammie; Elston, Dirk; Murphy, Michael J

    2017-05-02

    Molecular technologies offer clinicians the tools to provide high-quality, cost-effective patient care. We evaluated education focused on molecular diagnostics, genomics, and personalized medicine in dermatopathology fellowship. A 20-question online survey was emailed to all (n = 53) Accreditation Council for Graduate Medical Education (ACGME)-accredited dermatopathology training programs in the United States. Thirty-one of 53 program directors responded (response rate = 58%). Molecular training is undertaken in 74% of responding dermatopathology fellowships, with levels of instruction varying among dermatology-based and pathology-based programs. Education differed for dermatology- and pathology-trained fellows in approximately one-fifth (19%) of programs. Almost half (48%) of responding program directors believe that fellows are not currently receiving adequate molecular education although the majority (97%) expect to incorporate additional instruction in the next 2-5 years. Factors influencing the incorporation of relevant education include perceived clinical utility and Accreditation Council for Graduate Medical Education/residency review committee (RRC) requirements. Potential benefits of molecular education include increased medical knowledge, improved patient care, and promotion of effective communication with other healthcare professionals. More than two-thirds (68%) of responding program directors believe that instruction in molecular technologies should be required in dermatopathology fellowship training. Although all responding dermatopathology fellowship program directors agreed that molecular education is important, only a little over half of survey participants believe that their fellows receive adequate instruction. This represents an important educational gap. Discussion among those who oversee fellow education is necessary to best integrate and evaluate teaching of molecular dermatopathology.

  7. Genomic Selection for Processing and End-Use Quality Traits in the CIMMYT Spring Bread Wheat Breeding Program

    Directory of Open Access Journals (Sweden)

    Sarah D. Battenfield

    2016-07-01

    Wheat (Triticum aestivum L.) cultivars must possess suitable end-use quality for release and consumer acceptability. However, breeding for quality traits is often considered a secondary target relative to yield largely because of amount of seed needed and expense. Without testing and selection, many undesirable materials are advanced, expending additional resources. Here, we develop and validate whole-genome prediction models for end-use quality phenotypes in the CIMMYT bread wheat breeding program. Model accuracy was tested using forward prediction on breeding lines (n = 5520) tested in unbalanced yield trials from 2009 to 2015 at Ciudad Obregon, Sonora, Mexico. Quality parameters included test weight, 1000-kernel weight, hardness, grain and flour protein, flour yield, sodium dodecyl sulfate sedimentation, Mixograph and Alveograph performance, and loaf volume. In general, prediction accuracy substantially increased over time as more data was available to train the model. Reflecting practical implementation of genomic selection (GS) in the breeding program, forward prediction accuracies for quality parameters were assessed in 2015 and ranged from 0.32 (grain hardness) to 0.62 (mixing time). Increased selection intensity was possible with GS since more entries can be genotyped than phenotyped and expected genetic gain was 1.4 to 2.7 times higher across all traits than phenotypic selection. Given the limitations in measuring many lines for quality, we conclude that GS is a powerful tool to facilitate early generation selection for end-use quality in wheat, leaving larger populations for selection on yield during advanced testing and leading to better gain for both quality and yield in bread wheat breeding programs.

  8. Genome-Wide Linkage Analysis for Loci Affecting Pulse Pressure: The Family Blood Pressure Program

    National Research Council Canada - National Science Library

    Bielinski, Suzette J; Lynch, Amy I; Miller, Michael B; Weder, Alan; Cooper, Richard; Oberman, Albert; Chen, Yii-Der Ida; Turner, Stephen T; Fornage, Myriam; Province, Michael; Arnett, Donna K

    2005-01-01

    ... in sequential oligogenic linkage analysis routines. The analysis sample included 10 798 participants in 3320 families who were recruited as part of the Family Blood Pressure Program and were phenotyped with an oscillometric blood pressure measurement...

  9. 45 CFR 89.2 - Definitions.

    Science.gov (United States)

    2010-10-01

    ... providing any commercial sex act. Recipients are contractors, grantees, applicants or awardees who receive Leadership Act funds for HIV/AIDS programs directly or indirectly from HHS. Sex trafficking means the...: Commercial sex act means any sex act on account of which anything of value is given to or received by...

  10. Multimedia Presentations on the Human Genome: Implementation and Assessment of a Teaching Program for the Introduction to Genome Science Using a Poster and Animations

    Science.gov (United States)

    Kano, Kei; Yahata, Saiko; Muroi, Kaori; Kawakami, Masahiro; Tomoda, Mari; Miyaki, Koichi; Nakayama, Takeo; Kosugi, Shinji; Kato, Kazuto

    2008-01-01

    Genome science, including topics such as gene recombination, cloning, genetic tests, and gene therapy, is now an established part of our daily lives; thus we need to learn genome science to better equip ourselves for the present day. Learning from topics directly related to the human has been suggested to be more effective than learning from…

  11. Integrating Public Health and Deliberative Public Bioethics: Lessons from the Human Genome Project Ethical, Legal, and Social Implications Program.

    Science.gov (United States)

    Meagher, Karen M; Lee, Lisa M

    2016-01-01

    Public health policy works best when grounded in firm public health standards of evidence and widely shared social values. In this article, we argue for incorporating a specific method of ethical deliberation--deliberative public bioethics--into public health. We describe how deliberative public bioethics is a method of engagement that can be helpful in public health. Although medical, research, and public health ethics can be considered some of what bioethics addresses, deliberative public bioethics offers both a how and where. Using the Human Genome Project Ethical, Legal, and Social Implications program as an example of effective incorporation of deliberative processes to integrate ethics into public health policy, we examine how deliberative public bioethics can integrate both public health and bioethics perspectives into three areas of public health practice: research, education, and health policy. We then offer recommendations for future collaborations that integrate deliberative methods into public health policy and practice.

  12. ParaHaplo: A program package for haplotype-based whole-genome association study using parallel computing

    Directory of Open Access Journals (Sweden)

    Kamatani Naoyuki

    2009-10-01

    Background: Since more than a million single-nucleotide polymorphisms (SNPs) are analyzed in any given genome-wide association study (GWAS), performing multiple comparisons can be problematic. To cope with multiple-comparison problems in GWAS, haplotype-based algorithms were developed to correct for multiple comparisons at multiple SNP loci in linkage disequilibrium. A permutation test can also control problems inherent in multiple testing; however, both the calculation of exact probability and the execution of permutation tests are time-consuming. Faster methods for calculating exact probabilities and executing permutation tests are required. Methods: We developed a set of computer programs for the parallel computation of accurate P-values in haplotype-based GWAS. Our program, ParaHaplo, is intended for workstation clusters using the Intel Message Passing Interface (MPI). We compared the performance of our algorithm to that of the regular permutation test on JPT and CHB of HapMap. Results: ParaHaplo can detect smaller differences between 2 populations than SNP-based GWAS. We also found that parallel-computing techniques made ParaHaplo 100-fold faster than a non-parallel version of the program. Conclusion: ParaHaplo is a useful tool in conducting haplotype-based GWAS. Since the data sizes of such projects continue to increase, the use of fast computations with parallel computing--such as that used in ParaHaplo--will become increasingly important. The executable binaries and program sources of ParaHaplo are available at the following address: http://sourceforge.jp/projects/parallelgwas/?_sl=1

  13. ParaHaplo: A program package for haplotype-based whole-genome association study using parallel computing.

    Science.gov (United States)

    Misawa, Kazuharu; Kamatani, Naoyuki

    2009-10-21

    Since more than a million single-nucleotide polymorphisms (SNPs) are analyzed in any given genome-wide association study (GWAS), performing multiple comparisons can be problematic. To cope with multiple-comparison problems in GWAS, haplotype-based algorithms were developed to correct for multiple comparisons at multiple SNP loci in linkage disequilibrium. A permutation test can also control problems inherent in multiple testing; however, both the calculation of exact probability and the execution of permutation tests are time-consuming. Faster methods for calculating exact probabilities and executing permutation tests are required. We developed a set of computer programs for the parallel computation of accurate P-values in haplotype-based GWAS. Our program, ParaHaplo, is intended for workstation clusters using the Intel Message Passing Interface (MPI). We compared the performance of our algorithm to that of the regular permutation test on JPT and CHB of HapMap. ParaHaplo can detect smaller differences between 2 populations than SNP-based GWAS. We also found that parallel-computing techniques made ParaHaplo 100-fold faster than a non-parallel version of the program. ParaHaplo is a useful tool in conducting haplotype-based GWAS. Since the data sizes of such projects continue to increase, the use of fast computations with parallel computing--such as that used in ParaHaplo--will become increasingly important. The executable binaries and program sources of ParaHaplo are available at the following address: http://sourceforge.jp/projects/parallelgwas/?_sl=1.
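The regular permutation test that this abstract uses as its baseline can be sketched in a few lines. This is a generic two-group permutation test on a difference in means, offered only as an illustration: it is not ParaHaplo's haplotype-based algorithm, the data are made up, and the function name and parameters are hypothetical.

```python
import random


def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided permutation p-value for the difference in group means."""
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    n_a = len(group_a)
    observed = abs(sum(group_a) / n_a - sum(group_b) / len(group_b))
    pooled = list(group_a) + list(group_b)
    extreme = 0
    for _ in range(n_permutations):
        # Randomly reassign group labels by shuffling the pooled values.
        rng.shuffle(pooled)
        perm_a, perm_b = pooled[:n_a], pooled[n_a:]
        diff = abs(sum(perm_a) / n_a - sum(perm_b) / len(perm_b))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimate strictly positive.
    return (extreme + 1) / (n_permutations + 1)


if __name__ == "__main__":
    cases = [10, 11, 12, 13, 14, 15]   # made-up measurements
    controls = [1, 2, 3, 4, 5, 6]
    print(permutation_p_value(cases, controls))
```

Each permutation is independent of the others, so the loop can be split into chunks that run on separate workers and have their counts summed at the end. That independence is what makes permutation testing a natural fit for the cluster-based parallelism the abstract describes.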

  14. Cultural differences define diagnosis and genomic medicine practice: implications for undiagnosed diseases program in China.

    Science.gov (United States)

    Duan, Xiaohong; Markello, Thomas; Adams, David; Toro, Camilo; Tifft, Cynthia; Gahl, William A; Boerkoel, Cornelius F

    2013-09-01

    Despite the current acceleration and increasing leadership of Chinese genetics research, genetics and its clinical application have largely been imported to China from the Occident. Neither genetics nor the scientific reductionism underpinning its clinical application is integral to the traditional Chinese worldview. Given that disease concepts and their incumbent diagnoses are historically derived and culturally meaningful, we hypothesize that the cultural expectations of genetic diagnoses and medical genetics practice differ between the Occident and China. Specifically, we suggest that an undiagnosed diseases program in China will differ from the recently established Undiagnosed Diseases Program at the United States National Institutes of Health; a culturally sensitive concept will integrate traditional Chinese understanding of disease with the scientific reductionism of Occidental medicine.

  15. Integrating genomics and proteomics data to predict drug effects using binary linear programming.

    Science.gov (United States)

    Ji, Zhiwei; Su, Jing; Liu, Chenglin; Wang, Hongyan; Huang, Deshuang; Zhou, Xiaobo

    2014-01-01

    The Library of Integrated Network-Based Cellular Signatures (LINCS) project aims to create a network-based understanding of biology by cataloging changes in gene expression and signal transduction that occur when cells are exposed to a variety of perturbations. It is helpful for understanding cell pathways and facilitating drug discovery. Here, we developed a novel approach to infer cell-specific pathways and identify a compound's effects using gene expression and phosphoproteomics data under treatments with different compounds. Gene expression data were employed to infer potential targets of compounds and create a generic pathway map. Binary linear programming (BLP) was then developed to optimize the generic pathway topology based on the mid-stage signaling response of phosphorylation. To demonstrate effectiveness of this approach, we built a generic pathway map for the MCF7 breast cancer cell line and inferred the cell-specific pathways by BLP. The first group of 11 compounds was utilized to optimize the generic pathways, and then 4 compounds were used to identify effects based on the inferred cell-specific pathways. Cross-validation indicated that the cell-specific pathways reliably predicted a compound's effects. Finally, we applied BLP to re-optimize the cell-specific pathways to predict the effects of 4 compounds (trichostatin A, MS-275, staurosporine, and digoxigenin) according to compound-induced topological alterations. Trichostatin A and MS-275 (both HDAC inhibitors) inhibited the downstream pathway of HDAC1 and caused cell growth arrest via activation of p53 and p21; the effects of digoxigenin were totally opposite. Staurosporine blocked the cell cycle via p53 and p21, but also promoted cell growth via activated HDAC1 and its downstream pathway. Our approach was also applied to the PC3 prostate cancer cell line, and the cross-validation analysis showed very good accuracy in predicting effects of 4 compounds. In summary, our computational model can be

  16. Whole-Genome Sequence Assembly for Mammalian Genomes: Arachne 2

    OpenAIRE

    Jaffe, David B.; Butler, Jonathan; Gnerre, Sante; Mauceli, Evan; Lindblad-Toh, Kerstin; Jill P. Mesirov; Michael C Zody; Lander, Eric S.

    2003-01-01

    We previously described the whole-genome assembly program Arachne, presenting assemblies of simulated data for small to mid-sized genomes. Here we describe algorithmic adaptations to the program, allowing for assembly of mammalian-size genomes, and also improving the assembly of smaller genomes. Three principal changes were simultaneously made and applied to the assembly of the mouse genome, during a six-month period of development: (1) Supercontigs (scaffolds) were iteratively broken and rej...

  17. ParaHaplo 2.0: a program package for haplotype-estimation and haplotype-based whole-genome association study using parallel computing

    Directory of Open Access Journals (Sweden)

    Kamatani Naoyuki

    2010-06-01

    Background: The use of haplotype-based association tests can improve the power of genome-wide association studies. Since the observed genotypes are unordered pairs of alleles, haplotype phase must be inferred. However, estimating haplotype phase is time consuming. When millions of single-nucleotide polymorphisms (SNPs) are analyzed in a genome-wide association study, faster methods for haplotype estimation are required. Methods: We developed a program package for parallel computation of haplotype estimation. Our program package, ParaHaplo 2.0, is intended for use in workstation clusters using the Intel Message Passing Interface (MPI). We compared the performance of our algorithm to that of the regular permutation test on both the Japanese in Tokyo, Japan and the Han Chinese in Beijing, China populations of the HapMap dataset. Results: The parallel version of ParaHaplo 2.0 can estimate haplotypes 100 times faster than a non-parallel version of ParaHaplo. Conclusion: ParaHaplo 2.0 is an invaluable tool for conducting haplotype-based genome-wide association studies (GWAS). The need for fast haplotype estimation using parallel computing will become increasingly important as the data sizes of such projects continue to increase. The executable binaries and program sources of ParaHaplo are available at the following address: http://en.sourceforge.jp/projects/parallelgwas/releases/

  18. Generating information-rich high-throughput experimental materials genomes using functional clustering via multitree genetic programming and information theory.

    Science.gov (United States)

    Suram, Santosh K; Haber, Joel A; Jin, Jian; Gregoire, John M

    2015-04-13

    High-throughput experimental methodologies are capable of synthesizing, screening and characterizing vast arrays of combinatorial material libraries at a very rapid rate. These methodologies strategically employ tiered screening wherein the number of compositions screened decreases as the complexity, and very often the scientific information obtained from a screening experiment, increases. The algorithm used for down-selection of samples from a higher-throughput screening experiment to a lower-throughput screening experiment is vital in achieving information-rich experimental materials genomes. The fundamental science of material discovery lies in the establishment of composition-structure-property relationships, motivating the development of advanced down-selection algorithms which consider the information value of the selected compositions, as opposed to simply selecting the best performing compositions from a high-throughput experiment. Identification of property fields (composition regions with distinct composition-property relationships) in high-throughput data enables down-selection algorithms to employ advanced selection strategies, such as the selection of representative compositions from each field or selection of compositions that span the composition space of the highest performing field. Such strategies would greatly enhance the generation of data-driven discoveries. We introduce an informatics-based clustering of composition-property functional relationships using a combination of information theory and multitree genetic programming concepts for identification of property fields in a composition library. We demonstrate our approach using a complex synthetic composition-property map for a 5 at.% step ternary library consisting of four distinct property fields and finally explore the application of this methodology for capturing relationships between composition and catalytic activity for the oxygen evolution reaction for 5429 catalyst compositions in a

  19. Genomic Encyclopedia of Fungi

    Energy Technology Data Exchange (ETDEWEB)

    Grigoriev, Igor

    2012-08-10

    Genomes of fungi relevant to energy and the environment are the focus of the Fungal Genomic Program at the US Department of Energy Joint Genome Institute (JGI). Its key project, the Genomics Encyclopedia of Fungi, targets fungi related to plant health (symbionts, pathogens, and biocontrol agents) and biorefinery processes (cellulose degradation, sugar fermentation, industrial hosts), and explores fungal diversity by means of genome sequencing and analysis. Over 150 fungal genomes have been sequenced by JGI to date and released through MycoCosm (www.jgi.doe.gov/fungi), a fungal web portal that integrates sequence and functional data with genome analysis tools for the user community. Sequence analysis supported by functional genomics leads to the development of parts lists for complex systems ranging from ecosystems of biofuel crops to biorefineries. Recent examples of such parts suggested by comparative genomics and functional analysis in these areas are presented here.

  20. Marine genomics

    DEFF Research Database (Denmark)

    Oliveira Ribeiro, Ângela Maria; Foote, Andrew D.; Kupczok, Anne

    2017-01-01

    Marine ecosystems occupy 71% of the surface of our planet, yet we know little about their diversity. Although the inventory of species is continually increasing, as registered by the Census of Marine Life program, only about 10% of the estimated two million marine species are known. This lag......-throughput sequencing approaches have been helping to improve our knowledge of marine biodiversity, from the rich microbial biota that forms the base of the tree of life to a wealth of plant and animal species. In this review, we present an overview of the applications of genomics to the study of marine life, from...... evolutionary biology of non-model organisms to species of commercial relevance for fishing, aquaculture and biomedicine. Instead of providing an exhaustive list of available genomic data, we rather set to present contextualized examples that best represent the current status of the field of marine genomics....

  1. Ontology for Genome Comparison and Genomic Rearrangements

    Directory of Open Access Journals (Sweden)

    Anil Wipat

    2006-04-01

    We present an ontology for describing genomes, genome comparisons, their evolution and biological function. This ontology will support the development of novel genome comparison algorithms and aid the community in discussing genomic evolution. It provides a framework for communication about comparative genomics, and a basis upon which further automated analysis can be built. The nomenclature defined by the ontology will foster clearer communication between biologists, and also standardize terms used by data publishers in the results of analysis programs. The overriding aim of this ontology is the facilitation of consistent annotation of genomes through computational methods, rather than human annotators. To this end, the ontology includes definitions that support computer analysis and automated transfer of annotations between genomes, rather than relying upon human mediation.

  2. Phytozome Comparative Plant Genomics Portal

    Energy Technology Data Exchange (ETDEWEB)

    Goodstein, David; Batra, Sajeev; Carlson, Joseph; Hayes, Richard; Phillips, Jeremy; Shu, Shengqiang; Schmutz, Jeremy; Rokhsar, Daniel

    2014-09-09

    The Dept. of Energy Joint Genome Institute is a genomics user facility supporting DOE mission science in the areas of Bioenergy, Carbon Cycling, and Biogeochemistry. The Plant Program at the JGI applies genomic, analytical, computational and informatics platforms and methods to: (1) understand and accelerate the improvement (domestication) of bioenergy crops; (2) characterize and moderate plant response to climate change; (3) use comparative genomics to identify constrained elements and infer gene function; (4) build high quality genomic resource platforms of JGI Plant Flagship genomes for functional and experimental work; and (5) expand functional genomic resources for Plant Flagship genomes.

  3. Fisher: a program for the detection of H/ACA snoRNAs using MFE secondary structure prediction and comparative genomics – assessment and update

    Science.gov (United States)

    Freyhult, Eva; Edvardsson, Sverker; Tamas, Ivica; Moulton, Vincent; Poole, Anthony M

    2008-01-01

    Background: The H/ACA family of small nucleolar RNAs (snoRNAs) plays a central role in guiding the pseudouridylation of ribosomal RNA (rRNA). In an effort to systematically identify the complete set of rRNA-modifying H/ACA snoRNAs from the genome sequence of the budding yeast, Saccharomyces cerevisiae, we developed a program – Fisher – and previously presented several candidate snoRNAs based on our analysis [1]. Findings: In this report, we provide a brief update of this work, which was aborted after the publication of experimentally-identified snoRNAs [2] identical to candidates we had identified bioinformatically using Fisher. Our motivation for revisiting this work is to report on the status of the candidate snoRNAs described in [1], and secondly, to report that a modified version of Fisher together with the available multiple yeast genome sequences was able to correctly identify several H/ACA snoRNAs for modification sites not identified by the snoGPS program [3]. While we are no longer developing Fisher, we briefly consider the merits of the Fisher algorithm relative to snoGPS, which may be of use for workers considering pursuing a similar search strategy for the identification of small RNAs. The modified source code for Fisher is made available as supplementary material. Conclusion: Our results confirm the validity of using minimum free energy (MFE) secondary structure prediction to guide comparative genomic screening for RNA families with few sequence constraints. PMID:18710502

  4. Optimizing the creation of base populations for aquaculture breeding programs using phenotypic and genomic data and its consequences on genetic progress.

    Science.gov (United States)

    Fernández, Jesús; Toro, Miguel Á; Sonesson, Anna K; Villanueva, Beatriz

    2014-01-01

    The success of an aquaculture breeding program critically depends on the way in which the base population of breeders is constructed since all the genetic variability for the traits included originally in the breeding goal as well as those to be included in the future is contained in the initial founders. Traditionally, base populations were created from a number of wild strains by sampling equal numbers from each strain. However, for some aquaculture species improved strains are already available and, therefore, mean phenotypic values for economically important traits can be used as a criterion to optimize the sampling when creating base populations. Also, the increasing availability of genome-wide genotype information in aquaculture species could help to refine the estimation of relationships within and between candidate strains and, thus, to optimize the percentage of individuals to be sampled from each strain. This study explores the advantages of using phenotypic and genome-wide information when constructing base populations for aquaculture breeding programs in terms of initial and subsequent trait performance and genetic diversity level. Results show that a compromise solution between diversity and performance can be found when creating base populations. Up to 6% higher levels of phenotypic performance can be achieved at the same level of global diversity in the base population by optimizing the selection of breeders instead of sampling equal numbers from each strain. The higher performance observed in the base population persisted during 10 generations of phenotypic selection applied in the subsequent breeding program.

  5. Optimizing the creation of base populations for aquaculture breeding programs using phenotypic and genomic data and its consequences on genetic progress

    Directory of Open Access Journals (Sweden)

    Jesús Fernández

    2014-11-01

    The success of an aquaculture breeding program critically depends on the way in which the base population of breeders is constructed since all the genetic variability for the traits included originally in the breeding goal as well as those to be included in the future is contained in those initial founders. Traditionally, base populations were created from a number of wild strains by sampling equal numbers from each strain. However, for some aquaculture species improved strains are already available and, therefore, mean phenotypic values for economically important traits can be used as a criterion to optimize the sampling when creating base populations. Also, the increasing availability of genome-wide genotype information in aquaculture species could help to refine the estimation of relationships within and between candidate strains and, thus, to optimize the percentage of individuals to be sampled from each strain. This study explores the advantages of using phenotypic and genome-wide information when constructing base populations for aquaculture breeding programs in terms of initial and subsequent trait performance and genetic diversity level. Results show that a compromise solution between diversity and performance can be found when creating base populations. Up to 6% higher levels of phenotypic performance can be achieved at the same level of global diversity in the base population by optimizing the selection of breeders instead of sampling equal numbers from each strain. The higher performance observed in the base population persisted during ten generations of phenotypic selection applied in the subsequent breeding program.

  6. The Research of VBA Programmed Modeling Based on Product Genome

    Institute of Scientific and Technical Information of China (English)

    陈煌; 陈锦昌

    2013-01-01

    To meet market demand, improve the accuracy of digital product modeling, and support small-batch production, this study proposes the following workflow. Once the product designer has hand-sketched the whole product prototype, the product is decomposed, according to genome-based deconstruction rules, into several product genomes with distinct features. The VBA programmers are then assigned the corresponding feature genomes according to the product-genome distribution rules and use the VBA programming language to build parametric, programmed, and precise 3D digital models. This approach exploits the advantages of collaborative development and encourages the enthusiasm and creativity of team members, thereby achieving an organic combination of intuitive product design and rational VBA program development.

  7. Frontiers in cancer epidemiology: a challenge to the research community from the Epidemiology and Genomics Research Program at the National Cancer Institute.

    Science.gov (United States)

    Khoury, Muin J; Freedman, Andrew N; Gillanders, Elizabeth M; Harvey, Chinonye E; Kaefer, Christie; Reid, Britt C; Rogers, Scott; Schully, Sheri D; Seminara, Daniela; Verma, Mukesh

    2012-07-01

    The Epidemiology and Genomics Research Program (EGRP) at the National Cancer Institute (NCI) is developing scientific priorities for cancer epidemiology research in the next decade. We would like to engage the research community and other stakeholders in a planning effort that will include a workshop in December 2012 to help shape new foci for cancer epidemiology research. To facilitate the process of defining the future of cancer epidemiology, we invite the research community to join in an ongoing web-based conversation at http://blog-epi.grants.cancer.gov/ to develop priorities and the next generation of high-impact studies.

  8. Automatic genomics: a user-friendly program for the automatic designing and plate loading of medium-throughput qPCR experiments.

    Science.gov (United States)

    Callejas, Sergio; Alvarez, Rebeca; Dopazo, Ana

    2011-01-01

    Quantitative PCR (qPCR) remains the method of choice for gene and microRNA (miRNA) expression studies. Many laboratories wish to automate some or all of the steps of medium-throughput qPCR experiments through the use of various types of liquid handling robots. However, it is not uncommon to find cases in which scripts provided by the robot supplier are too rigid for user-specific applications, do not include all the desired options, or are too complicated to be modified by a nonprofessional programmer. Here, we present Automatic Genomics, a program that allows users with a limited programming background to automate medium-throughput qPCR experiments by using commercially available liquid-handling robots. The user is able to optimize the plate design in terms of number of genes, number of samples, and controls.

  9. Querying genomic databases

    Energy Technology Data Exchange (ETDEWEB)

    Baehr, A.; Hagstrom, R.; Joerg, D.; Overbeek, R.

    1991-09-01

    A natural-language interface has been developed that retrieves genomic information by using a simple subset of English. The interface spares the biologist from the task of learning database-specific query languages and computer programming. Currently, the interface deals with the E. coli genome. It can, however, be readily extended and shows promise as a means of easy access to other sequenced genomic databases as well.

  10. Five blood pressure loci identified by an updated genome-wide linkage scan: meta-analysis of the Family Blood Pressure Program.

    Science.gov (United States)

    Simino, Jeannette; Shi, Gang; Kume, Rezart; Schwander, Karen; Province, Michael A; Gu, C Charles; Kardia, Sharon; Chakravarti, Aravinda; Ehret, Georg; Olshen, Richard A; Turner, Stephen T; Ho, Low-Tone; Zhu, Xiaofeng; Jaquish, Cashell; Paltoo, Dina; Cooper, Richard S; Weder, Alan; Curb, J David; Boerwinkle, Eric; Hunt, Steven C; Rao, Dabeeru C

    2011-03-01

    A preliminary genome-wide linkage analysis of blood pressure in the Family Blood Pressure Program (FBPP) was reported previously. We harnessed the power and ethnic diversity of the final pooled FBPP dataset to identify novel loci for blood pressure, thereby enhancing localization of genes containing less common variants with large effects on blood pressure levels and hypertension. We performed one overall and 4 race-specific meta-analyses of genome-wide blood pressure linkage scans using data on 4,226 African-American, 2,154 Asian, 4,229 Caucasian, and 2,435 Mexican-American participants (total N = 13,044). Variance components models were fit to measured (raw) blood pressure levels and two types of antihypertensive medication-adjusted blood pressure phenotypes within each of 10 subgroups defined by race and network. A modified Fisher's method was used to combine the P values for each linkage marker across the 10 subgroups. Five quantitative trait loci (QTLs) were detected on chromosomes 6p22.3, 8q23.1, 20q13.12, 21q21.1, and 21q21.3 based on significant linkage evidence (defined by logarithm of odds (lod) score ≥3) in at least one meta-analysis and lod scores ≥1 in at least 2 subgroups defined by network and race. The chromosome 8q23.1 locus was supported by Asian-, Caucasian-, and Mexican-American-specific meta-analyses. The new QTLs reported justify new candidate gene studies. They may help support results from genome-wide association studies (GWAS) that fall in these QTL regions but fail to achieve genome-wide significance.
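
    The abstract above combines P values across subgroups with a modified Fisher's method, whose modification is not described in this record. As a rough illustration only, the following sketch implements the standard, unmodified Fisher's method under the assumption that the P values are independent; the function name `fisher_combined_pvalue` is hypothetical and does not come from the study.

```python
import math


def fisher_combined_pvalue(p_values):
    """Standard Fisher's method (NOT the study's modified variant).

    Assumes independent p-values. Returns (statistic, combined_p), where
    statistic = -2 * sum(ln p_i) follows a chi-square distribution with
    2k degrees of freedom under the null hypothesis.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(p <= 0.0 or p > 1.0 for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")

    k = len(p_values)
    statistic = -2.0 * sum(math.log(p) for p in p_values)

    # For even degrees of freedom 2k the chi-square survival function has
    # a closed form: exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    half = statistic / 2.0
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return statistic, math.exp(-half) * total
```

    With a single P value the combined value equals the input, which is a convenient sanity check. Correlated subgroups would violate the independence assumption, which is one reason a study might modify the method.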

  11. Genome Maps, a new generation genome browser.

    Science.gov (United States)

    Medina, Ignacio; Salavert, Francisco; Sanchez, Rubén; de Maria, Alejandro; Alonso, Roberto; Escobar, Pablo; Bleda, Marta; Dopazo, Joaquín

    2013-07-01

    Genome browsers have gained importance as more genomes and related genomic information become available. However, the increase of information brought about by new generation sequencing technologies is, at the same time, causing a subtle but continuous decrease in the efficiency of conventional genome browsers. Here, we present Genome Maps, a genome browser that implements an innovative model of data transfer and management. The program uses highly efficient technologies from the new HTML5 standard, such as scalable vector graphics, that optimize workloads at both server and client sides and ensure future scalability. Thus, data management and representation are entirely carried out by the browser, without the need of any Java Applet, Flash or other plug-in technology installation. Relevant biological data on genes, transcripts, exons, regulatory features, single-nucleotide polymorphisms, karyotype and so forth, are imported from web services and are available as tracks. In addition, several DAS servers are already included in Genome Maps. As a novelty, this web-based genome browser allows the local upload of huge genomic data files (e.g. VCF or BAM) that can be dynamically visualized in real time at the client side, thus facilitating the management of medical data affected by privacy restrictions. Finally, Genome Maps can easily be integrated in any web application by including only a few lines of code. Genome Maps is an open source collaborative initiative available in the GitHub repository (https://github.com/compbio-bigdata-viz/genome-maps). Genome Maps is available at: http://www.genomemaps.org.

  12. Genome Maps, a new generation genome browser

    Science.gov (United States)

    Medina, Ignacio; Salavert, Francisco; Sanchez, Rubén; de Maria, Alejandro; Alonso, Roberto; Escobar, Pablo; Bleda, Marta; Dopazo, Joaquín

    2013-01-01

    Genome browsers have gained importance as more genomes and related genomic information become available. However, the increase of information brought about by new generation sequencing technologies is, at the same time, causing a subtle but continuous decrease in the efficiency of conventional genome browsers. Here, we present Genome Maps, a genome browser that implements an innovative model of data transfer and management. The program uses highly efficient technologies from the new HTML5 standard, such as scalable vector graphics, that optimize workloads at both server and client sides and ensure future scalability. Thus, data management and representation are entirely carried out by the browser, without the need of any Java Applet, Flash or other plug-in technology installation. Relevant biological data on genes, transcripts, exons, regulatory features, single-nucleotide polymorphisms, karyotype and so forth, are imported from web services and are available as tracks. In addition, several DAS servers are already included in Genome Maps. As a novelty, this web-based genome browser allows the local upload of huge genomic data files (e.g. VCF or BAM) that can be dynamically visualized in real time at the client side, thus facilitating the management of medical data affected by privacy restrictions. Finally, Genome Maps can easily be integrated in any web application by including only a few lines of code. Genome Maps is an open source collaborative initiative available in the GitHub repository (https://github.com/compbio-bigdata-viz/genome-maps). Genome Maps is available at: http://www.genomemaps.org. PMID:23748955

  13. Fisher: a program for the detection of H/ACA snoRNAs using MFE secondary structure prediction and comparative genomics – assessment and update

    Directory of Open Access Journals (Sweden)

    Tamas Ivica

    2008-07-01

    Background: The H/ACA family of small nucleolar RNAs (snoRNAs) plays a central role in guiding the pseudouridylation of ribosomal RNA (rRNA). In an effort to systematically identify the complete set of rRNA-modifying H/ACA snoRNAs from the genome sequence of the budding yeast, Saccharomyces cerevisiae, we developed a program – Fisher – and previously presented several candidate snoRNAs based on our analysis [1]. Findings: In this report, we provide a brief update of this work, which was aborted after the publication of experimentally-identified snoRNAs [2] identical to candidates we had identified bioinformatically using Fisher. Our motivation for revisiting this work is to report on the status of the candidate snoRNAs described in [1], and secondly, to report that a modified version of Fisher together with the available multiple yeast genome sequences was able to correctly identify several H/ACA snoRNAs for modification sites not identified by the snoGPS program [3]. While we are no longer developing Fisher, we briefly consider the merits of the Fisher algorithm relative to snoGPS, which may be of use for workers considering pursuing a similar search strategy for the identification of small RNAs. The modified source code for Fisher is made available as supplementary material. Conclusion: Our results confirm the validity of using minimum free energy (MFE) secondary structure prediction to guide comparative genomic screening for RNA families with few sequence constraints.

  14. Program in Functional Genomics of Autoimmunity and Immunology of the University of Kentucky and the University of Alabama

    Energy Technology Data Exchange (ETDEWEB)

    Alan M Kaplan

    2012-10-12

    This grant will be used to augment the equipment infrastructure and core support at the University of Kentucky and the University of Alabama particularly in the areas of genomics/informatics, molecular analysis and cell separation. In addition, we will promote collaborative research interactions through scientific workshops and exchange of scientists, as well as joint exploration of the role of immune receptors as targets in autoimmunity and host defense, innate and adaptive immune responses, and mucosal immunity in host defense.

  15. Genomic Prediction in Barley

    DEFF Research Database (Denmark)

    Edriss, Vahid; Cericola, Fabio; Jensen, Jens D

    2015-01-01

    Genomic prediction uses markers (SNPs) across the whole genome to predict individual breeding values at an early growth stage, potentially before large-scale phenotyping. One of the applications of genomic prediction in plant breeding is to identify the best individual candidate lines to contribute...... to next generation. The main goal of this study was to see the potential of using genomic prediction in a commercial barley breeding program. The data used in this study was from Nordic Seed company which is located in Denmark. Around 350 advanced lines were genotyped with 9K Barley chip from Illumina...

  16. MacSyFinder: a program to mine genomes for molecular systems with an application to CRISPR-Cas systems.

    Directory of Open Access Journals (Sweden)

    Sophie S Abby

    Biologists often wish to use their knowledge on a few experimental models of a given molecular system to identify homologs in genomic data. We developed a generic tool for this purpose. Macromolecular System Finder (MacSyFinder) provides a flexible framework to model the properties of molecular systems (cellular machinery or pathway), including their components, evolutionary associations with other systems and genetic architecture. Modelled features also include functional analogs, and the multiple uses of the same component by different systems. Models are used to search for molecular systems in complete genomes or in unstructured data like metagenomes. The components of the systems are searched by sequence similarity using Hidden Markov model (HMM) protein profiles. The assignment of hits to a given system is decided based on compliance with the content and organization of the system model. A graphical interface, MacSyView, facilitates the analysis of the results by showing overviews of component content and genomic context. To exemplify the use of MacSyFinder we built models to detect and classify CRISPR-Cas systems following a previously established classification. We show that MacSyFinder makes it easy to define an accurate "Cas-finder" using publicly available protein profiles. MacSyFinder is a standalone application implemented in Python. It requires Python 2.7, Hmmer and makeblastdb (version 2.2.28 or higher). It is freely available with its source code under a GPLv3 license at https://github.com/gem-pasteur/macsyfinder. It is compatible with all platforms supporting Python and Hmmer/makeblastdb. The "Cas-finder" (models and HMM profiles) is distributed as a compressed tarball archive as Supporting Information.

  17. Human Genome Project

    Energy Technology Data Exchange (ETDEWEB)

    Block, S.; Cornwall, J.; Dally, W.; Dyson, F.; Fortson, N.; Joyce, G.; Kimble, H. J.; Lewis, N.; Max, C.; Prince, T.; Schwitters, R.; Weinberger, P.; Woodin, W. H. [The MITRE Corporation, McLean, VA (US). JASON Program Office]

    1998-01-04

    The study reviews Department of Energy supported aspects of the United States Human Genome Project, the joint National Institutes of Health/Department of Energy program to characterize all human genetic material, to discover the set of human genes, and to render them accessible for further biological study. The study concentrates on issues of technology, quality assurance/control, and informatics relevant to current effort on the genome project and needs beyond it. Recommendations are presented on areas of the genome program that are of particular interest to and supported by the Department of Energy.

  18. ParaHaplo 3.0: A program package for imputation and a haplotype-based whole-genome association study using hybrid parallel computing

    Directory of Open Access Journals (Sweden)

    Kamatani Naoyuki

    2011-05-01

    Background: Imputation of missing genotypes and haplotype reconstruction are valuable in genome-wide association studies (GWASs). By modeling the patterns of linkage disequilibrium in a reference panel, genotypes not directly measured in the study samples can be imputed and used for GWASs. Since millions of single nucleotide polymorphisms need to be imputed in a GWAS, faster methods for genotype imputation and haplotype reconstruction are required. Results: We developed a program package for parallel computation of genotype imputation and haplotype reconstruction. Our program package, ParaHaplo 3.0, is intended for use in workstation clusters using the Intel Message Passing Interface. We compared the performance of ParaHaplo 3.0 on the Japanese in Tokyo, Japan and the Han Chinese in Beijing, China populations in the HapMap dataset. A parallel version of ParaHaplo 3.0 can conduct genotype imputation 20 times faster than a non-parallel version of ParaHaplo. Conclusions: ParaHaplo 3.0 is an invaluable tool for conducting haplotype-based GWASs. The need for faster genotype imputation and haplotype reconstruction using parallel computing will become increasingly important as the data sizes of such projects continue to increase. ParaHaplo executable binaries and program sources are available at http://en.sourceforge.jp/projects/parallelgwas/releases/.

  19. Genomes to Life "Center for Molecular and Cellular Systems": A research program for identification and characterization of protein complexes.

    Energy Technology Data Exchange (ETDEWEB)

    Buchanan, M V.; Larimer, Frank; Wiley, H S.; Kennel, S J.; Squier, Thomas C.; Ramsey, John M.; Rodland, Karin D.; Hurst, G B.; Smith, Richard D.; Xu, Ying; Dixon, David A.; Doktycz, M J.; Colson, Steve D.; Gesteland, R; Giometti, Carol S.; Young, Mark E.; Giddings, Ralph M.

    2002-02-01

    Goal 1 of the Department of Energy's Genomes to Life (GTL) program seeks to identify and characterize the complete set of protein complexes within a cell. Goal 1 forms the foundation necessary to accomplish the other objectives of the GTL program, which focus on gene regulatory networks and molecular-level characterization of interactions in microbial communities. Together this information would allow cells and their components to be understood in sufficient detail to predict, test, and understand the responses of a biological system to its environment. The Center for Molecular and Cellular Systems has been established to identify and characterize protein complexes using high-throughput analytical technologies. A dynamic research program is being developed that supports the goals of the Center by focusing on the development of new capabilities for sample preparation and complex separations, molecular-level identification of the protein complexes by mass spectrometry, characterization of the complexes in living cells by imaging techniques, and bioinformatics and computational tools for the collection and interpretation of data and formation of databases and tools to allow the data to be shared by the biological community.

  20. Genome-wide Functional Analysis of CREB/Long-Term Memory-Dependent Transcription Reveals Distinct Basal and Memory Gene Expression Programs

    Science.gov (United States)

    Lakhina, Vanisha; Arey, Rachel N.; Kaletsky, Rachel; Kauffman, Amanda; Stein, Geneva; Keyes, William; Xu, Daniel; Murphy, Coleen T.

    2014-01-01

    SUMMARY Induced CREB activity is a hallmark of long-term memory, but the full repertoire of CREB transcriptional targets required specifically for memory is not known in any system. To obtain a more complete picture of the mechanisms involved in memory, we combined memory training with genome-wide transcriptional analysis of C. elegans CREB mutants. This approach identified 757 significant CREB/memory-induced targets and confirmed the involvement of known memory genes from other organisms, but also suggested new mechanisms and novel components that may be conserved through mammals. CREB mediates distinct basal and memory transcriptional programs at least partially through spatial restriction of CREB activity: basal targets are regulated primarily in nonneuronal tissues, while memory targets are enriched for neuronal expression, emanating from CREB activity in AIM neurons. This suite of novel memory-associated genes will provide a platform for the discovery of orthologous mammalian long-term memory components. PMID:25611510

  1. Genome-wide functional analysis of CREB/long-term memory-dependent transcription reveals distinct basal and memory gene expression programs.

    Science.gov (United States)

    Lakhina, Vanisha; Arey, Rachel N; Kaletsky, Rachel; Kauffman, Amanda; Stein, Geneva; Keyes, William; Xu, Daniel; Murphy, Coleen T

    2015-01-21

    Induced CREB activity is a hallmark of long-term memory, but the full repertoire of CREB transcriptional targets required specifically for memory is not known in any system. To obtain a more complete picture of the mechanisms involved in memory, we combined memory training with genome-wide transcriptional analysis of C. elegans CREB mutants. This approach identified 757 significant CREB/memory-induced targets and confirmed the involvement of known memory genes from other organisms, but also suggested new mechanisms and novel components that may be conserved through mammals. CREB mediates distinct basal and memory transcriptional programs at least partially through spatial restriction of CREB activity: basal targets are regulated primarily in nonneuronal tissues, while memory targets are enriched for neuronal expression, emanating from CREB activity in AIM neurons. This suite of novel memory-associated genes will provide a platform for the discovery of orthologous mammalian long-term memory components.

  2. Genome-wide ChIP-seq profiling of PPARγ/RXR target sites and gene program during adipogenesis

    DEFF Research Database (Denmark)

    Nielsen, Ronni; Pedersen, Thomas Åskov; Hagenbeek, Dik

    Peroxisome proliferator-activated receptors (PPARs) are nuclear receptors which bind to DNA as heterodimers with members of the retinoid X receptor family. PPARγ is an important regulator of adipocyte differentiation and function. In addition to driving the adipogenic process, PPARγ activates...... directly a large number of genes involved in lipid metabolism. Using ChIP combined with deep sequencing we have generated a genome-wide map of PPARγ-RXR binding to chromatin as well as the activation of associated target genes during differentiation of murine 3T3-L1 adipocytes. Our analysis shows...... that target sites/genes attain RXR and PPARγ occupancy at different time points and that sites are often co-occupied by C/EBP factors. Coupling this analysis with RNAPII occupancy throughout adipogenesis revealed that PPARg:RXR is specifically associated with induced genes involved in diverse processes...

  3. A Review on Genomics APIs

    Directory of Open Access Journals (Sweden)

    Rajeswari Swaminathan

    2016-01-01

    Full Text Available The constant improvement and falling prices of whole human genome Next Generation Sequencing (NGS) have resulted in rapid adoption of genomic information at both clinics and research institutions. Considered together, the complexity of genomics data, due to its large volume and diversity, along with the need for genomic data sharing, has resulted in the creation of Application Programming Interfaces (APIs) for secure, modular, interoperable access to genomic data from different applications, platforms, and even organizations. The Genomics APIs are a set of special protocols that assist software developers in dealing with multiple genomic data sources for building seamless, interoperable applications leading to the advancement of both genomic and clinical research. These APIs help define a standard for retrieval of genomic data from multiple sources as well as to better package genomic information for integration with Electronic Health Records. This review covers three currently available Genomics APIs: (a) Google Genomics, (b) SMART Genomics, and (c) 23andMe. The functionalities, reference implementations (if available), and authentication protocols of each API are reviewed. A comparative analysis of the different features across the three APIs is provided in the Discussion section. Though Genomics APIs are still under active development and have yet to reach widespread adoption, they hold the promise to make building of complicated genomics applications easier with downstream constructive effects on healthcare.

  4. Integrative bioinformatics analysis of genomic and proteomic approaches to understand the transcriptional regulatory program in coronary artery disease pathways.

    Directory of Open Access Journals (Sweden)

    Rajani Kanth Vangala

    Full Text Available Patients with cardiovascular disease show a panel of differentially regulated serum biomarkers indicative of modulation of several pathways from disease onset to progression. Few of these biomarkers have been proposed for multimarker risk prediction methods. However, the underlying mechanism of the expression changes and modulation of the pathways is not yet addressed in entirety. Our present work focuses on understanding the regulatory mechanisms at transcriptional level by identifying the core and specific transcription factors that regulate the coronary artery disease associated pathways. Using the principles of systems biology we integrated the genomics and proteomics data with computational tools. We selected biomarkers from 7 different pathways based on their association with the disease and assayed 24 biomarkers along with gene expression studies and built network modules which are highly regulated by 5 core regulators PPARG, EGR1, ETV1, KLF7 and ESRRA. These network modules in turn comprise of biomarkers from different pathways showing that the core regulatory transcription factors may work together in differential regulation of several pathways potentially leading to the disease. This kind of analysis can enhance the elucidation of mechanisms in the disease and give better strategies of developing multimarker module based risk predictions.

  5. Integrative bioinformatics analysis of genomic and proteomic approaches to understand the transcriptional regulatory program in coronary artery disease pathways.

    Science.gov (United States)

    Vangala, Rajani Kanth; Ravindran, Vandana; Ghatge, Madan; Shanker, Jayashree; Arvind, Prathima; Bindu, Hima; Shekar, Meghala; Rao, Veena S

    2013-01-01

    Patients with cardiovascular disease show a panel of differentially regulated serum biomarkers indicative of modulation of several pathways from disease onset to progression. Few of these biomarkers have been proposed for multimarker risk prediction methods. However, the underlying mechanism of the expression changes and modulation of the pathways is not yet addressed in entirety. Our present work focuses on understanding the regulatory mechanisms at transcriptional level by identifying the core and specific transcription factors that regulate the coronary artery disease associated pathways. Using the principles of systems biology we integrated the genomics and proteomics data with computational tools. We selected biomarkers from 7 different pathways based on their association with the disease and assayed 24 biomarkers along with gene expression studies and built network modules which are highly regulated by 5 core regulators PPARG, EGR1, ETV1, KLF7 and ESRRA. These network modules in turn comprise of biomarkers from different pathways showing that the core regulatory transcription factors may work together in differential regulation of several pathways potentially leading to the disease. This kind of analysis can enhance the elucidation of mechanisms in the disease and give better strategies of developing multimarker module based risk predictions.

  6. Genomic Expression Program Involving the Haa1p-Regulon in Saccharomyces cerevisiae Response to Acetic Acid

    Science.gov (United States)

    Becker, Jorg D.; Sá-Correia, Isabel

    2010-01-01

    The alterations occurring in yeast genomic expression during early response to acetic acid and the involvement of the transcription factor Haa1p in this transcriptional reprogramming are described in this study. Haa1p was found to regulate, directly or indirectly, the transcription of approximately 80% of the acetic acid-activated genes, suggesting that Haa1p is the main player in the control of yeast response to this weak acid. The genes identified in this work as being activated in response to acetic acid in a Haa1p-dependent manner include protein kinases, multidrug resistance transporters, proteins involved in lipid metabolism and in nucleic acid processing, and proteins of unknown function. Among these genes, the expression of SAP30 and HRK1 provided the strongest protective effect toward acetic acid. SAP30 encodes a subunit of a histone deacetylase complex and HRK1 encodes a protein kinase belonging to a family of protein kinases dedicated to the regulation of plasma membrane transporter activity. The deletion of the HRK1 gene was found to increase the accumulation of labeled acetic acid in acid-stressed yeast cells, suggesting that the role of both HAA1 and HRK1 in providing protection against acetic acid is, at least partially, related to their involvement in the reduction of intracellular acetate concentration. PMID:20955010

  7. HIV-1 and M-PMV RNA Nuclear Export Elements Program Viral Genomes for Distinct Cytoplasmic Trafficking Behaviors.

    Science.gov (United States)

    Pocock, Ginger M; Becker, Jordan T; Swanson, Chad M; Ahlquist, Paul; Sherer, Nathan M

    2016-04-01

    Retroviruses encode cis-acting RNA nuclear export elements that override nuclear retention of intron-containing viral mRNAs including the full-length, unspliced genomic RNAs (gRNAs) packaged into assembling virions. The HIV-1 Rev-response element (RRE) recruits the cellular nuclear export receptor CRM1 (also known as exportin-1/XPO1) using the viral protein Rev, while simple retroviruses encode constitutive transport elements (CTEs) that directly recruit components of the NXF1(Tap)/NXT1(p15) mRNA nuclear export machinery. How gRNA nuclear export is linked to trafficking machineries in the cytoplasm upstream of virus particle assembly is unknown. Here we used long-term (>24 h), multicolor live cell imaging to directly visualize HIV-1 gRNA nuclear export, translation, cytoplasmic trafficking, and virus particle production in single cells. We show that the HIV-1 RRE regulates unique, en masse, Rev- and CRM1-dependent "burst-like" transitions of mRNAs from the nucleus to flood the cytoplasm in a non-localized fashion. By contrast, the CTE derived from Mason-Pfizer monkey virus (M-PMV) links gRNAs to microtubules in the cytoplasm, driving them to cluster markedly to the centrosome that forms the pericentriolar core of the microtubule-organizing center (MTOC). Adding each export element to selected heterologous mRNAs was sufficient to confer each distinct export behavior, as was directing Rev/CRM1 or NXF1/NXT1 transport modules to mRNAs using a site-specific RNA tethering strategy. Moreover, multiple CTEs per transcript enhanced MTOC targeting, suggesting that a cooperative mechanism links NXF1/NXT1 to microtubules. Combined, these results reveal striking, unexpected features of retroviral gRNA nucleocytoplasmic transport and demonstrate roles for mRNA export elements that extend beyond nuclear pores to impact gRNA distribution in the cytoplasm.

  8. Breeding-assisted genomics.

    Science.gov (United States)

    Poland, Jesse

    2015-04-01

    The revolution of inexpensive sequencing has ushered in an unprecedented age of genomics. The promise of using this technology to accelerate plant breeding is being realized with a vision of genomics-assisted breeding that will lead to rapid genetic gain for expensive and difficult traits. The reality is now that robust phenotypic data is an increasing limiting resource to complement the current wealth of genomic information. While genomics has been hailed as the discipline to fundamentally change the scope of plant breeding, a more symbiotic relationship is likely to emerge. In the context of developing and evaluating large populations needed for functional genomics, none excel in this area more than plant breeders. While genetic studies have long relied on dedicated, well-structured populations, the resources dedicated to these populations in the context of readily available, inexpensive genotyping is making this philosophy less tractable relative to directly focusing functional genomics on material in breeding programs. Through shifting effort for basic genomic studies from dedicated structured populations, to capturing the entire scope of genetic determinants in breeding lines, we can move towards not only furthering our understanding of functional genomics in plants, but also rapidly improving crops for increased food security, availability and nutrition.

  9. Bisprimer--a program for the design of primers for bisulfite-based genomic sequencing of both plant and Mammalian DNA samples.

    Science.gov (United States)

    Kovacova, Viera; Janousek, Bohuslav

    2012-01-01

    Plants and animals differ in the sequence context of the methylated sites in DNA. Plants exhibit cytosine methylation in CG, CHG, and CHH sites, whereas CG methylation is the only form present in mammals (with an exception of the early embryonic development). This fact must be taken into account in the design of primers for bisulfite-based genomic sequencing because CHG and CHH sites can remain unmodified. Surprisingly, no user-friendly primer design program is publicly available that could be used to design primers in plants and to simultaneously check the properties of primers such as the potential for primer-dimer formation. For studies concentrating on particular DNA loci, the correct design of primers is crucial. The program, called BisPrimer, includes 2 different subprograms for the primer design, the first one for mammals and the second one for angiosperm plants. Each subprogram is divided into 2 variants. The first variant serves to design primers that preferentially bind to the bisulfite-modified primer-binding sites (C to U conversion). This type of primer preferentially amplifies the bisulfite-converted DNA strands. This feature can help to avoid problems connected with an incomplete bisulfite modification that can sometimes occur for technical reasons. The second variant is intended for the analysis of samples that are supposed to consist of a mixture of DNA molecules that have different levels of cytosine methylation (e.g., pollen DNA). In this case, the aim is to minimize the selection in favor of either less methylated or more methylated molecules.
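
    The sequence contexts and the C to U conversion described above can be illustrated with a minimal Python sketch. It is not BisPrimer's code: the function names `cytosine_context` and `bisulfite_convert` and the example sequence are hypothetical, and the sketch assumes a single strand containing only A, C, G, and T, with converted cytosines written as T (as they read after PCR amplification).

```python
from typing import Iterable, Optional


def cytosine_context(seq: str, i: int) -> Optional[str]:
    """Classify the cytosine at index i as 'CG', 'CHG', or 'CHH' (H = A, C, or T).

    Returns None if seq[i] is not a cytosine or if too few downstream
    bases are available to decide.
    """
    seq = seq.upper()
    if seq[i] != "C":
        return None
    downstream = seq[i + 1 : i + 3]
    if downstream[:1] == "G":
        return "CG"
    if len(downstream) == 2:
        return "CHG" if downstream[1] == "G" else "CHH"
    return None


def bisulfite_convert(seq: str, methylated: Iterable[int] = ()) -> str:
    """Convert every unmethylated C to T; methylated positions stay C."""
    protected = set(methylated)
    return "".join(
        "T" if base == "C" and i not in protected else base
        for i, base in enumerate(seq.upper())
    )


example = "ACGTCAGCTT"  # hypothetical template, cytosines at indices 1, 4, 7
print([cytosine_context(example, i) for i in (1, 4, 7)])  # ['CG', 'CHG', 'CHH']
print(bisulfite_convert(example))       # no methylation:      ATGTTAGTTT
print(bisulfite_convert(example, {1}))  # CG site methylated:  ACGTTAGTTT
```

    In a mammalian sample only the CG cytosine could remain as C after treatment, whereas in a plant sample any of the three contexts might, which is why the abstract stresses that CHG and CHH sites must be considered when placing primers.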

  10. GRAbB : Selective Assembly of Genomic Regions, a New Niche for Genomic Research

    NARCIS (Netherlands)

    Brankovics, Balázs; Zhang, Hao; van Diepeningen, Anne D; van der Lee, Theo A J; Waalwijk, Cees; de Hoog, G Sybren

    2016-01-01

    GRAbB (Genomic Region Assembly by Baiting) is a new program that is dedicated to assemble specific genomic regions from NGS data. This approach is especially useful when dealing with multi copy regions, such as mitochondrial genome and the rDNA repeat region, parts of the genome that are often negle

  11. Genomic medicine implementation: learning by example.

    Science.gov (United States)

    Williams, Marc S

    2014-03-01

    Genomic Medicine is beginning to emerge into clinical practice. The National Human Genome Research Institute's Genomic Medicine Working Group consists of organizations that have begun to implement some aspect of genomic medicine (e.g., family history, systematic implementation of Mendelian disease program, pharmacogenomics, whole exome/genome sequencing). This article concisely reviews the working group and provides a broader context for the articles in the special issue including an assessment of anticipated provider needs and ethical, legal, and social issues relevant to the implementation of genomic medicine. The challenges of implementation of innovation in clinical practice and the potential value of genomic medicine are discussed.

  12. Cancer genomics

    DEFF Research Database (Denmark)

    Norrild, Bodil; Guldberg, Per; Ralfkiær, Elisabeth Methner

    2007-01-01

    Almost all cells in the human body contain a complete copy of the genome with an estimated number of 25,000 genes. The sequences of these genes make up about three percent of the genome and comprise the inherited set of genetic information. The genome also contains information that determines whe...

  13. [Results of work on the project "Instruments, reagents, probes" of the state scientific-technical program "Human genome" (1989-1994)].

    Science.gov (United States)

    Tverdokhlebov, E N

    1995-01-01

    This report reviews the activities of the "Reagents, Devices, Probes" branch of the Russian State "Human Genome" Project for six-year period (1989-1994). Data on pilot and commercial production of reagents and equipment for human genome studies along with information on the project costs and awarded grants are presented.

  14. Genomic Resources for Cancer Epidemiology

    Science.gov (United States)

    This page provides links to research resources, compiled by the Epidemiology and Genomics Research Program, that may be of interest to genetic epidemiologists conducting cancer research; the list is not exhaustive.

  15. Collaborators | Office of Cancer Genomics

    Science.gov (United States)

    The TARGET initiative is jointly managed within the National Cancer Institute (NCI) by the Office of Cancer Genomics (OCG) and the Cancer Therapy Evaluation Program (CTEP).

  16. The Genome Atlas Resource

    DEFF Research Database (Denmark)

    Azam Qureshi, Matloob; Rotenberg, Eva; Stærfeldt, Hans Henrik;

    2010-01-01

    with scripts and algorithms developed in a variety of programming languages at the Centre for Biological Sequence Analysis in order to create a three-tier software application for genome analysis. The results are made available via a web interface developed in Java, PHP and Perl CGI. User...

  17. Molluscan Evolutionary Genomics

    Energy Technology Data Exchange (ETDEWEB)

    Simison, W. Brian; Boore, Jeffrey L.

    2005-12-01

    In the last 20 years there have been dramatic advances in techniques of high-throughput DNA sequencing, most recently accelerated by the Human Genome Project, a program that has determined the three billion base pair code on which we are based. Now this tremendous capability is being directed at other genome targets that are being sampled across the broad range of life. This opens up opportunities as never before for evolutionary and organismal biologists to address questions of both processes and patterns of organismal change. We stand at the dawn of a new 'modern synthesis' period, paralleling that of the early 20th century when the fledgling field of genetics first identified the underlying basis for Darwin's theory. We must now unite the efforts of systematists, paleontologists, mathematicians, computer programmers, molecular biologists, developmental biologists, and others in the pursuit of discovering what genomics can teach us about the diversity of life. Genome-level sampling for mollusks to date has mostly been limited to mitochondrial genomes and it is likely that these will continue to provide the best targets for broad phylogenetic sampling in the near future. However, we are just beginning to see an inroad into complete nuclear genome sequencing, with several mollusks and other eutrochozoans having been selected for work about to begin. Here, we provide an overview of the state of molluscan mitochondrial genomics, highlight a few of the discoveries from this research, outline the promise of broadening this dataset, describe upcoming projects to sequence whole mollusk nuclear genomes, and challenge the community to prepare for making the best use of these data.

  18. Screening of genomic libraries.

    Science.gov (United States)

    Novelli, Valdenice M; Cristofani-Yaly, Mariângela; Bastianel, Marinês; Palmieri, Dario A; Machado, Marcos A

    2013-01-01

    Microsatellites, or simple sequence repeats (SSRs), have proven to be an important molecular marker in plant genetics and breeding research. These markers are mainly obtained either from genomic DNA or from expressed sequence tags (ESTs) derived from mRNA/cDNA libraries. Genetic studies using microsatellite markers have increased rapidly because they can be highly polymorphic, codominant markers and they show heterozygous conserved sequences. Here, we describe a methodology to obtain microsatellites using an enriched library of genomic DNA sequences. This method is highly efficient for developing microsatellite markers, especially in plants that do not have available ESTs or genome databases. This methodology has been used to enrich SSR marker libraries in Citrus spp., an important tool to genotype germplasm, to select zygotic hybrids, and to saturate genetic maps in breeding programs.
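
    Once clones from an enriched library have been sequenced, candidate microsatellites can be located by scanning the reads for tandem repeats. The following is a rough Python sketch of that scanning step only, not the protocol described in the record: the function name `find_ssrs`, the thresholds (repeat units of 2 to 6 bases, at least 4 copies), and the example read are all assumptions chosen for illustration.

```python
import re
from typing import List, Tuple


def find_ssrs(seq: str, min_unit: int = 2, max_unit: int = 6,
              min_repeats: int = 4) -> List[Tuple[int, str, int]]:
    """Return (start, repeat_unit, copy_number) for each tandem repeat found.

    The lazy quantifier makes the regex try the shortest repeat unit first,
    and the backreference \\1 requires further exact copies of that unit.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        if len(set(unit)) == 1:
            continue  # skip single-base runs such as AAAAAAAA
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits


# Hypothetical read containing a (GA)5 and a (CAG)4 repeat.
read = "TTGACC" + "GA" * 5 + "CCTT" + "CAG" * 4 + "TA"
print(find_ssrs(read))  # [(6, 'GA', 5), (20, 'CAG', 4)]
```

    Real marker development would additionally check flanking sequence quality and design primers around each repeat; this sketch only shows what counts as an SSR hit.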

  19. Fungal genome sequencing: basic biology to biotechnology.

    Science.gov (United States)

    Sharma, Krishna Kant

    2016-08-01

    The genome sequences provide a first glimpse into the genomic basis of the biological diversity of filamentous fungi and yeast. The genome sequence of the budding yeast, Saccharomyces cerevisiae, with a small genome size, unicellular growth, and rich history of genetic and molecular analyses was a milestone of early genomics in the 1990s. The subsequent completion of fission yeast, Schizosaccharomyces pombe and genetic model, Neurospora crassa initiated a revolution in the genomics of the fungal kingdom. In due course of time, a substantial number of fungal genomes have been sequenced and publicly released, representing the widest sampling of genomes from any eukaryotic kingdom. An ambitious genome-sequencing program provides a wealth of data on metabolic diversity within the fungal kingdom, thereby enhancing research into medical science, agriculture science, ecology, bioremediation, bioenergy, and the biotechnology industry. Fungal genomics have higher potential to positively affect human health, environmental health, and the planet's stored energy. With a significant increase in sequenced fungal genomes, the known diversity of genes encoding organic acids, antibiotics, enzymes, and their pathways has increased exponentially. Currently, over a hundred fungal genome sequences are publicly available; however, no inclusive review has been published. This review is an initiative to address the significance of the fungal genome-sequencing program and provides the road map for basic and applied research.

  20. Antarctic Genomics

    Directory of Open Access Journals (Sweden)

    Alex D. Rogers

    2006-03-01

    Full Text Available With the development of genomic science and its battery of technologies, polar biology stands on the threshold of a revolution, one that will enable the investigation of important questions of unprecedented scope and with extraordinary depth and precision. The exotic organisms of polar ecosystems are ideal candidates for genomic analysis. Through such analyses, it will be possible to learn not only the novel features that enable polar organisms to survive, and indeed thrive, in their extreme environments, but also fundamental biological principles that are common to most, if not all, organisms. This article aims to review recent developments in Antarctic genomics and to demonstrate the global context of such studies.

  1. Genome-wide association and genomic selection in animal breeding.

    Science.gov (United States)

    Hayes, Ben; Goddard, Mike

    2010-11-01

    Results from genome-wide association studies in livestock and humans have led to the conclusion that the effects of individual quantitative trait loci (QTL) on complex traits, such as yield, are likely to be small; therefore, a large number of QTL are necessary to explain genetic variation in these traits. Given this genetic architecture, gains from marker-assisted selection (MAS) programs using only a small number of DNA markers to trace a limited number of QTL are likely to be small. This has led to the development of an alternative technology for using the available dense single nucleotide polymorphism (SNP) information, called genomic selection. Genomic selection uses a genome-wide panel of dense markers so that all QTL are likely to be in linkage disequilibrium with at least one SNP. The genomic breeding values are predicted to be the sum of the effects of these SNPs across the entire genome. In dairy cattle breeding, the accuracy of genomic estimated breeding values (GEBV) that can be achieved and the fact that these are available early in life have led to rapid adoption of the technology. Here, we discuss the design of experiments necessary to achieve accurate prediction of GEBV in future generations in terms of the number of markers necessary and the size of the reference population where marker effects are estimated. We also present a simple method for implementing genomic selection using a genomic relationship matrix. Future challenges discussed include using whole genome sequence data to improve the accuracy of genomic selection and management of inbreeding through genomic relationships.
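
    The two computations named in this abstract, a breeding value as a sum of SNP effects and a genomic relationship matrix, can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the authors' method: genotypes are coded as 0/1/2 copies of one allele, the SNP effects are made-up numbers, allele frequencies are estimated from the sample itself, and the relationship matrix uses one commonly cited scaling, G = ZZ' / (2 Σ p(1 − p)), which the abstract does not specify. The function names are hypothetical.

```python
from typing import List, Sequence


def gebv(genotypes: Sequence[Sequence[int]],
         snp_effects: Sequence[float]) -> List[float]:
    """Breeding value per animal: sum over SNPs of allele dosage x effect."""
    return [sum(g * a for g, a in zip(row, snp_effects)) for row in genotypes]


def genomic_relationship_matrix(
        genotypes: Sequence[Sequence[int]]) -> List[List[float]]:
    """G = Z Z' / (2 * sum p(1 - p)), with Z the frequency-centred dosages."""
    n, m = len(genotypes), len(genotypes[0])
    # Allele frequency of each SNP, estimated from this sample (assumption).
    p = [sum(row[k] for row in genotypes) / (2 * n) for k in range(m)]
    z = [[row[k] - 2 * p[k] for k in range(m)] for row in genotypes]
    scale = 2 * sum(pk * (1 - pk) for pk in p)
    return [
        [sum(z[i][k] * z[j][k] for k in range(m)) / scale for j in range(n)]
        for i in range(n)
    ]


# Three animals, three SNPs; dosages and effects are invented for illustration.
dosages = [[0, 1, 2], [1, 1, 0], [2, 0, 1]]
effects = [0.5, -0.2, 0.1]
print(gebv(dosages, effects))
for row in genomic_relationship_matrix(dosages):
    print([round(x, 3) for x in row])
```

    In practice the SNP effects would be estimated in a reference population with both genotypes and phenotypes, and the relationship matrix would replace the pedigree-based matrix in a mixed-model analysis; both steps are outside this sketch.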

  2. Herbarium genomics

    DEFF Research Database (Denmark)

    Bakker, Freek T.; Lei, Di; Yu, Jiaying

    2016-01-01

    Herbarium genomics is proving promising as next-generation sequencing approaches are well suited to deal with the usually fragmented nature of archival DNA. We show that routine assembly of partial plastome sequences from herbarium specimens is feasible, from total DNA extracts and with specimens...... up to 146 years old. We use genome skimming and an automated assembly pipeline, Iterative Organelle Genome Assembly, that assembles paired-end reads into a series of candidate assemblies, the best one of which is selected based on likelihood estimation. We used 93 specimens from 12 different...... correlation between plastome coverage and nuclear genome size (C value) in our samples, but the range of C values included is limited. Finally, we conclude that routine plastome sequencing from herbarium specimens is feasible and cost-effective (compared with Sanger sequencing or plastome...

  3. Genome sequencing conference II

    Energy Technology Data Exchange (ETDEWEB)

    1990-01-01

    Genome Sequencing Conference 2 was held September 30 to October 30, 1990. 26 speaker abstracts and 33 poster presentations were included in the program report. New and improved methods for DNA sequencing and genetic mapping were presented. Many of the papers were concerned with accuracy and speed of acquisition of data with computers and automation playing an increasing role. Individual papers have been processed separately for inclusion on the database.

  4. The life cycle of a genome project: perspectives and guidelines inspired by insect genome projects.

    Science.gov (United States)

    Papanicolaou, Alexie

    2016-01-01

    Many research programs on non-model species biology have been empowered by genomics. In turn, genomics is underpinned by a reference sequence and ancillary information created by so-called "genome projects". The most reliable genome projects are the ones created as part of an active research program and designed to address specific questions but their life extends past publication. In this opinion paper I outline four key insights that have facilitated maintaining genomic communities: the key role of computational capability, the iterative process of building genomic resources, the value of community participation and the importance of manual curation. Taken together, these ideas can and do ensure the longevity of genome projects and the growing non-model species community can use them to focus a discussion with regards to its future genomic infrastructure.

  5. 3D Genome Tuner: Compare Multiple Circular Genomes in a 3D Context

    Institute of Scientific and Technical Information of China (English)

    Qi Wang; Qun Liang; Xiuqing Zhang

    2009-01-01

    Circular genomes, being the largest proportion of sequenced genomes, play an important role in genome analysis. However, the traditional 2D circular map only provides an overview and annotations of a genome but does not offer feature-based comparison. To remedy these shortcomings, we developed 3D Genome Tuner, a hybrid of circular map and comparative map tools. Its capability of viewing comparisons between multiple circular maps in a 3D space offers great benefits to the study of comparative genomics. The program is freely available (under an LGPL licence) at http://sourceforge.net/projects/dgenometuner.

  6. 78 FR 20933 - National Human Genome Research Institute; Notice of Closed Meeting

    Science.gov (United States)

    2013-04-08

    ... HUMAN SERVICES National Institutes of Health National Human Genome Research Institute; Notice of Closed... of Committee: National Human Genome Research Institute Special Emphasis Panel Loan Repayment Program... applications. Place: National Human Genome Research Institute, Room 3055, 5635 Fishers Lane, Rockville,...

  7. 78 FR 68856 - National Human Genome Research Institute; Notice of Closed Meeting

    Science.gov (United States)

    2013-11-15

    ... HUMAN SERVICES National Institutes of Health National Human Genome Research Institute; Notice of Closed... Review Officer, Scientific Review Branch, National Human Genome Research Institute, National Institutes... of Federal Domestic Assistance Program Nos. 93.172, Human Genome Research, National Institutes...

  8. Prospects for Genomic Research in Forestry

    Directory of Open Access Journals (Sweden)

    K. V. Krutovsky

    2014-08-01

    Conifers are keystone species of boreal forests. Their whole genome sequencing, assembly and annotation will allow us to understand the evolution of the complex ancient giant conifer genomes that are 4 times larger in larch and 7–9 times larger in pines than the human genome. Genomic studies will also make it possible to obtain important whole genome sequence data and to develop highly polymorphic and informative genetic markers, such as microsatellites and single nucleotide polymorphisms (SNPs), which can be efficiently used in timber origin identification, for genetic variation monitoring, to study local and climate change adaptation, and in tree improvement and conservation programs.

  9. Recent advances in fruit crop genomics

    Directory of Open Access Journals (Sweden)

    Qiang XU,Chaoyang LIU,Manosh Kumar BISWAS,Zhiyong PAN,Xiuxin DENG

    2014-02-01

    In recent years, dramatic progress has been made in the genomics of fruit crops. The publication of a dozen fruit crop genomes represents a milestone for both functional genomics and breeding programs in fruit crops. Rapid advances in high-throughput sequencing technology have revolutionized the manner and scale of genomics in fruit crops. Research on fruit crops encompasses a wide range of biological questions that are unique and cannot be addressed in a model plant such as Arabidopsis. This review summarizes recent achievements of research on the genome, transcriptome, proteome, miRNAs and epigenome of fruit crops.

  10. Endometrial and acute myeloid leukemia cancer genomes characterized

    Science.gov (United States)

    Two studies from The Cancer Genome Atlas (TCGA) program reveal details about the genomic landscapes of acute myeloid leukemia (AML) and endometrial cancer. Both provide new insights into the molecular underpinnings of these cancers.

  11. Genome databases

    Energy Technology Data Exchange (ETDEWEB)

    Courteau, J.

    1991-10-11

    Since the Genome Project began several years ago, a plethora of databases have been developed or are in the works. They range from the massive Genome Data Base at Johns Hopkins University, the central repository of all gene mapping information, to small databases focusing on single chromosomes or organisms. Some are publicly available, others are essentially private electronic lab notebooks. Still others limit access to a consortium of researchers working on, say, a single human chromosome. An increasing number incorporate sophisticated search and analytical software, while others operate as little more than data lists. In consultation with numerous experts in the field, a list has been compiled of some key genome-related databases. The list was not limited to map and sequence databases but also included the tools investigators use to interpret and elucidate genetic data, such as protein sequence and protein structure databases. Because a major goal of the Genome Project is to map and sequence the genomes of several experimental animals, including E. coli, yeast, fruit fly, nematode, and mouse, the available databases for those organisms are listed as well. The author also includes several databases that are still under development - including some ambitious efforts that go beyond data compilation to create what are being called electronic research communities, enabling many users, rather than just one or a few curators, to add or edit the data and tag it as raw or confirmed.

  12. Single genome amplification of proviral HIV-1 DNA from dried blood spot specimens collected during early infant screening programs in Lusaka, Zambia.

    Science.gov (United States)

    Seu, Lillian; Mwape, Innocent; Guffey, M Bradford

    2014-07-01

    The ability to evaluate individual HIV-1 virions from the quasispecies of vertically infected infants was evaluated in a field setting at the Centre for Infectious Disease Research in Zambia. Infant heel-prick blood specimens were spotted onto dried blood spot (DBS) filter paper cards at government health clinics. Nucleic acid was extracted and used as a template for HIV-1 proviral DNA detection by a commercial Amplicor HIV-1 PCR test (Roche, version 1.5). On samples that tested positive by commercial diagnostic assay, amplification of DNA was performed using an in-house assay of the 5' and 3' region of the HIV-1 genome. Additionally, fragments covering 1200 nucleotides within pol (full length protease and partial reverse transcriptase) and 1400 nucleotides within env (variable 1-variable 5 region) were further analyzed by single genome amplification (SGA). In summary, we have demonstrated an in-house assay for amplifying the 5' and 3' proviral HIV-1 DNA as well as pol and env proviral DNA fragments from DBS cards collected and analyzed entirely in Zambia. In conclusion, this study shows the feasibility of utilizing DBS cards to amplify the whole proviral HIV-1 genome as well as perform SGA on key HIV-1 genes.

  13. Cephalopod genomics

    DEFF Research Database (Denmark)

    Albertin, Caroline B.; Bonnaud, Laure; Brown, C. Titus

    2012-01-01

    The Cephalopod Sequencing Consortium (CephSeq Consortium) was established at a NESCent Catalysis Group Meeting, "Paths to Cephalopod Genomics-Strategies, Choices, Organization," held in Durham, North Carolina, USA on May 24-27, 2012. Twenty-eight participants representing nine countries (Austria, Australia, China, Denmark, France, Italy, Japan, Spain and the USA) met to address the pressing need for genome sequencing of cephalopod mollusks. This group, drawn from cephalopod biologists, neuroscientists, developmental and evolutionary biologists, materials scientists, bioinformaticians and researchers active in sequencing, assembling and annotating genomes, agreed on a set of cephalopod species of particular importance for initial sequencing and developed strategies and an organization (CephSeq Consortium) to promote this sequencing. The conclusions and recommendations of this meeting are described...

  14. Listeria Genomics

    Science.gov (United States)

    Cabanes, Didier; Sousa, Sandra; Cossart, Pascale

    The opportunistic intracellular foodborne pathogen Listeria monocytogenes has become a paradigm for the study of host-pathogen interactions and bacterial adaptation to mammalian hosts. Analysis of L. monocytogenes infection has provided considerable insight into how bacteria invade cells, move intracellularly, and disseminate in tissues, as well as tools to address fundamental processes in cell biology. Moreover, the vast amount of knowledge that has been gathered through in-depth comparative genomic analyses and in vivo studies makes L. monocytogenes one of the most well-studied bacterial pathogens. This chapter provides an overview of progress in the exploration of genomic, transcriptomic, and proteomic data in Listeria spp. to understand genome evolution and diversity, as well as physiological aspects of metabolism used by bacteria when growing in diverse environments, in particular in infected hosts.

  15. Genome Sequencing

    DEFF Research Database (Denmark)

    Sato, Shusei; Andersen, Stig Uggerhøj

    2014-01-01

    The current Lotus japonicus reference genome sequence is based on a hybrid assembly of Sanger TAC/BAC, Sanger shotgun and Illumina shotgun sequencing data generated from the Miyakojima-MG20 accession. It covers nearly all expressed L. japonicus genes and has been annotated mainly based on transcr...

  16. Genome Improvement at JGI-HAGSC

    Energy Technology Data Exchange (ETDEWEB)

    Grimwood, Jane; Schmutz, Jeremy J.; Myers, Richard M.

    2012-03-03

    Since the completion of the sequencing of the human genome, the Joint Genome Institute (JGI) has rapidly expanded its scientific goals in several DOE mission-relevant areas. At the JGI-HAGSC, we have kept pace with this rapid expansion of projects with our focus on assessing, assembling, improving and finishing eukaryotic whole genome shotgun (WGS) projects for which the shotgun sequence is generated at the Production Genomic Facility (JGI-PGF). We follow this by combining the draft WGS with genomic resources generated at JGI-HAGSC or in collaborator laboratories (including BAC end sequences, genetic maps and FLcDNA sequences) to produce an improved draft sequence. For eukaryotic genomes important to the DOE mission, we then add further information from directed experiments to produce reference genomic sequences that are publicly available for any scientific researcher. Also, we have continued our program for producing BAC-based finished sequence, both for adding information to JGI genome projects and for small BAC-based sequencing projects proposed through any of the JGI sequencing programs. We have now built our computational expertise in WGS assembly and analysis and have moved eukaryotic genome assembly from the JGI-PGF to JGI-HAGSC. We have concentrated our assembly development work on large plant genomes and complex fungal and algal genomes.

  17. Fungal Genomics for Energy and Environment

    Energy Technology Data Exchange (ETDEWEB)

    Grigoriev, Igor V.

    2013-03-11

    Genomes of fungi relevant to energy and environment are the focus of the Fungal Genomic Program at the US Department of Energy Joint Genome Institute (JGI). One of its projects, the Genomics Encyclopedia of Fungi, targets fungi related to plant health (symbionts, pathogens, and biocontrol agents) and biorefinery processes (cellulose degradation, sugar fermentation, industrial hosts) by means of genome sequencing and analysis. New chapters of the Encyclopedia can be opened with user proposals to the JGI Community Sequencing Program (CSP). Another JGI project, the 1000 fungal genomes, explores fungal diversity at the genome level at scale and is open for users to nominate new species for sequencing. Over 200 fungal genomes have been sequenced by JGI to date and released through MycoCosm (www.jgi.doe.gov/fungi), a fungal web portal that integrates sequence and functional data with genome analysis tools for the user community. Sequence analysis supported by functional genomics leads to the development of parts lists for complex systems ranging from ecosystems of biofuel crops to biorefineries. Recent examples of such parts suggested by comparative genomics and functional analysis in these areas are presented here.

  18. Ancient genomics

    DEFF Research Database (Denmark)

    Der Sarkissian, Clio; Allentoft, Morten Erik; Avila Arcos, Maria del Carmen;

    2015-01-01

    , archaic hominins, ancient pathogens and megafaunal species. Those have revealed important functional and phenotypic information, as well as unexpected adaptation, migration and admixture patterns. As such, the field of aDNA has entered the new era of genomics and has provided valuable information when...

  19. Cephalopod genomics

    DEFF Research Database (Denmark)

    Albertin, Caroline B.; Bonnaud, Laure; Brown, C. Titus

    2012-01-01

    The Cephalopod Sequencing Consortium (CephSeq Consortium) was established at a NESCent Catalysis Group Meeting, "Paths to Cephalopod Genomics-Strategies, Choices, Organization," held in Durham, North Carolina, USA on May 24-27, 2012. Twenty-eight participants representing nine countries (Austri...

  20. Ancient genomics

    DEFF Research Database (Denmark)

    Der Sarkissian, Clio; Allentoft, Morten Erik; Avila Arcos, Maria del Carmen

    2015-01-01

    by increasing the number of sequence reads to billions effectively means that contamination issues that have haunted aDNA research for decades, particularly in human studies, can now be efficiently and confidently quantified. At present, whole genomes have been sequenced from ancient anatomically modern humans...

  1. Genomics technologies to study structural variations in the grapevine genome

    Directory of Open Access Journals (Sweden)

    Cardone Maria Francesca

    2016-01-01

    Grapevine is one of the most important crop plants in the world. Recently there has been a great expansion of genomic resources for the grapevine genome, supporting growing efforts in molecular breeding. Current cultivars display a great level of inter-specific differentiation that needs to be investigated to reach a comprehensive understanding of the genetic basis of phenotypic differences, and to find the responsible genes selected by cross breeding programs. While there have been significant advances in resolving the pattern and nature of single nucleotide polymorphisms (SNPs) in plant genomes, few data are available on copy number variation (CNV). Furthermore, association between structural variations and phenotypes has been described in only a few cases. We combined high-throughput biotechnologies and bioinformatics tools to reveal the first inter-varietal atlas of structural variation (SV) for the grapevine genome. We sequenced and compared four table grape cultivars with the Pinot noir inbred line PN40024 genome as the reference. We detected roughly 8% of the grapevine genome affected by genomic variations. Taking into account the phenotypic differences among the studied varieties, we compared SVs among them and the reference, and then performed an in-depth analysis of the gene content of polymorphic regions. This allowed us to identify genes showing differences in copy number as putative functional candidates for important traits in grapevine cultivation.

  2. The Human Genome Initiative of the Department of Energy

    Science.gov (United States)

    1988-01-01

    The structural characterization of genes and elucidation of their encoded functions have become a cornerstone of modern health research, biology and biotechnology. A genome program is an organized effort to locate and identify the functions of all the genes of an organism. Beginning with the DOE-sponsored, 1986 human genome workshop at Santa Fe, the value of broadly organized efforts supporting total genome characterization became a subject of intensive study. There is now national recognition that benefits will rapidly accrue from an effective scientific infrastructure for total genome research. In the US genome research is now receiving dedicated funds. Several other nations are implementing genome programs. Supportive infrastructure is being improved through both national and international cooperation. The Human Genome Initiative of the Department of Energy (DOE) is a focused program of Resource and Technology Development, with objectives of speeding and bringing economies to the national human genome effort. This report relates the origins and progress of the Initiative.

  3. The Human Genome Initiative of the Department of Energy

    Energy Technology Data Exchange (ETDEWEB)

    None

    1988-01-01

    The structural characterization of genes and elucidation of their encoded functions have become a cornerstone of modern health research, biology and biotechnology. A genome program is an organized effort to locate and identify the functions of all the genes of an organism. Beginning with the DOE-sponsored, 1986 human genome workshop at Santa Fe, the value of broadly organized efforts supporting total genome characterization became a subject of intensive study. There is now national recognition that benefits will rapidly accrue from an effective scientific infrastructure for total genome research. In the US genome research is now receiving dedicated funds. Several other nations are implementing genome programs. Supportive infrastructure is being improved through both national and international cooperation. The Human Genome Initiative of the Department of Energy (DOE) is a focused program of Resource and Technology Development, with objectives of speeding and bringing economies to the national human genome effort. This report relates the origins and progress of the Initiative. 34 refs.

  4. Fueling the Future with Fungal Genomes

    Energy Technology Data Exchange (ETDEWEB)

    Grigoriev, Igor V.

    2014-10-27

    Genomes of fungi relevant to energy and environment are the focus of the JGI Fungal Genomic Program. One of its projects, the Genomics Encyclopedia of Fungi, targets fungi related to plant health (symbionts and pathogens) and biorefinery processes (cellulose degradation and sugar fermentation) by means of genome sequencing and analysis. New chapters of the Encyclopedia can be opened with user proposals to the JGI Community Science Program (CSP). Another JGI project, the 1000 fungal genomes, explores fungal diversity at the genome level at scale and is open for users to nominate new species for sequencing. Over 400 fungal genomes have been sequenced by JGI to date and released through MycoCosm (www.jgi.doe.gov/fungi), a fungal web portal that integrates sequence and functional data with genome analysis tools for the user community. Sequence analysis supported by functional genomics will lead to the development of parts lists for complex systems ranging from ecosystems of biofuel crops to biorefineries. Recent examples of such ‘parts’ suggested by comparative genomics and functional analysis in these areas are presented here.

  5. Genome Annotation Transfer Utility (GATU): rapid annotation of viral genomes using a closely related reference genome

    Directory of Open Access Journals (Sweden)

    Upton Chris

    2006-06-01

    annotations to the target genome. The program is freely available under the General Public License and can be accessed along with documentation and tutorial from http://www.virology.ca/gatu.

  6. Genome Annotation Transfer Utility (GATU): rapid annotation of viral genomes using a closely related reference genome.

    Science.gov (United States)

    Tcherepanov, Vasily; Ehlers, Angelika; Upton, Chris

    2006-06-13

    Since DNA sequencing has become easier and cheaper, an increasing number of closely related viral genomes have been sequenced. However, many of these have been deposited in GenBank without annotations, severely limiting their value to researchers. While maintaining comprehensive genomic databases for a set of virus families at the Viral Bioinformatics Resource Center http://www.biovirus.org and Viral Bioinformatics - Canada http://www.virology.ca, we found that researchers were unnecessarily spending time annotating viral genomes that were close relatives of already annotated viruses. We have therefore designed and implemented a novel tool, Genome Annotation Transfer Utility (GATU), to transfer annotations from a previously annotated reference genome to a new target genome, thereby greatly reducing this laborious task. GATU transfers annotations from a reference genome to a closely related target genome, while still giving the user final control over which annotations should be included. GATU also detects open reading frames present in the target but not the reference genome and provides the user with a variety of bioinformatics tools to quickly determine if these ORFs should also be included in the annotation. After this process is complete, GATU saves the newly annotated genome as a GenBank, EMBL or XML-format file. The software is coded in Java and runs on a variety of computer platforms. Its user-friendly Graphical User Interface is specifically designed for users trained in the biological sciences. GATU greatly simplifies the initial stages of genome annotation by using a closely related genome as a reference. It is not intended to be a gene prediction tool or a "complete" annotation system, but we have found that it significantly reduces the time required for annotation of genes and mature peptides as well as helping to standardize gene names between related organisms by transferring reference genome annotations to the target genome. The program is freely

  7. Ancient genomics.

    Science.gov (United States)

    Der Sarkissian, Clio; Allentoft, Morten E; Ávila-Arcos, María C; Barnett, Ross; Campos, Paula F; Cappellini, Enrico; Ermini, Luca; Fernández, Ruth; da Fonseca, Rute; Ginolhac, Aurélien; Hansen, Anders J; Jónsson, Hákon; Korneliussen, Thorfinn; Margaryan, Ashot; Martin, Michael D; Moreno-Mayar, J Víctor; Raghavan, Maanasa; Rasmussen, Morten; Velasco, Marcela Sandoval; Schroeder, Hannes; Schubert, Mikkel; Seguin-Orlando, Andaine; Wales, Nathan; Gilbert, M Thomas P; Willerslev, Eske; Orlando, Ludovic

    2015-01-19

    The past decade has witnessed a revolution in ancient DNA (aDNA) research. Although the field's focus was previously limited to mitochondrial DNA and a few nuclear markers, whole genome sequences from the deep past can now be retrieved. This breakthrough is tightly connected to the massive sequence throughput of next generation sequencing platforms and the ability to target short and degraded DNA molecules. Many ancient specimens previously unsuitable for DNA analyses because of extensive degradation can now successfully be used as source materials. Additionally, the analytical power obtained by increasing the number of sequence reads to billions effectively means that contamination issues that have haunted aDNA research for decades, particularly in human studies, can now be efficiently and confidently quantified. At present, whole genomes have been sequenced from ancient anatomically modern humans, archaic hominins, ancient pathogens and megafaunal species. Those have revealed important functional and phenotypic information, as well as unexpected adaptation, migration and admixture patterns. As such, the field of aDNA has entered the new era of genomics and has provided valuable information when testing specific hypotheses related to the past.

  8. Visualization for genomics: the Microbial Genome Viewer.

    NARCIS (Netherlands)

    Kerkhoven, R.; Enckevort, F.H.J. van; Boekhorst, J.; Molenaar, D.; Siezen, R.J.

    2004-01-01

    SUMMARY: A Web-based visualization tool, the Microbial Genome Viewer, is presented that allows the user to combine complex genomic data in a highly interactive way. This Web tool enables the interactive generation of chromosome wheels and linear genome maps from genome annotation data stored in a My

  9. The function genomics study

    Institute of Scientific and Technical Information of China (English)

    2001-01-01

    Genomics is a biological term that appeared ten years ago to describe research on genomic mapping, sequencing, structural analysis, and related topics. Genomics, the first journal to publish papers on genomics research, was born in 1986. In the past decade, the concept of genomics has been widely accepted by scientists engaged in biological research. Meanwhile, the scope of genomics research has expanded continuously, from simple gene mapping and sequencing to functional genomics. To reflect this change, genomics is now divided into two parts: structural genomics and functional genomics.

  10. gmos: Rapid Detection of Genome Mosaicism over Short Evolutionary Distances.

    Science.gov (United States)

    Domazet-Lošo, Mirjana; Domazet-Lošo, Tomislav

    2016-01-01

    Prokaryotic and viral genomes are often altered by recombination and horizontal gene transfer. The existing methods for detecting recombination are primarily aimed at viral genomes or sets of loci, since the expensive computation of underlying statistical models often hinders the comparison of complete prokaryotic genomes. As an alternative, alignment-free solutions are more efficient, but cannot map (align) a query to subject genomes. To address this problem, we have developed gmos (Genome MOsaic Structure), a new program that determines the mosaic structure of query genomes when compared to a set of closely related subject genomes. The program first computes local alignments between query and subject genomes and then reconstructs the query mosaic structure by choosing the best local alignment for each query region. To accomplish the analysis quickly, the program mostly relies on pairwise alignments and constructs multiple sequence alignments over short overlapping subject regions only when necessary. This fine-tuned implementation achieves an efficiency comparable to an alignment-free tool. The program performs well for simulated and real data sets of closely related genomes and can be used for fast recombination detection; for instance, when a new prokaryotic pathogen is discovered. As an example, gmos was used to detect genome mosaicism in a pathogenic Enterococcus faecium strain compared to seven closely related genomes. The analysis took less than two minutes on a single 2.1 GHz processor. The output is available in fasta format and can be visualized using an accessory program, gmosDraw (freely available with gmos).
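    The selection step described above (keeping the best local alignment for each query region) can be illustrated with a minimal Python sketch. This is not the gmos implementation: the `LocalAlignment` record, the `mosaic_structure` function, and the per-position "highest score wins" rule are all assumptions made for illustration, and the alignments themselves are taken as given input rather than computed.

```python
from typing import List, NamedTuple, Optional, Tuple


class LocalAlignment(NamedTuple):
    """Hypothetical record for one query-vs-subject local alignment."""
    q_start: int   # 0-based, inclusive start on the query
    q_end: int     # exclusive end on the query
    subject: str   # name of the subject genome
    score: float   # alignment score (higher is better)


def mosaic_structure(
    query_len: int, alignments: List[LocalAlignment]
) -> List[Tuple[int, int, Optional[str]]]:
    """Assign each query position to the subject of its best-scoring alignment,
    then merge neighbouring positions with the same subject into segments."""
    best: List[Optional[LocalAlignment]] = [None] * query_len
    for aln in alignments:
        for pos in range(max(0, aln.q_start), min(query_len, aln.q_end)):
            current = best[pos]
            if current is None or aln.score > current.score:
                best[pos] = aln

    segments: List[Tuple[int, int, Optional[str]]] = []
    for pos, aln in enumerate(best):
        label = aln.subject if aln is not None else None
        if segments and segments[-1][2] == label:
            # Extend the open segment by one position.
            segments[-1] = (segments[-1][0], pos + 1, label)
        else:
            segments.append((pos, pos + 1, label))
    return segments


if __name__ == "__main__":
    alns = [
        LocalAlignment(0, 60, "subjectA", 50.0),
        LocalAlignment(40, 100, "subjectB", 80.0),
    ]
    for start, end, subject in mosaic_structure(110, alns):
        print(start, end, subject or "unaligned")
```

    Positions covered by no alignment are reported with a `None` label, which a caller could treat as a candidate novel region.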

  11. Teaching strategies to incorporate genomics education into academic nursing curricula.

    Science.gov (United States)

    Quevedo Garcia, Sylvia P; Greco, Karen E; Loescher, Lois J

    2011-11-01

    The translation of genomic science into health care has expanded our ability to understand the effects of genomics on human health and disease. As genomic advances continue, nurses are expected to have the knowledge and skills to translate genomic information into improved patient care. This integrative review describes strategies used to teach genomics in academic nursing programs and their facilitators and barriers to inclusion in nursing curricula. The Learning Engagement Model and the Diffusion of Innovations Theory guided the interpretation of findings. CINAHL, Medline, and Web of Science were resources for articles published during the past decade that included strategies for teaching genomics in academic nursing programs. Of 135 articles, 13 met criteria for review. Examples of effective genomics teaching strategies included clinical application through case studies, storytelling, online genomics resources, student self-assessment, guest lecturers, and a genetics focus group. Most strategies were not evaluated for effectiveness.

  12. Potential assessment of genome-wide association study and genomic selection in Japanese pear Pyrus pyrifolia

    OpenAIRE

    Iwata, Hiroyoshi; Hayashi, Takeshi; Terakami, Shingo; Takada, Norio; Sawamura, Yutaka; Yamamoto, Toshiya

    2013-01-01

    Although the potential of marker-assisted selection (MAS) in fruit tree breeding has been reported, bi-parental QTL mapping before MAS has hindered the introduction of MAS to fruit tree breeding programs. Genome-wide association studies (GWAS) are an alternative to bi-parental QTL mapping in long-lived perennials. Selection based on genomic predictions of breeding values (genomic selection: GS) is another alternative for MAS. This study examined the potential of GWAS and GS in pear breeding w...

  13. Developing Breast Cancer Program at Xavier; Genomic and Proteomic Analysis of Signaling Pathways Involved in Xenohormone and MEK5 Regulation of Breast Cancer

    Science.gov (United States)

    2007-05-01

    focus on breast and prostate cancer. The two additional XU faculty involved will develop a mini-proposal in Y1-2 and carry out pilot studies with...department. Dr. Wiese is now PI of both the Xavier DOD Breast Cancer and the Prostate Cancer programs as well as manager of the new NCI P20 grant at...potentially involved in differences in the MEK5 and VEC cells suggested a role for metalloproteinases and Cox2. Using Western blot analysis we demonstrate

  14. The Oxytricha trifallax macronuclear genome: a complex eukaryotic genome with 16,000 tiny chromosomes.

    Directory of Open Access Journals (Sweden)

    Estienne C Swart

    The macronuclear genome of the ciliate Oxytricha trifallax displays an extreme and unique eukaryotic genome architecture with extensive genomic variation. During sexual genome development, the expressed, somatic macronuclear genome is whittled down to the genic portion of a small fraction (∼5%) of its precursor "silent" germline micronuclear genome by a process of "unscrambling" and fragmentation. The tiny macronuclear "nanochromosomes" typically encode single, protein-coding genes (a small portion, 10%, encode 2-8 genes), have minimal noncoding regions, and are differentially amplified to an average of ∼2,000 copies. We report the high-quality genome assembly of ∼16,000 complete nanochromosomes (∼50 Mb haploid genome size) that vary from 469 bp to 66 kb long (mean ∼3.2 kb) and encode ∼18,500 genes. Alternative DNA fragmentation processes ∼10% of the nanochromosomes into multiple isoforms that usually encode complete genes. Nucleotide diversity in the macronucleus is very high (SNP heterozygosity is ∼4.0%), suggesting that Oxytricha trifallax may have one of the largest known effective population sizes of eukaryotes. Comparison to other ciliates with nonscrambled genomes and long macronuclear chromosomes (on the order of 100 kb) suggests several candidate proteins that could be involved in genome rearrangement, including domesticated MULE and IS1595-like DDE transposases. The assembly of the highly fragmented Oxytricha macronuclear genome is the first completed genome with such an unusual architecture. This genome sequence provides tantalizing glimpses into novel molecular biology and evolution. For example, Oxytricha maintains tens of millions of telomeres per cell and has also evolved an intriguing expansion of telomere end-binding proteins. In conjunction with the micronuclear genome in progress, the O. trifallax macronuclear genome will provide an invaluable resource for investigating programmed genome rearrangements, complementing

  15. PopGenome: an efficient Swiss army knife for population genomic analyses in R.

    Science.gov (United States)

    Pfeifer, Bastian; Wittelsbürger, Ulrich; Ramos-Onsins, Sebastian E; Lercher, Martin J

    2014-07-01

    Although many computer programs can perform population genetics calculations, they are typically limited in the analyses and data input formats they offer; few applications can process the large data sets produced by whole-genome resequencing projects. Furthermore, there is no coherent framework for the easy integration of new statistics into existing pipelines, hindering the development and application of new population genetics and genomics approaches. Here, we present PopGenome, a population genomics package for the R software environment (a de facto standard for statistical analyses). PopGenome can efficiently process genome-scale data as well as large sets of individual loci. It reads DNA alignments and single-nucleotide polymorphism (SNP) data sets in most common formats, including those used by the HapMap, 1000 human genomes, and 1001 Arabidopsis genomes projects. PopGenome also reads associated annotation files in GFF format, enabling users to easily define regions or classify SNPs based on their annotation; all analyses can also be applied to sliding windows. PopGenome offers a wide range of diverse population genetics analyses, including neutrality tests as well as statistics for population differentiation, linkage disequilibrium, and recombination. PopGenome is linked to Hudson's MS and Ewing's MSMS programs to assess statistical significance based on coalescent simulations. PopGenome's integration in R facilitates effortless and reproducible downstream analyses as well as the production of publication-quality graphics. Developers can easily incorporate new analysis methods into the PopGenome framework. PopGenome and R are freely available from CRAN (http://cran.r-project.org/) for all major operating systems under the GNU General Public License.
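    To make the idea of applying a statistic to sliding windows concrete, here is a small Python sketch that computes per-site nucleotide diversity (the average proportion of pairwise differences) in windows along an alignment. It does not use or reproduce PopGenome's R interface: the function names, the window parameters, and the choice of statistic are illustrative assumptions, and gaps or missing data are compared like any other character.

```python
from itertools import combinations
from typing import List, Tuple


def nucleotide_diversity(seqs: List[str]) -> float:
    """Average proportion of differing sites over all pairs of aligned sequences."""
    pairs = list(combinations(seqs, 2))
    if not pairs or not seqs[0]:
        return 0.0
    length = len(seqs[0])
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_diffs / (len(pairs) * length)


def sliding_window_diversity(
    seqs: List[str], width: int, step: int
) -> List[Tuple[int, float]]:
    """Return (window start, nucleotide diversity) for each full-width window."""
    length = len(seqs[0])
    results = []
    for start in range(0, length - width + 1, step):
        window = [s[start:start + width] for s in seqs]
        results.append((start, nucleotide_diversity(window)))
    return results


if __name__ == "__main__":
    alignment = ["AAAA", "AAAT", "AATT"]  # toy alignment, equal-length rows
    for start, pi in sliding_window_diversity(alignment, width=2, step=2):
        print(start, round(pi, 4))
```

    Any other per-window statistic could be swapped in for `nucleotide_diversity` without changing the windowing loop.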

  16. Dissection of genomic correlation matrices using multivariate factor analysis in dairy and dual-purpose cattle breeds

    Science.gov (United States)

    SNP effects estimated in genomic selection programs allow for the prediction of direct genomic values (DGV) both at genome-wide and chromosomal level. As a consequence, genome-wide (G_GW) or chromosomal (G_CHR) correlation matrices between genomic predictions for different traits can be calculated. ...

  17. Building International Genomics Collaboration for Global Health Security.

    Science.gov (United States)

    Cui, Helen H; Erkkila, Tracy; Chain, Patrick S G; Vuyisich, Momchilo

    2015-01-01

    Genome science and technologies are transforming life sciences globally in many ways and becoming a highly desirable area for international collaboration to strengthen global health. The Genome Science Program at the Los Alamos National Laboratory is leveraging a long history of expertise in genomics research to assist multiple partner nations in advancing their genomics and bioinformatics capabilities. The capability development objectives focus on providing a molecular genomics-based scientific approach for pathogen detection, characterization, and biosurveillance applications. The general approaches include introduction of basic principles in genomics technologies, training on laboratory methodologies and bioinformatic analysis of resulting data, procurement, and installation of next-generation sequencing instruments, establishing bioinformatics software capabilities, and exploring collaborative applications of the genomics capabilities in public health. Genome centers have been established with public health and research institutions in the Republic of Georgia, Kingdom of Jordan, Uganda, and Gabon; broader collaborations in genomics applications have also been developed with research institutions in many other countries.

  18. Building International Genomics Collaboration for Global Health Security

    Directory of Open Access Journals (Sweden)

    Helen H Cui

    2015-12-01

    Genome science and technologies are transforming life sciences globally in many ways, and becoming a highly desirable area for international collaboration to strengthen global health. The Genome Science Program at the Los Alamos National Laboratory is leveraging a long history of expertise in genomics research to assist multiple partner nations in advancing their genomics and bioinformatics capabilities. The capability development objectives focus on providing a molecular genomics-based scientific approach for pathogen detection, characterization, and biosurveillance applications. The general approaches include introduction of basic principles in genomics technologies, training on laboratory methodologies and bioinformatic analysis of resulting data, procurement and installation of next-generation sequencing instruments, establishing bioinformatics software capabilities, and exploring collaborative applications of the genomics capabilities in public health. Genome centers have been established with public health and research institutions in the Republic of Georgia, Kingdom of Jordan, Uganda, and Gabon; broader collaborations in genomics applications have also been developed with research institutions in many other countries.

  19. Genome cartography: charting the apicomplexan genome.

    Science.gov (United States)

    Kissinger, Jessica C; DeBarry, Jeremy

    2011-08-01

    Genes reside in particular genomic contexts that can be mapped at many levels. Historically, 'genetic maps' were used primarily to locate genes. Recent technological advances in the determination of genome sequences have made the analysis and comparison of whole genomes possible and increasingly tractable. What do we see if we shift our focus from gene content (the 'inventory' of genes contained within a genome) to the composition and organization of a genome? This review examines what has been learned about the evolution of the apicomplexan genome as well as the significance and impact of genomic location on our understanding of the eukaryotic genome and parasite biology. Copyright © 2011 Elsevier Ltd. All rights reserved.

  20. Plant Genome Duplication Database.

    Science.gov (United States)

    Lee, Tae-Ho; Kim, Junah; Robertson, Jon S; Paterson, Andrew H

    2017-01-01

    Genome duplication, widespread in flowering plants, is a driving force in evolution. Genome alignments between/within genomes facilitate identification of homologous regions and individual genes to investigate evolutionary consequences of genome duplication. PGDD (the Plant Genome Duplication Database), a public web service database, provides intra- or interplant genome alignment information. At present, PGDD contains information for 47 plants whose genome sequences have been released. Here, we describe methods for identification and estimation of dates of genome duplication and speciation by functions of PGDD. The database is freely available at http://chibba.agtec.uga.edu/duplication/.

  1. Rodent malaria parasites : genome organization & comparative genomics

    NARCIS (Netherlands)

    Kooij, Taco W.A.

    2006-01-01

    The aim of the studies described in this thesis was to investigate the genome organization of rodent malaria parasites (RMPs) and compare the organization and gene content of the genomes of RMPs and the human malaria parasite P. falciparum. The release of the complete genome sequence of P. falciparu

  2. Rodent malaria parasites : genome organization & comparative genomics

    NARCIS (Netherlands)

    Kooij, Taco W.A.

    2006-01-01

    The aim of the studies described in this thesis was to investigate the genome organization of rodent malaria parasites (RMPs) and compare the organization and gene content of the genomes of RMPs and the human malaria parasite P. falciparum. The release of the complete genome sequence of P.

  3. 75 FR 52537 - National Human Genome Research Institute; Notice of Closed Meeting

    Science.gov (United States)

    2010-08-26

    ... HUMAN SERVICES National Institutes of Health National Human Genome Research Institute; Notice of Closed... of Committee: National Human Genome Research Institute Initial Review Group; Genome Research Review... Domestic Assistance Program Nos. 93.172, Human Genome Research, National Institutes of Health, HHS)...

  4. 75 FR 2148 - National Human Genome Research Institute; Notice of Closed Meeting

    Science.gov (United States)

    2010-01-14

    ... HUMAN SERVICES National Institutes of Health National Human Genome Research Institute; Notice of Closed... of Committee: National Human Genome Research Institute Initial Review Group, Genome Research Review... Domestic Assistance Program Nos. 93.172, Human Genome Research, National Institutes of Health, HHS)...

  5. 76 FR 3643 - National Human Genome Research Institute; Notice of Closed Meeting

    Science.gov (United States)

    2011-01-20

    ... HUMAN SERVICES National Institutes of Health National Human Genome Research Institute; Notice of Closed... of Committee: National Human Genome Research Institute Initial Review Group; Genome Research Review... Assistance Program Nos. 93.172, Human Genome Research, National Institutes of Health, HHS) Dated: January...

  6. Recent Developments of Genomic Research in Soybean

    Institute of Scientific and Technical Information of China (English)

    Ching Chan; Xinpeng Qi; Man-Wah Li; Fuk-Ling Wong; Hon-Ming Lam

    2012-01-01

    Soybean is an important cash crop with unique and important traits such as high seed protein and oil contents, and the ability to perform symbiotic nitrogen fixation. A reference genome of cultivated soybeans was established in 2010, followed by whole-genome re-sequencing of wild and cultivated soybean accessions. These efforts revealed unique features of the soybean genome and helped to understand its evolution. Mapping of variations between wild and cultivated soybean genomes was performed. These genomic variations may be related to the process of domestication and human selection. Wild soybean germplasms exhibited high genomic diversity and hence may be an important source of novel genes/alleles. Accumulation of genomic data will help to refine genetic maps and expedite the identification of functional genes. In this review, we summarize the major findings from the whole-genome sequencing projects and discuss the possible impacts on soybean research and breeding programs. Some emerging areas such as transcriptomic and epigenomic studies will be introduced. In addition, we also tabulated some useful bioinformatics tools that will help the mining of the soybean genomic data.

  7. Funding Opportunity: Genomic Data Centers

    Science.gov (United States)

    Funding Opportunity CCG, Funding Opportunity Center for Cancer Genomics, CCG, Center for Cancer Genomics, CCG RFA, Center for cancer genomics rfa, genomic data analysis network, genomic data analysis network centers,

  8. Genome Mapping in Plant Comparative Genomics.

    Science.gov (United States)

    Chaney, Lindsay; Sharp, Aaron R; Evans, Carrie R; Udall, Joshua A

    2016-09-01

    Genome mapping produces fingerprints of DNA sequences to construct a physical map of the whole genome. It provides contiguous, long-range information that complements and, in some cases, replaces sequencing data. Recent advances in genome-mapping technology will better allow researchers to detect large (>1kbp) structural variations between plant genomes. Some molecular and informatics complications need to be overcome for this novel technology to achieve its full utility. This technology will be useful for understanding phenotype responses due to DNA rearrangements and will yield insights into genome evolution, particularly in polyploids. In this review, we outline recent advances in genome-mapping technology, including the processes required for data collection and analysis, and applications in plant comparative genomics.

  9. Enabling functional genomics with genome engineering.

    Science.gov (United States)

    Hilton, Isaac B; Gersbach, Charles A

    2015-10-01

    Advances in genome engineering technologies have made the precise control over genome sequence and regulation possible across a variety of disciplines. These tools can expand our understanding of fundamental biological processes and create new opportunities for therapeutic designs. The rapid evolution of these methods has also catalyzed a new era of genomics that includes multiple approaches to functionally characterize and manipulate the regulation of genomic information. Here, we review the recent advances of the most widely adopted genome engineering platforms and their application to functional genomics. This includes engineered zinc finger proteins, TALEs/TALENs, and the CRISPR/Cas9 system as nucleases for genome editing, transcription factors for epigenome editing, and other emerging applications. We also present current and potential future applications of these tools, as well as their current limitations and areas for future advances.

  10. CONTIGuator: a bacterial genomes finishing tool for structural insights on draft genomes

    Directory of Open Access Journals (Sweden)

    Bazzicalupo Marco

    2011-06-01

    Recent developments in sequencing technologies have given the opportunity to sequence many bacterial genomes with limited cost and labor, compared to previous techniques. However, a limiting step of genome sequencing is the finishing process, needed to infer the relative position of each contig and close sequencing gaps. An additional degree of complexity is given by bacterial species harboring more than one replicon, which are not contemplated by the currently available programs. The availability of a large number of bacterial genomes allows geneticists to use complete genomes (possibly from the same species) as templates for contig mapping. Here we present CONTIGuator, a software tool for contig mapping over a reference genome which allows the visualization of a map of contigs, underlining loss and/or gain of genetic elements and permitting the finishing of multipartite genomes. The functionality of CONTIGuator was tested using four genomes, demonstrating its improved performance compared to currently available programs. Our approach appears efficient, with a clear visualization, allowing the user to perform comparative structural genomics analysis on draft genomes. CONTIGuator is a Python script for Linux environments that can be used on normal desktop machines and can be downloaded from http://contiguator.sourceforge.net.

  11. Exploring Other Genomes: Bacteria.

    Science.gov (United States)

    Flannery, Maura C.

    2001-01-01

    Points out the importance of genomes other than the human genome project and provides information on the identified bacterial genomes Pseudomonas aeruginosa, Leprosy, Cholera, Meningitis, Tuberculosis, Bubonic Plague, and plant pathogens. Considers the computer's use in genome studies. (Contains 14 references.) (YDS)

  12. Genome-wide mRNA expression analysis of hepatic adaptation to high-fat diets reveals switch from an inflammatory to steatotic transcriptional program.

    Directory of Open Access Journals (Sweden)

    Marijana Radonjic

    BACKGROUND: Excessive exposure to dietary fats is an important factor in the initiation of obesity and metabolic syndrome associated pathologies. The cellular processes associated with the onset and progression of diet-induced metabolic syndrome are insufficiently understood. PRINCIPAL FINDINGS: To identify the mechanisms underlying the pathological changes associated with short and long-term exposure to excess dietary fat, hepatic gene expression of ApoE3Leiden mice fed chow and two types of high-fat (HF) diets was monitored using microarrays during a 16-week period. A functional characterization of 1663 HF-responsive genes reveals perturbations in lipid, cholesterol and oxidative metabolism, immune and inflammatory responses and stress-related pathways. The major changes in gene expression take place during the early (day 3) and late (week 12) phases of HF feeding. This is also associated with characteristic opposite regulation of many HF-affected pathways between these two phases. The most prominent switch occurs in the expression of inflammatory/immune pathways (early activation, late repression) and lipogenic/adipogenic pathways (early repression, late activation). Transcriptional network analysis identifies NF-kappaB, NEMO, Akt, PPARgamma and SREBP1 as the key controllers of these processes and suggests that direct regulatory interactions between these factors may govern the transition from early (stressed, inflammatory) to late (pathological, steatotic) hepatic adaptation to HF feeding. This transition observed by hepatic gene expression analysis is confirmed by expression of inflammatory proteins in plasma and the late increase in hepatic triglyceride content. In addition, the genes most predictive of fat accumulation in liver during 16-week high-fat feeding period are uncovered by regression analysis of hepatic gene expression and triglyceride levels. CONCLUSIONS: The transition from an inflammatory to a steatotic transcriptional program

  13. Structure of the germline genome of Tetrahymena thermophila and relationship to the massively rearranged somatic genome.

    Science.gov (United States)

    Hamilton, Eileen P; Kapusta, Aurélie; Huvos, Piroska E; Bidwell, Shelby L; Zafar, Nikhat; Tang, Haibao; Hadjithomas, Michalis; Krishnakumar, Vivek; Badger, Jonathan H; Caler, Elisabet V; Russ, Carsten; Zeng, Qiandong; Fan, Lin; Levin, Joshua Z; Shea, Terrance; Young, Sarah K; Hegarty, Ryan; Daza, Riza; Gujja, Sharvari; Wortman, Jennifer R; Birren, Bruce W; Nusbaum, Chad; Thomas, Jainy; Carey, Clayton M; Pritham, Ellen J; Feschotte, Cédric; Noto, Tomoko; Mochizuki, Kazufumi; Papazyan, Romeo; Taverna, Sean D; Dear, Paul H; Cassidy-Hanley, Donna M; Xiong, Jie; Miao, Wei; Orias, Eduardo; Coyne, Robert S

    2016-11-28

    The germline genome of the binucleated ciliate Tetrahymena thermophila undergoes programmed chromosome breakage and massive DNA elimination to generate the somatic genome. Here, we present a complete sequence assembly of the germline genome and analyze multiple features of its structure and its relationship to the somatic genome, shedding light on the mechanisms of genome rearrangement as well as the evolutionary history of this remarkable germline/soma differentiation. Our results strengthen the notion that a complex, dynamic, and ongoing interplay between mobile DNA elements and the host genome has shaped Tetrahymena chromosome structure, locally and globally. Non-standard outcomes of rearrangement events, including the generation of short-lived somatic chromosomes and excision of DNA interrupting protein-coding regions, may represent novel forms of developmental gene regulation. We also compare Tetrahymena's germline/soma differentiation to that of other characterized ciliates, illustrating the wide diversity of adaptations that have occurred within this phylum.

  14. Genomic definition of species. Revision 2

    Energy Technology Data Exchange (ETDEWEB)

    Crkvenjakov, R.; Drmanac, R.

    1993-03-01

    A genome is the sum total of the DNA sequences in the cells of an individual organism. The common usage that species possess genomes comes naturally to biochemists, who have shown that all protein and nucleic acid molecules are at the same time species- and individual-specific, with minor individual variations being superimposed on a consensus sequence that is constant for a species. By extension, this property is attributed to the common features of DNA in the chromosomes of members of a given species and is called species genome. Our proposal for the definition of a biological species is as follows: A species comprises a group of actual and potential biological organisms built according to a unique genome program that is recorded, and at least in part expressed, in the structures of their genomic nucleic acid molecule(s), having intragroup sequence differences which can be fully interconverted in the process of organismal reproduction.

  15. Genomic Medicine

    Directory of Open Access Journals (Sweden)

    Ignacio Briceño Balcázar

    2011-04-01

    Until the twilight of the 20th century, genetics was a branch of medicine applied to diseases of rare occurrence. The advent of the human genome sequence and the possibility of studying it at affordable costs for patients and healthcare institutions has permitted its application in high-priority diseases like cancer, cardiovascular disease, diabetes, and Alzheimer’s, among others. There is great potential in predictive and preventive medicine, through studying polymorphic genetic variants associated with risks for different diseases. Currently, clinical laboratories offer studies of over 30,000 variants associated with susceptibilities, which individuals can access without much difficulty because a medical prescription is not required. These exams permit conducting a specific plan of preventive medicine. For example, upon the possibility of finding a deleterious mutation in the BRCA1 and BRCA2 genes, the patient can prevent breast cancer by mastectomy or chemoprophylaxis, and in the presence of polymorphisms associated with cardiovascular risk, preventive action may be undertaken through changes in lifestyle (diet, exercise, etc.). Legal aspects are also present in this new conception of medicine. For example, currently there is legislation for medications to indicate on their labels the different responses such medication can offer regarding the genetic variants of the patients, given that similar doses may provoke adverse reactions in an individual, while for another such dosage may be insufficient. This scenario would allow verifying the polymorphisms of drug response prior to administering medications like anticoagulants, hyperlipidemia treatments, or chemotherapy, among others. We must specially mention recessive diseases, produced by the presence of two alleles of a mutated gene, which are inherited from the mother, as well as the father. By studying the mutations, we may learn if a couple is at risk of bearing children with the

  16. GENOMIC MEDICINE

    Directory of Open Access Journals (Sweden)

    Ignacio Briceño Balcázar

    2011-03-01

    Until the twilight of the 20th century, genetics was a branch of medicine applied to diseases of rare occurrence. The advent of the human genome sequence and the possibility of studying it at affordable costs for patients and healthcare institutions has permitted its application in high-priority diseases like cancer, cardiovascular disease, diabetes, and Alzheimer’s, among others. There is great potential in predictive and preventive medicine, through studying polymorphic genetic variants associated with risks for different diseases. Currently, clinical laboratories offer studies of over 30,000 variants associated with susceptibilities, which individuals can access without much difficulty because a medical prescription is not required. These exams permit conducting a specific plan of preventive medicine. For example, upon the possibility of finding a deleterious mutation in the BRCA1 and BRCA2 genes, the patient can prevent breast cancer by mastectomy or chemoprophylaxis, and in the presence of polymorphisms associated with cardiovascular risk, preventive action may be undertaken through changes in lifestyle (diet, exercise, etc.). Legal aspects are also present in this new conception of medicine. For example, currently there is legislation for medications to indicate on their labels the different responses such medication can offer regarding the genetic variants of the patients, given that similar doses may provoke adverse reactions in an individual, while for another such dosage may be insufficient. This scenario would allow verifying the polymorphisms of drug response prior to administering medications like anticoagulants, hyperlipidemia treatments, or chemotherapy, among others. We must specially mention recessive diseases, produced by the presence of two alleles of a mutated gene, which are inherited from the mother, as well as the father. By studying the mutations, we may learn if a couple is at risk of bearing children with the disease

  17. Between Two Fern Genomes

    OpenAIRE

    Sessa, Emily B.; Banks, Jo; Michael S Barker; Der, Joshua P; Duffy, Aaron M; Graham, Sean W.; Hasebe, Mitsuyasu; Langdale, Jane; Li, Fay-Wei; Marchant, D; Kathleen M. Pryer; Rothfels, Carl J.; Roux, Stanley J.; Salmi, Mari L; Sigel, Erin M.

    2014-01-01

    Ferns are the only major lineage of vascular plants not represented by a sequenced nuclear genome. This lack of genome sequence information significantly impedes our ability to understand and reconstruct genome evolution not only in ferns, but across all land plants. Azolla and Ceratopteris are ideal and complementary candidates to be the first ferns to have their nuclear genomes sequenced. They differ dramatically in genome size, life history, and habit, and thus represent the immense divers...

  18. Genomes and evolutionary genomics of animals

    Institute of Scientific and Technical Information of China (English)

    Luting SONG; Wen WANG

    2013-01-01

    Alongside recent advances and booming applications of DNA sequencing technologies, a great number of complete genome sequences for animal species are available to researchers. Hundreds of animals have been involved in whole genome sequencing, and at least 87 non-human animal species' complete or draft genome sequences have been published since 1998. Based on these technological advances and the subsequent accumulation of large quantities of genomic data, evolutionary genomics has become one of the most rapidly advancing disciplines in biology. Scientists now can perform a number of comparative and evolutionary genomic studies for animals, to identify conserved genes or other functional elements among species, genomic elements that confer animals their own specific characteristics and new phenotypes for adaptation. This review deals with the current genomic and evolutionary research on non-human animals, and displays a comprehensive landscape of genomes and the evolutionary genomics of non-human animals. It is very helpful to a better understanding of the biology and evolution of the myriad forms within the animal kingdom [Current Zoology 59 (1): 87-98, 2013].

  19. MycoCosm, an Integrated Fungal Genomics Resource

    Energy Technology Data Exchange (ETDEWEB)

    Shabalov, Igor; Grigoriev, Igor

    2012-03-16

    MycoCosm is a web-based interactive fungal genomics resource, which was first released in March 2010, in response to an urgent call from the fungal community for integration of all fungal genomes and analytical tools in one place (Pan-fungal data resources meeting, Feb 21-22, 2010, Alexandria, VA). MycoCosm integrates genomics data and analysis tools to navigate through over 100 fungal genomes sequenced at JGI and elsewhere. This resource allows users to explore fungal genomes in the context of both genome-centric analysis and comparative genomics, and promotes user community participation in data submission, annotation and analysis. MycoCosm has over 4500 unique visitors/month or 35000+ visitors/year as well as hundreds of registered users contributing their data and expertise to this resource. Its scalable architecture allows significant expansion of the data expected from JGI Fungal Genomics Program, its users, and integration with external resources used by fungal community.

  20. Coping with antibiotic resistance: contributions from genomics

    OpenAIRE

    Rossolini, G.; Thaller, M.

    2010-01-01

    Antibiotic resistance is a public health issue of global dimensions with a significant impact on morbidity, mortality and healthcare-associated costs. The problem has recently been worsened by the steady increase in multiresistant strains and by the restriction of antibiotic discovery and development programs. Recent advances in the field of bacterial genomics will further current knowledge on antibiotic resistance and help to tackle the problem. Bacterial genomics and transcriptomics can inf...

  1. Genome bioinformatics of tomato and potato

    OpenAIRE

    E Datema

    2011-01-01

    In the past two decades genome sequencing has developed from a laborious and costly technology employed by large international consortia to a widely used, automated and affordable tool used worldwide by many individual research groups. Genome sequences of many food animals and crop plants have been deciphered and are being exploited for fundamental research and applied to improve their breeding programs. The developments in sequencing technologies have also impacted the associated bioinformat...

  2. G-InforBIO: integrated system for microbial genomics

    Directory of Open Access Journals (Sweden)

    Abe Takashi

    2006-08-01

    Background: Genome databases contain diverse kinds of information, including gene annotations and nucleotide and amino acid sequences. It is not easy to integrate such information for genomic study. There are few tools for integrated analyses of genomic data; therefore, we developed software that enables users to handle, manipulate, and analyze genome data with a variety of sequence analysis programs. Results: The G-InforBIO system is a novel tool for genome data management and sequence analysis. The system can import genome data encoded as eXtensible Markup Language documents as formatted text documents, including annotations and sequences, from DNA Data Bank of Japan and GenBank encoded as flat files. The genome database is constructed automatically after importing, and the database can be exported as documents formatted with eXtensible Markup Language or tab-delimited text. Users can retrieve data from the database by keyword searches, edit annotation data of genes, and process data with G-InforBIO. In addition, information in the G-InforBIO database can be analyzed seamlessly with nine different software programs, including programs for clustering and homology analyses. Conclusion: The G-InforBIO system simplifies genome analyses by integrating several available software programs to allow efficient handling and manipulation of genome data. G-InforBIO is freely available from the download site.

  3. Using CAVE technology for functional genomics studies.

    Science.gov (United States)

    Sensen, Christoph W

    2002-01-01

    We have established the first Java 3D-enabled CAVE (CAVE automated virtual environment). The Java application programming interface allows the complete separation of the program development from the program execution, opening new application domains for the CAVE technology. Programs can be developed on any Java-enabled computer platform, including Windows, Macintosh, and Linux workstations, and executed in the CAVE without modification. The introduction of Java, one of the major programming environments for bioinformatics, into the CAVE environment allows the rapid development of applications for genome research, especially for the analysis of the spatial and temporal data that are being produced by functional genomics experiments. The CAVE technology will play a major role in the modeling of biological systems that is necessary to understand how these systems are organized and how they function.

  4. Comparative Genome Analysis and Genome Evolution

    NARCIS (Netherlands)

    Snel, Berend

    2002-01-01

    This thesis described a collection of bioinformatic analyses on complete genome sequence data. We have studied the evolution of gene content and find that vertical inheritance dominates over horizontal gene transfer, even to the extent that we can use the gene content to make genome phylogenies. Usi

  5. Comparative Genome Analysis and Genome Evolution

    NARCIS (Netherlands)

    Snel, Berend

    2003-01-01

    This thesis described a collection of bioinformatic analyses on complete genome sequence data. We have studied the evolution of gene content and find that vertical inheritance dominates over horizontal gene transfer, even to the extent that we can use the gene content to make genome phylogenies. Usi

  6. Directed genome engineering for genome optimization.

    Science.gov (United States)

    D'Halluin, Kathleen; Ruiter, Rene

    2013-01-01

    The ability to develop nucleases with tailor-made activities for targeted DNA double-strand break induction at will at any desired position in the genome has been a major breakthrough to make targeted genome optimization feasible in plants. The development of site-specific nucleases for precise genome modification has expanded the repertoire of tools for the development and optimization of traits, already including mutation breeding, molecular breeding and transgenesis. Through directed genome engineering technology, the huge amount of information provided by genomics and systems biology can now more effectively be used for the creation of plants with improved or new traits, and for the dissection of gene functions. Although still in an early phase of deployment, its utility has been demonstrated for engineering disease resistance, herbicide tolerance, altered metabolite profiles, and for molecular trait stacking to allow linked transmission of transgenes. In this article, we will briefly review the different approaches for directed genome engineering with the emphasis on double-strand break (DSB)-mediated engineering towards genome optimization for crop improvement and towards the acceleration of functional genomics.

  7. Genomic Data Commons | Office of Cancer Genomics

    Science.gov (United States)

    The NCI’s Center for Cancer Genomics launches the Genomic Data Commons (GDC), a unified data sharing platform for the cancer research community. The mission of the GDC is to enable data sharing across the entire cancer research community, to ultimately support precision medicine in oncology.

  8. Modulating the Genomic Programming of Adipocytes

    DEFF Research Database (Denmark)

    Loft, Anne; Schmidt, Søren Fisker; Mandrup, Susanne

    2015-01-01

    , an antidiabetic agonist of the key adipocyte transcription factor peroxisome proliferator-activated receptor γ (PPARγ), involves redistribution of PPARγ binding to form browning-selective PPARγ super-enhancers that drive expression of key browning genes. These include genes encoding transcriptional regulators...

  9. Rat Genome Database (RGD)

    Data.gov (United States)

    U.S. Department of Health & Human Services — The Rat Genome Database (RGD) is a collaborative effort between leading research institutions involved in rat genetic and genomic research to collect, consolidate,...

  10. Genomic Data Commons launches

    Science.gov (United States)

    The Genomic Data Commons (GDC), a unified data system that promotes sharing of genomic and clinical data between researchers, launched today with a visit from Vice President Joe Biden to the operations center at the University of Chicago.

  11. Genomics of Sorghum

    OpenAIRE

    PATERSON, ANDREW H

    2008-01-01

    Sorghum (Sorghum bicolor (L.) Moench) is a subject of plant genomics research based on its importance as one of the world's leading cereal crops, a biofuels crop of high and growing importance, a progenitor of one of the world's most noxious weeds, and a botanical model for many tropical grasses with complex genomes. A rich history of genome analysis, culminating in the recent complete sequencing of the genome of a leading inbred, provides a foundation for invigorating progress toward relatin...

  12. National Human Genome Research Institute

    Science.gov (United States)

    ... the Director Organization Reports & Publications Español The National Human Genome Research Institute conducts genetic and genomic research, funds ... Landscape Social Media Videos Image Gallery Fact Sheets Human Genome Project Clinical Studies Genomic Careers DNA Day Calendar ...

  13. Ebolavirus comparative genomics

    DEFF Research Database (Denmark)

    Jun, Se-Ran; Leuze, Michael R.; Nookaew, Intawat

    2015-01-01

    The 2014 Ebola outbreak in West Africa is the largest documented for this virus. To examine the dynamics of this genome, we compare more than 100 currently available ebolavirus genomes to each other and to other viral genomes. Based on oligomer frequency analysis, the family Filoviridae forms...

  14. Chicken's Genome Decoded

    Institute of Scientific and Technical Information of China (English)

    2005-01-01

    After completing the work on mapping chicken genome sequence and chicken genome variation in early March, 2004, two international research consortiums have made significant progress in reading the maps, shedding new light on the studies into the first bird as well as the first agricultural animal that has its genome sequenced and analyzed in the world.

  15. Genomic Prediction in Barley

    DEFF Research Database (Denmark)

    Edriss, Vahid; Cericola, Fabio; Jensen, Jens D;

    Genomic prediction uses markers (SNPs) across the whole genome to predict individual breeding values at an early growth stage potentially before large scale phenotyping. One of the applications of genomic prediction in plant breeding is to identify the best individual candidate lines to contribut...

  16. Genomic Prediction in Barley

    DEFF Research Database (Denmark)

    Edriss, Vahid; Cericola, Fabio; Jensen, Jens D;

    2015-01-01

    Genomic prediction uses markers (SNPs) across the whole genome to predict individual breeding values at an early growth stage potentially before large scale phenotyping. One of the applications of genomic prediction in plant breeding is to identify the best individual candidate lines to contribut...

  17. Assembly complexity of prokaryotic genomes using short reads

    Directory of Open Access Journals (Sweden)

    Pop Mihai

    2010-01-01

    Background: De Bruijn graphs are a theoretical framework underlying several modern genome assembly programs, especially those that deal with very short reads. We describe an application of de Bruijn graphs to analyze the global repeat structure of prokaryotic genomes. Results: We provide the first survey of the repeat structure of a large number of genomes. The analysis gives an upper bound on the performance of genome assemblers for de novo reconstruction of genomes across a wide range of read lengths. Further, we demonstrate that the majority of genes in prokaryotic genomes can be reconstructed uniquely using very short reads even if the genomes themselves cannot. The non-reconstructible genes are overwhelmingly related to mobile elements (transposons, IS elements, and prophages). Conclusions: Our results improve upon previous studies on the feasibility of assembly with short reads and provide a comprehensive benchmark against which to compare the performance of the short-read assemblers currently being developed.
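
    As a rough illustration of the idea in this abstract, the sketch below builds a small de Bruijn graph from reads and lists the nodes with more than one successor, which is one common sign of a repeat. It is not the authors' software; the function names, the toy reads, and the choice of k are all assumptions made for the example.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Build a de Bruijn graph: nodes are (k-1)-mers, edges are distinct k-mers."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Each k-mer links its (k-1)-mer prefix to its (k-1)-mer suffix.
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def branching_nodes(graph):
    """Return the nodes that have more than one distinct successor."""
    return sorted(node for node, successors in graph.items() if len(successors) > 1)


# Toy reads (hypothetical) from a sequence in which "AGCT" occurs twice.
reads = ["AAGCTTAG", "CTTAGCTC"]
graph = de_bruijn_graph(reads, k=4)
print(branching_nodes(graph))
```

    In this toy case the node "GCT" is followed by both "CTT" and "CTC", so a walk through the graph cannot tell from 4-mers alone which continuation belongs to which copy of the repeat. Longer reads (a larger k) would resolve it.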

  18. 10. international mouse genome conference

    Energy Technology Data Exchange (ETDEWEB)

    Meisler, M.H.

    1996-12-31

    Ten years after hosting the First International Mammalian Genome Conference in Paris in 1986, Dr. Jean-Louis Guenet presided over the Tenth Conference at the Pasteur Institute, October 7--10, 1996. The 1986 conference was a satellite to the Human Gene Mapping Workshop and had approximately 50 attendees. The 1996 meeting was attended by 300 scientists from around the world. In the interim, the number of mapped loci in the mouse increased from 1,000 to over 20,000. This report contains a listing of the program and its participants, and two articles that review the meeting and the role of the laboratory mouse in the Human Genome Project. More than 200 papers were presented at the conference covering the following topics: International mouse chromosome committee meetings; Mutant generation and identification; Physical and genetic maps; New technology and resources; Chromatin structure and gene regulation; Rat and hamster genetic maps; Informatics and databases; and Quantitative trait analysis.

  19. [Genomics and functional genomics in microbiology].

    Science.gov (United States)

    Encarnación-Guevara, Sergio

    2006-01-01

    Functional genomics is changing our understanding of biology and our approach to biological research. It brings together concerted, high-throughput genetics with analyses of gene transcripts, proteins, and metabolites to answer the ultimate question posed by all genome-sequencing projects: what is the biological function of each and every gene? Functional genomics is stimulating a change in the research paradigm away from the analysis of single genes, proteins, or metabolites towards the analysis of each of these parameters on a global scale. By identifying and measuring several, if not all, of the molecular actors that take part in a given biological process, functional genomics offers the prospect of obtaining a truly holistic representation of life. Functional genomics is defined by high-throughput methods that are not necessarily hypothesis-dependent. They offer insights into mRNA expression, protein expression, protein localization, and protein interactions and may cast light on the flow of information within signaling pathways. At its beginning, biology involved observing nature and experimenting on its isolated parts. Genomic research now generates new types of complex observational data derived from nature. This review describes the tools that are currently being used for functional genomics work and considers the impact of this new discipline on microbiology research.

  20. Skittle: A 2-Dimensional Genome Visualization Tool

    Directory of Open Access Journals (Sweden)

    Sanford John C

    2009-12-01

    Background: It is increasingly evident that there are multiple and overlapping patterns within the genome, and that these patterns contain different types of information - regarding both genome function and genome history. In order to discover additional genomic patterns which may have biological significance, novel strategies are required. To partially address this need, we introduce a new data visualization tool entitled Skittle. Results: This program first creates a 2-dimensional nucleotide display by assigning four colors to the four nucleotides, and then text-wraps to a user adjustable width. This nucleotide display is accompanied by a "repeat map" which comprehensively displays all local repeating units, based upon analysis of all possible local alignments. Skittle includes a smooth-zooming interface which allows the user to analyze genomic patterns at any scale. Skittle is especially useful in identifying and analyzing tandem repeats, including repeats not normally detectable by other methods. However, Skittle is also more generally useful for analysis of any genomic data, allowing users to correlate published annotations and observable visual patterns, and allowing for sequence and construct quality control. Conclusions: Preliminary observations using Skittle reveal intriguing genomic patterns not otherwise obvious, including structured variations inside tandem repeats. The striking visual patterns revealed by Skittle appear to be useful for hypothesis development, and have already led the authors to theorize that imperfect tandem repeats could act as information carriers, and may form tertiary structures within the interphase nucleus.
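
    The text-wrapping step described in this abstract can be sketched in a few lines. This is only an illustration of the general idea, not Skittle's code: the function names and the color palette are hypothetical, and the actual tool's colors and rendering are not given here.

```python
# Hypothetical palette; the colors Skittle actually uses are not stated in the abstract.
COLORS = {"A": "black", "C": "red", "G": "green", "T": "blue"}


def wrap_sequence(seq, width):
    """Split a sequence into consecutive rows of at most `width` characters."""
    return [seq[i:i + width] for i in range(0, len(seq), width)]


def color_grid(seq, width):
    """Map each nucleotide of the wrapped sequence to a color name."""
    rows = wrap_sequence(seq.upper(), width)
    # Anything other than A, C, G, T (for example N) is shown as gray.
    return [[COLORS.get(base, "gray") for base in row] for row in rows]


# A tandem repeat with period 4 lines up in columns when the width is 4.
for row in wrap_sequence("ACGTACGTACGT", 4):
    print(row)
```

    When the wrap width matches the repeat period, identical bases stack vertically and the repeat shows up as colored columns; at a width of 5 the same sequence produces a diagonal drift instead. That is why an adjustable width is useful for spotting tandem repeats by eye.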

  1. Genomic taxonomy of vibrios

    DEFF Research Database (Denmark)

    Thompson, Cristiane C.; Vicente, Ana Carolina P.; Souza, Rangel C.

    2009-01-01

    . RESULTS: We have generated four new genome sequences of three Vibrio species, i.e., V. alginolyticus 40B, V. harveyi-like 1DA3, and V. mimicus strains VM573 and VM603, and present a broad analysis of these genomes along with other sequenced Vibrio species. The genome atlas and pangenome plots provide...... a tantalizing image of the genomic differences that occur between closely related sister species, e.g. V. cholerae and V. mimicus. The vibrio pangenome contains around 26504 genes. The V. cholerae core genome and pangenome consist of 1520 and 6923 genes, respectively. Pangenomes might allow different strains...

  2. Genome organization of the SARS-CoV

    DEFF Research Database (Denmark)

    Xu, Jing; Hu, Jianfei; Wang, Jing;

    2003-01-01

    Annotation of the genome sequence of the SARS-CoV (severe acute respiratory syndrome-associated coronavirus) is indispensable to understand its evolution and pathogenesis. We have performed a full annotation of the SARS-CoV genome sequences by using annotation programs publicly available or devel...

  3. Combining SNPs in latent variables to improve genomic prediction

    DEFF Research Database (Denmark)

    Heuven, Henri C M; Rosa, G J M; Janss, Luc

    The objective of this study was to develop and test hierarchical genomic models with latent variables that represent parts of the genomic values. An interaction model and a chromosome model were compared with a model based on variable selection in a simulated and real dataset. The program Bayz......: Hierarchical genetic model; Predictive value; Gibbs sampling; Variable selection....

  4. The life cycle of a genome project: perspectives and guidelines inspired by insect genome projects [version 1; referees: 2 approved, 1 approved with reservations]

    Directory of Open Access Journals (Sweden)

    Alexie Papanicolaou

    2016-01-01

    Many research programs on non-model species biology have been empowered by genomics. In turn, genomics is underpinned by a reference sequence and ancillary information created by so-called “genome projects”. The most reliable genome projects are the ones created as part of an active research program and designed to address specific questions, but their life extends past publication. In this opinion paper I outline four key insights that have facilitated maintaining genomic communities: the key role of computational capability, the iterative process of building genomic resources, the value of community participation and the importance of manual curation. Taken together, these ideas can and do ensure the longevity of genome projects, and the growing non-model species community can use them to focus a discussion with regard to its future genomic infrastructure.

  5. Microbial genomic taxonomy.

    Science.gov (United States)

    Thompson, Cristiane C; Chimetto, Luciane; Edwards, Robert A; Swings, Jean; Stackebrandt, Erko; Thompson, Fabiano L

    2013-12-23

    A need for a genomic species definition is emerging from several independent studies worldwide. In this commentary paper, we discuss recent studies on the genomic taxonomy of diverse microbial groups and a unified species definition based on genomics. Accordingly, strains from the same microbial species share >95% Average Amino Acid Identity (AAI) and Average Nucleotide Identity (ANI), >95% identity based on multiple alignment genes, genomic signature, and > 70% in silico Genome-to-Genome Hybridization similarity (GGDH). Species of the same genus will form monophyletic groups on the basis of 16S rRNA gene sequences, Multilocus Sequence Analysis (MLSA) and supertree analysis. In addition to the established requirements for species descriptions, we propose that new taxa descriptions should also include at least a draft genome sequence of the type strain in order to obtain a clear outlook on the genomic landscape of the novel microbe. The application of the new genomic species definition put forward here will allow researchers to use genome sequences to define simultaneously coherent phenotypic and genomic groups.
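
    The numeric cut-offs quoted in this abstract can be written as a simple check. The sketch below is an assumption-laden illustration, not the authors' procedure: the function name and inputs are hypothetical, it covers only the AAI, ANI and GGDH thresholds, and it leaves out the multiple-alignment and genomic-signature criteria that the abstract also lists.

```python
def same_species(aai, ani, ggdh):
    """Check two strains against the quoted thresholds (all values in percent).

    Covers only three of the criteria named in the abstract:
    AAI > 95, ANI > 95 and in silico GGDH > 70.
    """
    return aai > 95.0 and ani > 95.0 and ggdh > 70.0


# Hypothetical pairwise values for two strain comparisons.
print(same_species(aai=97.2, ani=96.5, ggdh=82.0))
print(same_species(aai=97.2, ani=94.0, ggdh=82.0))
```

    The comparisons are strict because the abstract states the thresholds as "greater than"; a pair sitting exactly at 95% would not pass this check.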

  6. UCSC genome browser tutorial.

    Science.gov (United States)

    Zweig, Ann S; Karolchik, Donna; Kuhn, Robert M; Haussler, David; Kent, W James

    2008-08-01

    The University of California Santa Cruz (UCSC) Genome Bioinformatics website consists of a suite of free, open-source, on-line tools that can be used to browse, analyze, and query genomic data. These tools are available to anyone who has an Internet browser and an interest in genomics. The website provides a quick and easy-to-use visual display of genomic data. It places annotation tracks beneath genome coordinate positions, allowing rapid visual correlation of different types of information. Many of the annotation tracks are submitted by scientists worldwide; the others are computed by the UCSC Genome Bioinformatics group from publicly available sequence data. It also allows users to upload and display their own experimental results or annotation sets by creating a custom track. The suite of tools, downloadable data files, and links to documentation and other information can be found at http://genome.ucsc.edu/.

  7. 2012 U.S. Department of Energy: Joint Genome Institute: Progress Report

    Energy Technology Data Exchange (ETDEWEB)

    Gilbert, David [DOE JGI Public Affairs Manager

    2013-01-01

    The mission of the U.S. Department of Energy Joint Genome Institute (DOE JGI) is to serve the diverse scientific community as a user facility, enabling the application of large-scale genomics and analysis of plants, microbes, and communities of microbes to address the DOE mission goals in bioenergy and the environment. The DOE JGI's sequencing efforts fall under the Eukaryote Super Program, which includes the Plant and Fungal Genomics Programs; and the Prokaryote Super Program, which includes the Microbial Genomics and Metagenomics Programs. In 2012, several projects made news for their contributions to energy and environment research.

  8. HomologMiner: looking for homologous genomic groups in whole genomes.

    Science.gov (United States)

    Hou, Minmei; Berman, Piotr; Hsu, Chih-Hao; Harris, Robert S

    2007-04-15

    Complex genomes contain numerous repeated sequences, and genomic duplication is believed to be a main evolutionary mechanism to obtain new functions. Several tools are available for de novo repeat sequence identification, and many approaches exist for clustering homologous protein sequences. We present an efficient new approach to identify and cluster homologous DNA sequences with high accuracy at the level of whole genomes, excluding low-complexity repeats, tandem repeats and annotated interspersed repeats. We also determine the boundaries of each group member so that it closely represents a biological unit, e.g. a complete gene, or a partial gene coding a protein domain. We developed a program called HomologMiner to identify homologous groups applicable to genome sequences that have been properly marked for low-complexity repeats and annotated interspersed repeats. We applied it to the whole genomes of human (hg17), macaque (rheMac2) and mouse (mm8). Groups obtained include gene families (e.g. olfactory receptor gene family, zinc finger families), unannotated interspersed repeats and additional homologous groups that resulted from recent segmental duplications. Our program incorporates several new methods: a new abstract definition of consistent duplicate units, a new criterion to remove moderately frequent tandem repeats, and new algorithmic techniques. We also provide preliminary analysis of the output on the three genomes mentioned above, and show several applications including identifying boundaries of tandem gene clusters and novel interspersed repeat families. All programs and datasets are downloadable from www.bx.psu.edu/miller_lab.

  9. A Genome-Wide Perspective on Metabolism

    DEFF Research Database (Denmark)

    Rauch, Alexander; Mandrup, Susanne

    2015-01-01

    Mammals have at least 210 histologically diverse cell types (Alberts, Molecular biology of the cell. Garland Science, New York, 2008) and the number would be even higher if functional differences are taken into account. The genome in each of these cell types is differentially programmed to expres...

  10. Whole-exome/genome sequencing and genomics.

    Science.gov (United States)

    Grody, Wayne W; Thompson, Barry H; Hudgins, Louanne

    2013-12-01

    As medical genetics has progressed from a descriptive entity to one focused on the functional relationship between genes and clinical disorders, emphasis has been placed on genomics. Genomics, a subelement of genetics, is the study of the genome, the sum total of all the genes of an organism. The human genome, which is contained in the 23 pairs of nuclear chromosomes and in the mitochondrial DNA of each cell, comprises >6 billion nucleotides of genetic code. There are some 23,000 protein-coding genes, a surprisingly small fraction of the total genetic material, with the remainder composed of noncoding DNA, regulatory sequences, and introns. The Human Genome Project, launched in 1990, produced a draft of the genome in 2001 and then a finished sequence in 2003, on the 50th anniversary of the initial publication of Watson and Crick's paper on the double-helical structure of DNA. Since then, this mass of genetic information has been translated at an ever-increasing pace into useable knowledge applicable to clinical medicine. The recent advent of massively parallel DNA sequencing (also known as shotgun, high-throughput, and next-generation sequencing) has brought whole-genome analysis into the clinic for the first time, and most of the current applications are directed at children with congenital conditions that are undiagnosable by using standard genetic tests for single-gene disorders. Thus, pediatricians must become familiar with this technology, what it can and cannot offer, and its technical and ethical challenges. Here, we address the concepts of human genomic analysis and its clinical applicability for primary care providers.

  11. Genome evolution of Oryza

    Directory of Open Access Journals (Sweden)

    Tieyan Liu

    2014-01-01

    The genus Oryza is composed of approximately 24 species. Wild species of Oryza contain a largely untapped resource of agronomically important genes. As an increasing number of genomes of wild rice species have been or will be sequenced, Oryza is becoming a model system for plant comparative, functional and evolutionary genomics studies. Comparative analyses of large genomic regions and whole-genome sequences have revealed molecular mechanisms involved in genome size variation, gene movement, genome evolution of polyploids, transition of euchromatin to heterochromatin and centromere evolution in the genus Oryza. Transposon activity and removal of transposable elements by unequal recombination or illegitimate recombination are two important factors contributing to expansion or contraction of Oryza genomes. Double-strand break repair mediated gene movement, especially non-homologous end joining, is an important source of non-colinear genes. Transition of euchromatin to heterochromatin is accompanied by transposable element amplification, segmental and tandem duplication of genic segments, and acquisition of heterochromatic genes from other genomic locations. Comparative analyses of multiple genomes dramatically improve the precision and sensitivity of evolutionary inference beyond what single-genome analyses can provide. Further investigations on the impact of structural variation, lineage-specific genes and evolution of agriculturally important genes on phenotype diversity and adaptation in the genus Oryza should facilitate molecular breeding and genetic improvement of rice.

  12. Pairagon: a highly accurate, HMM-based cDNA-to-genome aligner

    DEFF Research Database (Denmark)

    Lu, David V; Brown, Randall H; Arumugam, Manimozhiyan;

    2009-01-01

    MOTIVATION: The most accurate way to determine the intron-exon structures in a genome is to align spliced cDNA sequences to the genome. Thus, cDNA-to-genome alignment programs are a key component of most annotation pipelines. The scoring system used to choose the best alignment is a primary...

  13. 76 FR 28056 - National Human Genome Research Institute; Notice of Closed Meeting

    Science.gov (United States)

    2011-05-13

    ... HUMAN SERVICES National Institutes of Health National Human Genome Research Institute; Notice of Closed....), notice is hereby given of a meeting of the Board of Scientific Counselors, National Human Genome Research... individual intramural programs and projects conducted by the National Human Genome Research...

  14. 76 FR 35224 - National Human Genome Research Institute; Notice of Closed Meeting

    Science.gov (United States)

    2011-06-16

    ... HUMAN SERVICES National Institutes of Health National Human Genome Research Institute; Notice of Closed.... Day, PhD, Scientific Review Officer, CIR, National Human Genome Research Institute, National... . (Catalogue of Federal Domestic Assistance Program Nos. 93.172, Human Genome Research, National Institutes...

  15. 78 FR 61851 - National Human Genome Research Institute; Notice of Closed Meeting

    Science.gov (United States)

    2013-10-04

    ... clearly unwarranted invasion of personal privacy. Name of Committee: National Human Genome Research... Human Genome Research Institute, 4076 Conference Room, 5635 Fishers Lane, Rockville, MD 20852... Domestic Assistance Program Nos. 93.172, Human Genome Research, National Institutes of Health, HHS)...

  16. 75 FR 62548 - National Human Genome Research Institute; Notice of Closed Meeting

    Science.gov (United States)

    2010-10-12

    ... HUMAN SERVICES National Institutes of Health National Human Genome Research Institute; Notice of Closed..., PhD, Scientific Review Officer, CIDR, National Human Genome Research Institute, National Institutes... . Catalogue of Federal Domestic Assistance Program Nos. 93.172, Human Genome Research, National Institutes...

  17. 76 FR 22112 - National Human Genome Research Institute; Notice of Closed Meeting

    Science.gov (United States)

    2011-04-20

    ... HUMAN SERVICES National Institutes of Health National Human Genome Research Institute; Notice of Closed... of Committee: National Human Genome Research Institute Special Emphasis Panel, Special Emphasis Panel... Assistance Program Nos. 93.172, Human Genome Research, National Institutes of Health, HHS) Dated: April...

  18. 76 FR 19780 - National Human Genome Research Institute; Notice of Closed Meeting

    Science.gov (United States)

    2011-04-08

    ... HUMAN SERVICES National Institutes of Health National Human Genome Research Institute; Notice of Closed... Officer, CIDR, National Human Genome Research Institute, National Institutes of Health, 5635 Fishers Lane... Assistance Program No. 93.172, Human Genome Research, National Institutes of Health, HHS) Dated: April...

  19. 77 FR 20646 - National Human Genome Research Institute; Notice of Closed Meetings

    Science.gov (United States)

    2012-04-05

    ... HUMAN SERVICES National Institutes of Health National Human Genome Research Institute; Notice of Closed... of Committee: National Human Genome Research Institute Special Emphasis Panel; Loan Repayment Program...: National Human Genome Research Institute, 5635 Fishers Lane, 3rd Floor Conference Room, Rockville, MD...

  20. 76 FR 66731 - National Human Genome Research Institute; Notice of Closed Meeting

    Science.gov (United States)

    2011-10-27

    ... HUMAN SERVICES National Institutes of Health National Human Genome Research Institute; Notice of Closed... of Committee: National Human Genome Research Institute Special Emphasis Panel, DAP for CEGS-SEP. Date...@mail.nih.gov . (Catalogue of Federal Domestic Assistance Program Nos. 93.172, Human Genome...

  1. 77 FR 64816 - National Human Genome Research Institute; Notice of Closed Meeting

    Science.gov (United States)

    2012-10-23

    ... HUMAN SERVICES National Institutes of Health National Human Genome Research Institute; Notice of Closed...: Camilla E. Day, Ph.D., Scientific Review Officer, CIDR, National Human Genome Research Institute, National... . (Catalogue of Federal Domestic Assistance Program Nos. 93.172, Human Genome Research, National Institutes...

  2. 78 FR 77477 - National Human Genome Research Institute; Notice of Closed Meeting

    Science.gov (United States)

    2013-12-23

    ... HUMAN SERVICES National Institutes of Health National Human Genome Research Institute; Notice of Closed...: Camilla E. Day, Ph.D., Scientific Review Officer, CIDR, National Human Genome Research Institute, National... . (Catalogue of Federal Domestic Assistance Program Nos. 93.172, Human Genome Research, National Institutes...

  3. 76 FR 9031 - National Human Genome Research Institute; Notice of Closed Meeting

    Science.gov (United States)

    2011-02-16

    ... HUMAN SERVICES National Institutes of Health National Human Genome Research Institute; Notice of Closed..., PhD, Scientific Review Officer, CIDR, National Human Genome Research Institute, National Institutes... . (Catalogue of Federal Domestic Assistance Program Nos. 93.172, Human Genome Research, National Institutes...

  4. 75 FR 13558 - National Human Genome Research Institute; Notice of Closed Meeting

    Science.gov (United States)

    2010-03-22

    ... HUMAN SERVICES National Institutes of Health National Human Genome Research Institute; Notice of Closed....), notice is hereby given of a meeting of the Board of Scientific Counselors, National Human Genome Research... individual intramural programs and projects conducted by the National Human Genome Research...

  5. 75 FR 8977 - National Human Genome Research Institute; Notice of Closed Meeting

    Science.gov (United States)

    2010-02-26

    ... HUMAN SERVICES National Institutes of Health National Human Genome Research Institute; Notice of Closed.... Nakamura, PhD, Scientific Review Officer, Scientific Review Branch, National Human Genome Research...-402-0838. (Catalogue of Federal Domestic Assistance Program Nos. 93.172, Human Genome...

  6. 77 FR 35991 - National Human Genome Research Institute; Notice of Closed Meeting

    Science.gov (United States)

    2012-06-15

    ... HUMAN SERVICES National Institutes of Health National Human Genome Research Institute; Notice of Closed...: Camilla E. Day, Ph.D., Scientific Review Officer, CIDR, National Human Genome Research Institute, National... . (Catalogue of Federal Domestic Assistance Program Nos. 93.172, Human Genome Research, National Institutes...

  7. 75 FR 67380 - National Human Genome Research Institute; Notice of Closed Meeting

    Science.gov (United States)

    2010-11-02

    ... HUMAN SERVICES National Institutes of Health National Human Genome Research Institute; Notice of Closed...: Ken D. Nakamura, PhD, Scientific Review Officer, Scientific Review Branch, National Human Genome... Program Nos. 93.172, Human Genome Research, National Institutes of Health, HHS) Dated: October 26,...

  8. 78 FR 11898 - National Human Genome Research Institute; Notice of Closed Meeting

    Science.gov (United States)

    2013-02-20

    ... HUMAN SERVICES National Institutes of Health National Human Genome Research Institute; Notice of Closed.... Day, Ph.D., Scientific Review Officer CIDR, National Human Genome Research Institute, National... . (Catalogue of Federal Domestic Assistance Program Nos. 93.172, Human Genome Research, National Institutes...

  9. 78 FR 70063 - National Human Genome Research Institute; Notice of Closed Meeting

    Science.gov (United States)

    2013-11-22

    ... HUMAN SERVICES National Institutes of Health National Human Genome Research Institute; Notice of Closed....), notice is hereby given of a meeting of the Board of Scientific Counselors, National Human Genome Research... individual intramural programs and projects conducted by the NATIONAL HUMAN GENOME RESEARCH...

  10. 76 FR 36930 - National Human Genome Research Institute; Notice of Closed Meeting

    Science.gov (United States)

    2011-06-23

    ... HUMAN SERVICES National Institutes of Health National Human Genome Research Institute; Notice of Closed... of Committee: National Human Genome Research Institute Special Emphasis Panel, DAP R-25. Date: July...@mail.nih.gov . (Catalogue of Federal Domestic Assistance Program Nos. 93.172, Human Genome...

  11. 75 FR 8373 - National Human Genome Research Institute; Notice of Closed Meeting

    Science.gov (United States)

    2010-02-24

    ... HUMAN SERVICES National Institutes of Health National Human Genome Research Institute; Notice of Closed... of Committee: National Human Genome Research Institute Special Emphasis Panel, GWAS Comparing Design... of Federal Domestic Assistance Program Nos. 93.172, Human Genome Research, National Institutes...

  12. 75 FR 60467 - National Human Genome Research Institute; Notice of Meeting

    Science.gov (United States)

    2010-09-30

    ... HUMAN SERVICES National Institutes of Health National Human Genome Research Institute; Notice of Meeting... hereby given of a meeting of the Board of Scientific Counselors, National Human Genome Research Institute... intramural programs and projects conducted by the National Human Genome Research Institute,...

  13. 78 FR 47715 - National Human Genome Research Institute; Notice of Closed Meeting

    Science.gov (United States)

    2013-08-06

    ... HUMAN SERVICES National Institutes of Health National Human Genome Research Institute; Notice of Closed...., Scientific Review Officer, CIDR, National Human Genome Research Institute, National Institutes of Health... Federal Domestic Assistance Program Nos. 93.172, Human Genome Research, National Institutes of Health,...

  14. 76 FR 50486 - National Human Genome Research Institute; Notice of Closed Meeting

    Science.gov (United States)

    2011-08-15

    ... HUMAN SERVICES National Institutes of Health National Human Genome Research Institute; Notice of Closed.... Day, PhD, Scientific Review Officer, CIDR, National Human Genome Research Institute, National... . (Catalogue of Federal Domestic Assistance Program Nos. 93.172, Human Genome Research, National Institutes...

  15. 76 FR 10909 - National Human Genome Research Institute; Notice of Closed Meeting

    Science.gov (United States)

    2011-02-28

    ... HUMAN SERVICES National Institutes of Health National Human Genome Research Institute; Notice of Closed.... Nakamura, PhD, Scientific Review Officer, Scientific Review Branch, National Human Genome Research...-402-0838. (Catalogue of Federal Domestic Assistance Program Nos. 93.172, Human Genome...

  16. 77 FR 50140 - National Human Genome Research Institute; Notice of Closed Meeting

    Science.gov (United States)

    2012-08-20

    ... HUMAN SERVICES National Institutes of Health National Human Genome Research Institute; Notice of Closed.... Day, Ph.D., Scientific Review Officer, CIDR, National Human Genome Research Institute, National... . (Catalogue of Federal Domestic Assistance Program Nos. 93.172, Human Genome Research, National Institutes...

  17. 77 FR 74676 - National Human Genome Research Institute; Notice of Closed Meeting

    Science.gov (United States)

    2012-12-17

    ... HUMAN SERVICES National Institutes of Health National Human Genome Research Institute; Notice of Closed... Person: Camilla E. Day, Ph.D., Scientific Review Officer, CIDR, National Human Genome Research Institute...@nih.gov . (Catalogue of Federal Domestic Assistance Program Nos. 93.172, Human Genome...

  18. 75 FR 56115 - National Human Genome Research Institute; Notice of Closed Meeting

    Science.gov (United States)

    2010-09-15

    ... HUMAN SERVICES National Institutes of Health National Human Genome Research Institute; Notice of Closed... of Committee: National Human Genome Research Institute Special Emphasis Panel; CEGS DAP. Date... Assistance Program Nos. 93.172, Human Genome Research, National Institutes of Health, HHS) Dated: September...

  19. 75 FR 48977 - National Human Genome Research Institute; Notice of Closed Meeting

    Science.gov (United States)

    2010-08-12

    ... HUMAN SERVICES National Institutes of Health National Human Genome Research Institute; Notice of Closed.... Day, PhD, Scientific Review Officer, CIDR, National Human Genome Research Institute, National... . (Catalogue of Federal Domestic Assistance Program Nos. 93.172, Human Genome Research, National Institutes...

  20. 76 FR 65204 - National Human Genome Research Institute; Notice of Meeting

    Science.gov (United States)

    2011-10-20

    ... HUMAN SERVICES National Institutes of Health National Human Genome Research Institute; Notice of Meeting... hereby given of a meeting of the Board of Scientific Counselors, National Human Genome Research Institute... intramural programs and projects conducted by the National Human Genome Research Institute,...

  1. 77 FR 31863 - National Human Genome Research Institute; Notice of Closed Meeting

    Science.gov (United States)

    2012-05-30

    ... HUMAN SERVICES National Institutes of Health National Human Genome Research Institute; Notice of Closed... of Committee: National Human Genome Research Institute Special Emphasis Panel DAP R25 Eppig.... (Catalogue of Federal Domestic Assistance Program Nos. 93.172, Human Genome Research, National Institutes...

  2. 75 FR 32957 - National Human Genome Research Institute; Notice of Closed Meeting

    Science.gov (United States)

    2010-06-10

    ... HUMAN SERVICES National Institutes of Health National Human Genome Research Institute; Notice of Closed... of Committee: National Human Genome Research Institute Special Emphasis Panel, Protein Resource RFA... of Federal Domestic Assistance Program Nos. 93.172, Human Genome Research, National Institutes...

  3. 77 FR 64816 - National Human Genome Research Institute; Notice of Meeting

    Science.gov (United States)

    2012-10-23

    ... HUMAN SERVICES National Institutes of Health National Human Genome Research Institute; Notice of Meeting... hereby given of a meeting of the Board of Scientific Counselors, National Human Genome Research Institute... intramural programs and projects conducted by the National Human Genome Research Institute,...

  4. 76 FR 22407 - National Human Genome Research Institute; Notice of Closed Meeting

    Science.gov (United States)

    2011-04-21

    ... HUMAN SERVICES National Institutes of Health National Human Genome Research Institute; Notice of Closed... of Committee: National Human Genome Research Institute Special Emphasis Panel; Loan Repayment Program....172, Human Genome Research, National Institutes of Health, HHS) Dated: April 12, 2011. Jennifer...

  5. 76 FR 79199 - National Human Genome Research Institute; Notice of Closed Meeting

    Science.gov (United States)

    2011-12-21

    ... HUMAN SERVICES National Institutes of Health National Human Genome Research Institute; Notice of Closed...., Scientific Review Officer, CIDR, National Human Genome Research Institute, National Institutes of Health... Federal Domestic Assistance Program Nos. 93.172, Human Genome Research, National Institutes of Health,...

  6. 75 FR 35821 - National Human Genome Research Institute; Notice of Closed Meeting

    Science.gov (United States)

    2010-06-23

    ... HUMAN SERVICES National Institutes of Health National Human Genome Research Institute; Notice of Closed..., Scientific Review Officer, CIDR, National Human Genome Research Institute, National Institutes of Health... Federal Domestic Assistance Program Nos. 93.172, Human Genome Research, National Institutes of Health,...

  7. Multiple models for Rosaceae genomics.

    Science.gov (United States)

    Shulaev, Vladimir; Korban, Schuyler S; Sosinski, Bryon; Abbott, Albert G; Aldwinckle, Herb S; Folta, Kevin M; Iezzoni, Amy; Main, Dorrie; Arús, Pere; Dandekar, Abhaya M; Lewers, Kim; Brown, Susan K; Davis, Thomas M; Gardiner, Susan E; Potter, Daniel; Veilleux, Richard E

    2008-07-01

    The plant family Rosaceae consists of over 100 genera and 3,000 species that include many important fruit, nut, ornamental, and wood crops. Members of this family provide high-value nutritional foods and contribute desirable aesthetic and industrial products. Most rosaceous crops have been enhanced by human intervention through sexual hybridization, asexual propagation, and genetic improvement since ancient times, 4,000 to 5,000 B.C. Modern breeding programs have contributed to the selection and release of numerous cultivars having significant economic impact on the U.S. and world markets. In recent years, the Rosaceae community, both in the United States and internationally, has benefited from newfound organization and collaboration that have hastened progress in developing genetic and genomic resources for representative crops such as apple (Malus spp.), peach (Prunus spp.), and strawberry (Fragaria spp.). These resources, including expressed sequence tags, bacterial artificial chromosome libraries, physical and genetic maps, and molecular markers, combined with genetic transformation protocols and bioinformatics tools, have rendered various rosaceous crops highly amenable to comparative and functional genomics studies. This report serves as a synopsis of the resources and initiatives of the Rosaceae community, recent developments in Rosaceae genomics, and plans to apply newly accumulated knowledge and resources toward breeding and crop improvement.

  8. The genome BLASTatlas - a GeneWiz extension for visualization of whole-genome homology.

    Science.gov (United States)

    Hallin, Peter F; Binnewies, Tim T; Ussery, David W

    2008-05-01

    The development of fast and inexpensive methods for sequencing bacterial genomes has led to a wealth of data, often with many genomes being sequenced of the same species or closely related organisms. Thus, there is a need for visualization methods that will allow easy comparison of many sequenced genomes to a defined reference strain. The BLASTatlas is one such tool that is useful for mapping and visualizing whole genome homology of genes and proteins within a reference strain compared to other strains or species of one or more prokaryotic organisms. We provide examples of BLASTatlases, including the Clostridium tetani plasmid p88, where homologues for toxin genes can be easily visualized in other sequenced Clostridium genomes, and for a Clostridium botulinum genome, compared to 14 other Clostridium genomes. DNA structural information is also included in the atlas to visualize the DNA chromosomal context of regions. Additional information can be added to these plots, and as an example we have added circles showing the probability of the DNA helix opening up under superhelical tension. The tool is SOAP compliant and WSDL (web services description language) files are located on our website: (http://www.cbs.dtu.dk/ws/BLASTatlas), where programming examples are available in Perl. By providing an interoperable method to carry out whole genome visualization of homology, this service offers bioinformaticians as well as biologists an easy-to-adopt workflow that can be directly called from the programming language of the user, hence enabling automation of repeated tasks. This tool can be relevant in many pangenomic as well as in metagenomic studies, by giving a quick overview of clusters of insertion sites, genomic islands and overall homology between a reference sequence and a data set.

  9. Bioinformatics decoding the genome

    CERN Document Server

    CERN. Geneva; Deutsch, Sam; Michielin, Olivier; Thomas, Arthur; Descombes, Patrick

    2006-01-01

    Extracting the fundamental genomic sequence from the DNA. From Genome to Sequence: Biology in the early 21st century has been radically transformed by the availability of the full genome sequences of an ever increasing number of life forms, from bacteria to major crop plants and to humans. The lecture will concentrate on the computational challenges associated with the production, storage and analysis of genome sequence data, with an emphasis on mammalian genomes. The quality and usability of genome sequences is increasingly conditioned by the careful integration of strategies for data collection and computational analysis, from the construction of maps and libraries to the assembly of raw data into sequence contigs and chromosome-sized scaffolds. Once the sequence is assembled, a major challenge is the mapping of biologically relevant information onto this sequence: promoters, introns and exons of protein-encoding genes, regulatory elements, functional RNAs, pseudogenes, transposons, etc. The methodological ...

  10. Genomics of oral bacteria.

    Science.gov (United States)

    Duncan, Margaret J

    2003-01-01

    Advances in bacterial genetics came with the discovery of the genetic code, followed by the development of recombinant DNA technologies. Now the field is undergoing a new revolution because of investigators' ability to sequence and assemble complete bacterial genomes. Over 200 genome projects have been completed or are in progress, and the oral microbiology research community has benefited through projects for oral bacteria and their non-oral-pathogen relatives. This review describes features of several oral bacterial genomes, and emphasizes the themes of species relationships, comparative genomics, and lateral gene transfer. Genomics is having a broad impact on basic research in microbial pathogenesis, and will lead to new approaches in clinical research and therapeutics. The oral microbiota is a unique community especially suited for new challenges to sequence the metagenomes of microbial consortia, and the genomes of uncultivable bacteria.

  11. State of cat genomics.

    Science.gov (United States)

    O'Brien, Stephen J; Johnson, Warren; Driscoll, Carlos; Pontius, Joan; Pecon-Slattery, Jill; Menotti-Raymond, Marilyn

    2008-06-01

    Our knowledge of cat family biology was recently expanded to include a genomics perspective with the completion of a draft whole genome sequence of an Abyssinian cat. The utility of the new genome information has been demonstrated by applications ranging from disease gene discovery and comparative genomics to species conservation. Patterns of genomic organization among cats and inbred domestic cat breeds have illuminated our view of domestication, revealing linkage disequilibrium tracks consequent of breed formation, defining chromosome exchanges that punctuated major lineages of mammals and suggesting ancestral continental migration events that led to 37 modern species of Felidae. We review these recent advances here. As the genome resources develop, the cat is poised to make a major contribution to many areas in genetics and biology.

  12. Reference Based Genome Compression

    CERN Document Server

    Chern, Bobbie; Manolakos, Alexandros; No, Albert; Venkat, Kartik; Weissman, Tsachy

    2012-01-01

    DNA sequencing technology has advanced to a point where storage is becoming the central bottleneck in the acquisition and mining of more data. Large amounts of data are vital for genomics research, and generic compression tools, while viable, cannot offer the same savings as approaches tuned to inherent biological properties. We propose an algorithm to compress a target genome given a known reference genome. The proposed algorithm first generates a mapping from the reference to the target genome, and then compresses this mapping with an entropy coder. As an illustration of the performance: applying our algorithm to James Watson's genome with hg18 as a reference, we are able to reduce the 2991 megabyte (MB) genome down to 6.99 MB, while Gzip compresses it to 834.8 MB.
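
    The two stages named in the abstract (build a mapping from the reference to the target, then compress that mapping) can be illustrated with a minimal sketch. This is not the authors' algorithm: the greedy k-mer matcher, the function names, the JSON encoding of the mapping, and the use of zlib (DEFLATE) as a stand-in for the entropy coder are all assumptions made for illustration.

```python
import json
import random
import zlib


def build_index(reference, k):
    """Map each k-mer in the reference to the position of its first occurrence."""
    index = {}
    for i in range(len(reference) - k + 1):
        index.setdefault(reference[i:i + k], i)
    return index


def map_to_reference(reference, target, k=12):
    """Greedily describe the target as copies from the reference plus literals.

    Returns a list of operations: ["C", pos, length] copies reference[pos:pos+length],
    ["L", text] inserts literal bases that were not matched.
    """
    index = build_index(reference, k)
    ops, literal, i = [], [], 0
    while i < len(target):
        pos = index.get(target[i:i + k])
        if pos is None:
            literal.append(target[i])  # no seed match here: emit one literal base
            i += 1
            continue
        length = k
        # Extend the seed match for as long as the two sequences agree.
        while (i + length < len(target) and pos + length < len(reference)
               and target[i + length] == reference[pos + length]):
            length += 1
        if literal:
            ops.append(["L", "".join(literal)])
            literal = []
        ops.append(["C", pos, length])
        i += length
    if literal:
        ops.append(["L", "".join(literal)])
    return ops


def reconstruct(reference, ops):
    """Rebuild the target from the reference and the list of operations."""
    parts = []
    for op in ops:
        if op[0] == "C":
            parts.append(reference[op[1]:op[1] + op[2]])
        else:
            parts.append(op[1])
    return "".join(parts)


def compress(reference, target, k=12):
    """Stage 1: build the mapping. Stage 2: compress it (zlib as a stand-in coder)."""
    ops = map_to_reference(reference, target, k)
    return zlib.compress(json.dumps(ops, separators=(",", ":")).encode("ascii"), 9)


def decompress(reference, blob):
    """Invert compress(); the decoder must hold the same reference."""
    return reconstruct(reference, json.loads(zlib.decompress(blob).decode("ascii")))


# Toy usage: a random "reference" and a target that differs by a small insertion.
rng = random.Random(0)
reference = "".join(rng.choice("ACGT") for _ in range(5000))
target = reference[:3500] + "ACGTACGT" + reference[3500:]
blob = compress(reference, target)
print(len(blob), "bytes with reference;",
      len(zlib.compress(target.encode("ascii"), 9)), "bytes with zlib alone")
print("round trip ok:", decompress(reference, blob) == target)
```

    The figures quoted in the abstract correspond to a reduction of roughly 428-fold for the proposed method (2991 MB to 6.99 MB) against about 3.6-fold for Gzip (2991 MB to 834.8 MB).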

  13. Causes of genome instability

    DEFF Research Database (Denmark)

    Langie, Sabine A S; Koppen, Gudrun; Desaulniers, Daniel

    2015-01-01

    Genome instability is a prerequisite for the development of cancer. It occurs when genome maintenance systems fail to safeguard the genome's integrity, whether as a consequence of inherited defects or induced via exposure to environmental agents (chemicals, biological agents and radiation). Thus, genome instability can be defined as an enhanced tendency for the genome to acquire mutations, ranging from changes to the nucleotide sequence to chromosomal gain, rearrangements or loss. This review raises the hypothesis that in addition to known human carcinogens, exposure to low dose of other... scientists aware of the increasing need to unravel the underlying mechanisms via which chemicals at low doses can induce genome instability and thus promote carcinogenesis.

  14. Reference Based Genome Compression

    OpenAIRE

    Chern, Bobbie; Ochoa, Idoia; Manolakos, Alexandros; No, Albert; Venkat, Kartik; Weissman, Tsachy

    2012-01-01

    DNA sequencing technology has advanced to a point where storage is becoming the central bottleneck in the acquisition and mining of more data. Large amounts of data are vital for genomics research, and generic compression tools, while viable, cannot offer the same savings as approaches tuned to inherent biological properties. We propose an algorithm to compress a target genome given a known reference genome. The proposed algorithm first generates a mapping from the reference to the target gen...

  15. Developing a platform for genomic medicine in Mexico.

    Science.gov (United States)

    Jimenez-Sanchez, Gerardo

    2003-04-11

    Mexico is preparing to develop a genomic medicine program focused on national health problems. Modern Mexicans result from an admixture of more than 65 native Indian groups with Spaniards, leading to a unique genetic makeup and a characteristic set of disease susceptibilities. Since 1999, more than 100 experts from different fields have joined efforts with government, academia, and industry to identify priorities and goals for genomic medicine in Mexico. The plan includes establishment of an Institute of Genomic Medicine with strong intramural and extramural programs. This project is expected to ease the social and financial burden of health problems in Mexico.

  16. Genomic Database Searching.

    Science.gov (United States)

    Hutchins, James R A

    2017-01-01

    The availability of reference genome sequences for virtually all species under active research has revolutionized biology. Analyses of genomic variations in many organisms have provided insights into phenotypic traits, evolution and disease, and are transforming medicine. All genomic data from publicly funded projects are freely available in Internet-based databases, for download or searching via genome browsers such as Ensembl, Vega, NCBI's Map Viewer, and the UCSC Genome Browser. These online tools generate interactive graphical outputs of relevant chromosomal regions, showing genes, transcripts, and other genomic landmarks, and epigenetic features mapped by projects such as ENCODE. This chapter provides a broad overview of the major genomic databases and browsers, and describes various approaches and the latest resources for searching them. Methods are provided for identifying genomic locus and sequence information using gene names or codes, identifiers for DNA and RNA molecules and proteins; also from karyotype bands, chromosomal coordinates, sequences, motifs, and matrix-based patterns. Approaches are also described for batch retrieval of genomic information, performing more complex queries, and analyzing larger sets of experimental data, for example from next-generation sequencing projects.

  17. Between two fern genomes.

    Science.gov (United States)

    Sessa, Emily B; Banks, Jo Ann; Barker, Michael S; Der, Joshua P; Duffy, Aaron M; Graham, Sean W; Hasebe, Mitsuyasu; Langdale, Jane; Li, Fay-Wei; Marchant, D Blaine; Pryer, Kathleen M; Rothfels, Carl J; Roux, Stanley J; Salmi, Mari L; Sigel, Erin M; Soltis, Douglas E; Soltis, Pamela S; Stevenson, Dennis W; Wolf, Paul G

    2014-01-01

    Ferns are the only major lineage of vascular plants not represented by a sequenced nuclear genome. This lack of genome sequence information significantly impedes our ability to understand and reconstruct genome evolution not only in ferns, but across all land plants. Azolla and Ceratopteris are ideal and complementary candidates to be the first ferns to have their nuclear genomes sequenced. They differ dramatically in genome size, life history, and habit, and thus represent the immense diversity of extant ferns. Together, this pair of genomes will facilitate myriad large-scale comparative analyses across ferns and all land plants. Here we review the unique biological characteristics of ferns and describe a number of outstanding questions in plant biology that will benefit from the addition of ferns to the set of taxa with sequenced nuclear genomes. We explain why the fern clade is pivotal for understanding genome evolution across land plants, and we provide a rationale for how knowledge of fern genomes will enable progress in research beyond the ferns themselves.

  18. [Landscape and ecological genomics].

    Science.gov (United States)

    Tetushkin, E Ia

    2013-10-01

    Landscape genomics is the modern version of landscape genetics, a discipline that arose approximately 10 years ago as a combination of population genetics, landscape ecology, and spatial statistics. It studies the effects of environmental variables on gene flow and other microevolutionary processes that determine genetic connectivity and variations in populations. In contrast to population genetics, it operates at the level of individual specimens rather than at the level of population samples. Another important difference between landscape genetics and genomics and population genetics is that, in the former, the analysis of gene flow and local adaptations takes quantitative account of landforms and features of the matrix, i.e., hostile spaces that separate species habitats. Landscape genomics is a part of population ecogenomics, which, along with community genomics, is a major part of ecological genomics. One of the principal purposes of landscape genomics is the identification and differentiation of various genome-wide and locus-specific effects. The approaches and computation tools developed for combined analysis of genomic and landscape variables make it possible to detect adaptation-related genome fragments, which facilitates the planning of conservation efforts and the prediction of species' fate in response to expected changes in the environment.

  19. Genomics of Clostridium tetani.

    Science.gov (United States)

    Brüggemann, Holger; Brzuszkiewicz, Elzbieta; Chapeton-Montes, Diana; Plourde, Lucile; Speck, Denis; Popoff, Michel R

    2015-05-01

    Genomic information about Clostridium tetani, the causative agent of the tetanus disease, is scarce. The genome of strain E88, a strain used in vaccine production, was sequenced about 10 years ago. One additional genome (strain 12124569) has recently been released. Here we report three new genomes of C. tetani and describe major differences among all five C. tetani genomes. They all harbor tetanus-toxin-encoding plasmids that contain highly conserved genes for TeNT (tetanus toxin), TetR (transcriptional regulator of TeNT) and ColT (collagenase), but substantially differ in other plasmid regions. The chromosomes share a large core genome that contains about 85% of all genes of a given chromosome. The non-core chromosome comprises mainly prophage-like genomic regions and genes encoding environmental interaction and defense functions (e.g. surface proteins, restriction-modification systems, toxin-antitoxin systems, CRISPR/Cas systems) and other fitness functions (e.g. transport systems, metabolic activities). This new genome information will help to assess the level of genome plasticity of the species C. tetani and provide the basis for detailed comparative studies.

  20. The UCSC Genome Browser database: 2017 update.

    Science.gov (United States)

    Tyner, Cath; Barber, Galt P; Casper, Jonathan; Clawson, Hiram; Diekhans, Mark; Eisenhart, Christopher; Fischer, Clayton M; Gibson, David; Gonzalez, Jairo Navarro; Guruvadoo, Luvina; Haeussler, Maximilian; Heitner, Steve; Hinrichs, Angie S; Karolchik, Donna; Lee, Brian T; Lee, Christopher M; Nejad, Parisa; Raney, Brian J; Rosenbloom, Kate R; Speir, Matthew L; Villarreal, Chris; Vivian, John; Zweig, Ann S; Haussler, David; Kuhn, Robert M; Kent, W James

    2017-01-04

    Since its 2001 debut, the University of California, Santa Cruz (UCSC) Genome Browser (http://genome.ucsc.edu/) team has provided continuous support to the international genomics and biomedical communities through a web-based, open source platform designed for the fast, scalable display of sequence alignments and annotations landscaped against a vast collection of quality reference genome assemblies. The browser's publicly accessible databases are the backbone of a rich, integrated bioinformatics tool suite that includes a graphical interface for data queries and downloads, alignment programs, command-line utilities and more. This year's highlights include newly designed home and gateway pages; a new 'multi-region' track display configuration for exon-only, gene-only and custom regions visualization; new genome browsers for three species (brown kiwi, crab-eating macaque and Malayan flying lemur); eight updated genome assemblies; extended support for new data types such as CRAM, RNA-seq expression data and long-range chromatin interaction pairs; and the unveiling of a new supported mirror site in Japan.

  1. Genomic sequencing of Pleistocene cave bears

    Energy Technology Data Exchange (ETDEWEB)

    Noonan, James P.; Hofreiter, Michael; Smith, Doug; Priest, James R.; Rohland, Nadin; Rabeder, Gernot; Krause, Johannes; Detter, J. Chris; Paabo, Svante; Rubin, Edward M.

    2005-04-01

    Despite the information content of genomic DNA, ancient DNA studies to date have largely been limited to amplification of mitochondrial DNA due to technical hurdles such as contamination and degradation of ancient DNAs. In this study, we describe two metagenomic libraries constructed using unamplified DNA extracted from the bones of two 40,000-year-old extinct cave bears. Analysis of approximately 1 Mb of sequence from each library showed that, despite significant microbial contamination, 5.8 percent and 1.1 percent of clones in the libraries contain cave bear inserts, yielding 26,861 bp of cave bear genome sequence. Alignment of this sequence to the dog genome, the closest sequenced genome to cave bear in terms of evolutionary distance, revealed roughly the expected ratio of cave bear exons, repeats and conserved noncoding sequences. Only 0.04 percent of all clones sequenced were derived from contamination with modern human DNA. Comparison of cave bear with orthologous sequences from several modern bear species revealed the evolutionary relationship of these lineages. Using the metagenomic approach described here, we have recovered substantial quantities of mammalian genomic sequence more than twice as old as any previously reported, establishing the feasibility of ancient DNA genomic sequencing programs.

  2. The UCSC Genome Browser database: 2017 update

    Science.gov (United States)

    Tyner, Cath; Barber, Galt P.; Casper, Jonathan; Clawson, Hiram; Diekhans, Mark; Eisenhart, Christopher; Fischer, Clayton M.; Gibson, David; Gonzalez, Jairo Navarro; Guruvadoo, Luvina; Haeussler, Maximilian; Heitner, Steve; Hinrichs, Angie S.; Karolchik, Donna; Lee, Brian T.; Lee, Christopher M.; Nejad, Parisa; Raney, Brian J.; Rosenbloom, Kate R.; Speir, Matthew L.; Villarreal, Chris; Vivian, John; Zweig, Ann S.; Haussler, David; Kuhn, Robert M.; Kent, W. James

    2017-01-01

    Since its 2001 debut, the University of California, Santa Cruz (UCSC) Genome Browser (http://genome.ucsc.edu/) team has provided continuous support to the international genomics and biomedical communities through a web-based, open source platform designed for the fast, scalable display of sequence alignments and annotations landscaped against a vast collection of quality reference genome assemblies. The browser's publicly accessible databases are the backbone of a rich, integrated bioinformatics tool suite that includes a graphical interface for data queries and downloads, alignment programs, command-line utilities and more. This year's highlights include newly designed home and gateway pages; a new ‘multi-region’ track display configuration for exon-only, gene-only and custom regions visualization; new genome browsers for three species (brown kiwi, crab-eating macaque and Malayan flying lemur); eight updated genome assemblies; extended support for new data types such as CRAM, RNA-seq expression data and long-range chromatin interaction pairs; and the unveiling of a new supported mirror site in Japan. PMID:27899642

  3. MIPS plant genome information resources.

    Science.gov (United States)

    Spannagl, Manuel; Haberer, Georg; Ernst, Rebecca; Schoof, Heiko; Mayer, Klaus F X

    2007-01-01

    The Munich Institute for Protein Sequences (MIPS) has been involved in maintaining plant genome databases since the Arabidopsis thaliana genome project. Genome databases and analysis resources have focused on individual genomes and aim to provide flexible and maintainable data sets for model plant genomes as a backbone against which experimental data, for example from high-throughput functional genomics, can be organized and evaluated. In addition, model genomes also form a scaffold for comparative genomics, and much can be learned from genome-wide evolutionary studies.

  4. Genomics of pear and other Rosaceae fruit trees.

    Science.gov (United States)

    Yamamoto, Toshiya; Terakami, Shingo

    2016-01-01

    The family Rosaceae includes many economically important fruit trees, such as pear, apple, peach, cherry, quince, apricot, plum, raspberry, and loquat. Over the past few years, whole-genome sequences have been released for Chinese pear, European pear, apple, peach, Japanese apricot, and strawberry. These sequences help us to conduct functional and comparative genomics studies and to develop new cultivars with desirable traits by marker-assisted selection in breeding programs. These genomics resources also allow identification of evolutionary relationships in Rosaceae, development of genome-wide SNP and SSR markers, and construction of reference genetic linkage maps, which are available through the Genome Database for the Rosaceae website. Here, we review the recent advances in genomics studies and their practical applications for Rosaceae fruit trees, particularly pear, apple, peach, and cherry.

  5. Brief Guide to Genomics: DNA, Genes and Genomes

    Science.gov (United States)

    ... A Brief Guide to Genomics DNA, Genes and Genomes Deoxyribonucleic acid (DNA) is ... genetic basis for health and disease. Implications of Genomics for Medical Science Virtually every human ailment has ...

  6. The Drosophila genome nexus: a population genomic resource of 623 Drosophila melanogaster genomes, including 197 from a single ancestral range population.

    Science.gov (United States)

    Lack, Justin B; Cardeno, Charis M; Crepeau, Marc W; Taylor, William; Corbett-Detig, Russell B; Stevens, Kristian A; Langley, Charles H; Pool, John E

    2015-04-01

    Hundreds of wild-derived Drosophila melanogaster genomes have been published, but rigorous comparisons across data sets are precluded by differences in alignment methodology. The most common approach to reference-based genome assembly is a single round of alignment followed by quality filtering and variant detection. We evaluated variations and extensions of this approach and settled on an assembly strategy that utilizes two alignment programs and incorporates both substitutions and short indels to construct an updated reference for a second round of mapping prior to final variant detection. Utilizing this approach, we reassembled published D. melanogaster population genomic data sets and added unpublished genomes from several sub-Saharan populations. Most notably, we present aligned data from phase 3 of the Drosophila Population Genomics Project (DPGP3), which provides 197 genomes from a single ancestral range population of D. melanogaster (from Zambia). The large sample size, high genetic diversity, and potentially simpler demographic history of the DPGP3 sample will make this a highly valuable resource for fundamental population genetic research. The complete set of assemblies described here, termed the Drosophila Genome Nexus, presently comprises 623 consistently aligned genomes and is publicly available in multiple formats with supporting documentation and bioinformatic tools. This resource will greatly facilitate population genomic analysis in this model species by reducing the methodological differences between data sets. Copyright © 2015 by the Genetics Society of America.

  7. Integrative genomic analysis by interoperation of bioinformatics tools in GenomeSpace

    Science.gov (United States)

    Thorvaldsdottir, Helga; Liefeld, Ted; Ocana, Marco; Borges-Rivera, Diego; Pochet, Nathalie; Robinson, James T.; Demchak, Barry; Hull, Tim; Ben-Artzi, Gil; Blankenberg, Daniel; Barber, Galt P.; Lee, Brian T.; Kuhn, Robert M.; Nekrutenko, Anton; Segal, Eran; Ideker, Trey; Reich, Michael; Regev, Aviv; Chang, Howard Y.; Mesirov, Jill P.

    2015-01-01

    Integrative analysis of multiple data types to address complex biomedical questions requires the use of multiple software tools in concert and remains an enormous challenge for most of the biomedical research community. Here we introduce GenomeSpace (http://www.genomespace.org), a cloud-based, cooperative community resource. Seeded as a collaboration of six of the most popular genomics analysis tools, GenomeSpace now supports the streamlined interaction of 20 bioinformatics tools and data resources. To help non-programming users leverage GenomeSpace in integrative analysis, it offers a growing set of ‘recipes’, short workflows involving a few tools and steps to guide investigators through high utility analysis tasks. PMID:26780094

  8. Ensembl Genomes 2013

    DEFF Research Database (Denmark)

    Kersey, Paul Julian; Allen, James E; Christensen, Mikkel

    2014-01-01

    genomes, and now includes the genomes of over 9000 bacteria. Specific extensions to the web and programmatic interfaces have been developed to support users in navigating these large data sets. Looking forward, analytic tools to allow targeted selection of data for visualization and download are likely...

  9. Estimation of genome length

    Institute of Scientific and Technical Information of China (English)

    2001-01-01

    The genome length is a fundamental feature of a species. This note outlined the general concept and estimation method of the physical and genetic length. Some formulae for estimating the genetic length were derived in detail. As examples, the genome genetic length of Pinus pinaster Ait. and the genetic length of chromosome Ⅵ of Oryza sativa L. were estimated from partial linkage data.

  10. Genetics and Genomics

    Science.gov (United States)

    Good progress is being made on the genetics and genomics of sugar beet; however, the work is still in progress: tools are now being generated and some results are being analyzed. The GABI BeetSeq project released a first draft of the sugar beet genome of KWS2320, a dihaploid (see http://bvseq.molgen.mpg.de/Gen...

  11. Safeguarding genome integrity

    DEFF Research Database (Denmark)

    Sørensen, Claus Storgaard; Syljuåsen, Randi G

    2012-01-01

    Mechanisms that preserve genome integrity are highly important during the normal life cycle of human cells. Loss of genome protective mechanisms can lead to the development of diseases such as cancer. Checkpoint kinases function in the cellular surveillance pathways that help cells to cope with DNA...

  12. Genome-Scale Models

    DEFF Research Database (Denmark)

    Bergdahl, Basti; Sonnenschein, Nikolaus; Machado, Daniel

    2016-01-01

    An introduction to genome-scale models, how to build and use them, will be given in this chapter. Genome-scale models have become an important part of systems biology and metabolic engineering, and are increasingly used in research, both in academia and in industry, both for modeling chemical pr...

  13. Unlocking the bovine genome

    Directory of Open Access Journals (Sweden)

    Worley Kim C

    2009-04-01

    Full Text Available Abstract: The draft genome sequence of cattle (Bos taurus) has now been analyzed by the Bovine Genome Sequencing and Analysis Consortium and the Bovine HapMap Consortium, which together represent an extensive collaboration involving more than 300 scientists from 25 different countries.

  14. Genomic understanding of dinoflagellates.

    Science.gov (United States)

    Lin, Senjie

    2011-01-01

    The phylum of dinoflagellates is characterized by many unusual and interesting genomic and physiological features, the imprint of which, in its immense genome, remains elusive. Much novel understanding has been achieved in the last decade on various aspects of dinoflagellate biology, but most remarkably about the structure, expression pattern and epigenetic modification of protein-coding genes in the nuclear and organellar genomes. Major findings include: 1) the great diversity of dinoflagellates, especially at the base of the dinoflagellate tree of life; 2) mini-circularization of the genomes of typical dinoflagellate plastids (with three membranes, chlorophylls a, c1 and c2, and carotenoid peridinin), the scrambled mitochondrial genome and the extensive mRNA editing occurring in both systems; 3) ubiquitous spliced leader trans-splicing of nuclear-encoded mRNA and demonstrated potential as a novel tool for studying dinoflagellate transcriptomes in mixed cultures and natural assemblages; 4) existence and expression of histones and other nucleosomal proteins; 5) a ribosomal protein set expected of typical eukaryotes; 6) genetic potential of non-photosynthetic solar energy utilization via proton-pump rhodopsin; 7) gene candidates in the toxin synthesis pathways; and 8) evidence of a highly redundant, high gene number and highly recombined genome. Despite this progress, much more work awaits genome-wide transcriptome and whole genome sequencing in order to unfold the molecular mechanisms underlying the numerous mysterious attributes of dinoflagellates.

  15. Maintaining genome stability in the nervous system.

    Science.gov (United States)

    McKinnon, Peter J

    2013-11-01

    Active maintenance of genome stability is a prerequisite for the development and function of the nervous system. The high replication index during neurogenesis and the long life of mature neurons highlight the need for efficient cellular programs to safeguard genetic fidelity. Multiple DNA damage response pathways ensure that replication stress and other types of DNA lesions, such as oxidative damage, do not affect neural homeostasis. Numerous human neurologic syndromes result from defective DNA damage signaling and compromised genome integrity. These syndromes can involve different neuropathology, which highlights the diverse maintenance roles that are required for genome stability in the nervous system. Understanding how DNA damage signaling pathways promote neural development and preserve homeostasis is essential for understanding fundamental brain function.

  16. Improving Genetic Gain with Genomic Selection in Autotetraploid Potato

    Directory of Open Access Journals (Sweden)

    Anthony T. Slater

    2016-11-01

    Full Text Available Potato (Solanum tuberosum L.) breeders consider a large number of traits during cultivar development, and progress in conventional breeding can be slow. There is accumulating evidence that some of these traits, such as yield, are affected by a large number of genes with small individual effects. Recently, significant efforts have been applied to the development of genomic resources to improve potato breeding, culminating in a draft genome sequence and the identification of a large number of single nucleotide polymorphisms (SNPs). The availability of these genome-wide SNPs is a prerequisite for implementing genomic selection for improvement of polygenic traits such as yield. In this review, we investigate opportunities for the application of genomic selection to potato, including novel breeding program designs. We have considered a number of factors that will influence this process, including the autotetraploid and heterozygous genetic nature of potato, the rate of decay of linkage disequilibrium, the number of required markers, the design of a reference population, and trait heritability. Based on estimates of the effective population size derived from a potato breeding program, we have calculated the expected accuracy of genomic selection for four key traits of varying heritability and propose that it will be reasonably accurate. We compared the expected genetic gain from genomic selection with the expected gain from phenotypic and pedigree selection, and found that genetic gain can be substantially improved by using genomic selection.
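
    The abstract above does not say which formulas the authors used. As a rough illustration only, the sketch below assumes a widely used deterministic approximation for the accuracy of genomic selection, r = sqrt(N h² / (N h² + Me)), together with the breeder's equation for genetic gain per year. The function names, the parameter `m_e` (number of independent chromosome segments, which depends on effective population size and genome length) and all numeric inputs are hypothetical and are not taken from the review.

```python
import math


def expected_gs_accuracy(n_ref: int, h2: float, m_e: float) -> float:
    """Approximate accuracy of genomic selection (assumed formula).

    n_ref : size of the reference (training) population
    h2    : trait heritability, between 0 and 1
    m_e   : number of independent chromosome segments (hypothetical input)
    """
    if n_ref < 0 or not 0.0 <= h2 <= 1.0 or m_e <= 0:
        raise ValueError("need n_ref >= 0, 0 <= h2 <= 1 and m_e > 0")
    return math.sqrt(n_ref * h2 / (n_ref * h2 + m_e))


def gain_per_year(intensity: float, accuracy: float,
                  sigma_a: float, cycle_years: float) -> float:
    """Breeder's equation: selection intensity * accuracy * additive SD / cycle length."""
    if cycle_years <= 0:
        raise ValueError("cycle_years must be positive")
    return intensity * accuracy * sigma_a / cycle_years


if __name__ == "__main__":
    # Illustrative values only; they are not estimates from the review.
    for h2 in (0.2, 0.4, 0.6, 0.8):
        r = expected_gs_accuracy(n_ref=1000, h2=h2, m_e=500)
        print(f"h2={h2:.1f}  expected accuracy={r:.2f}")
```

    Under this approximation, accuracy rises with both heritability and reference population size, which is why the design of the reference population appears among the factors the review considers.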

  17. NCBI viral genomes resource.

    Science.gov (United States)

    Brister, J Rodney; Ako-Adjei, Danso; Bao, Yiming; Blinkova, Olga

    2015-01-01

    Recent technological innovations have ignited an explosion in virus genome sequencing that promises to fundamentally alter our understanding of viral biology and profoundly impact public health policy. Yet, any potential benefits from the billowing cloud of next generation sequence data hinge upon well implemented reference resources that facilitate the identification of sequences, aid in the assembly of sequence reads and provide reference annotation sources. The NCBI Viral Genomes Resource is a reference resource designed to bring order to this sequence shockwave and improve usability of viral sequence data. The resource can be accessed at http://www.ncbi.nlm.nih.gov/genome/viruses/ and catalogs all publicly available virus genome sequences and curates reference genome sequences. As the number of genome sequences has grown, so too have the difficulties in annotating and maintaining reference sequences. The rapid expansion of the viral sequence universe has forced a recalibration of the data model to better provide extant sequence representation and enhanced reference sequence products to serve the needs of the various viral communities. This, in turn, has placed increased emphasis on leveraging the knowledge of individual scientific communities to identify important viral sequences and develop well annotated reference virus genome sets.

  18. Training in Psychiatric Genomics during Residency: A New Challenge

    Science.gov (United States)

    Winner, Joel G.; Goebert, Deborah; Matsu, Courtenay; Mrazek, David A.

    2010-01-01

    Objective: The authors ascertained the amount of training in psychiatric genomics that is provided in North American psychiatric residency programs. Methods: A sample of 217 chief residents in psychiatric residency programs in the United States and Canada were identified by e-mail and surveyed to assess their training in psychiatric genetics and…

  19. Genomic taxonomy of vibrios

    Directory of Open Access Journals (Sweden)

    Iida Tetsuya

    2009-10-01

    Full Text Available Abstract Background: Vibrio taxonomy has been based on a polyphasic approach. In this study, we retrieve useful taxonomic information (i.e. data that can be used to distinguish different taxonomic levels, such as species and genera) from 32 genome sequences of different vibrio species. We use a variety of tools to explore the taxonomic relationship between the sequenced genomes, including Multilocus Sequence Analysis (MLSA), supertrees, Average Amino Acid Identity (AAI), genomic signatures, and Genome BLAST atlases. Our aim is to analyse the usefulness of these tools for species identification in vibrios. Results: We have generated four new genome sequences of three Vibrio species, i.e., V. alginolyticus 40B, V. harveyi-like 1DA3, and V. mimicus strains VM573 and VM603, and present a broad analysis of these genomes along with other sequenced Vibrio species. The genome atlas and pangenome plots provide a tantalizing image of the genomic differences that occur between closely related sister species, e.g. V. cholerae and V. mimicus. The vibrio pangenome contains around 26504 genes. The V. cholerae core genome and pangenome consist of 1520 and 6923 genes, respectively. Pangenomes might allow different strains of V. cholerae to occupy different niches. MLSA and supertree analyses resulted in a similar phylogenetic picture, with a clear distinction of four groups (Vibrio core group, V. cholerae-V. mimicus, Aliivibrio spp., and Photobacterium spp.). A Vibrio species is defined as a group of strains that share > 95% DNA identity in MLSA and supertree analysis, > 96% AAI, ≤ 10 genome signature dissimilarity, and > 61% proteome identity. Strains of the same species and species of the same genus will form monophyletic groups on the basis of MLSA and supertree. Conclusion: The combination of different analytical and bioinformatics tools will enable the most accurate species identification through genomic computational analysis. This endeavour will culminate in
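
    The species definition quoted in the Results can be read as four thresholds applied to one pairwise comparison. The sketch below encodes only those thresholds. The class and field names are hypothetical, and computing the underlying metrics (MLSA identity, AAI, genome signature dissimilarity, proteome identity) is assumed to happen elsewhere.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PairwiseComparison:
    """Metrics for one pair of strains (hypothetical container)."""
    mlsa_identity: float            # % DNA identity in MLSA / supertree analysis
    aai: float                      # % average amino acid identity
    signature_dissimilarity: float  # genome signature dissimilarity
    proteome_identity: float        # % proteome identity


def same_vibrio_species(c: PairwiseComparison) -> bool:
    """Apply the thresholds stated in the abstract; all four must hold."""
    return (
        c.mlsa_identity > 95.0
        and c.aai > 96.0
        and c.signature_dissimilarity <= 10.0
        and c.proteome_identity > 61.0
    )


if __name__ == "__main__":
    # Made-up values for illustration, not measurements from the study.
    pair = PairwiseComparison(mlsa_identity=98.2, aai=97.5,
                              signature_dissimilarity=4.0,
                              proteome_identity=72.0)
    print(same_vibrio_species(pair))
```

    Requiring all four criteria together is the strictest reading of the sentence; the abstract does not say how conflicting criteria should be resolved.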

  20. Genomic signal processing

    CERN Document Server

    Shmulevich, Ilya

    2007-01-01

    Genomic signal processing (GSP) can be defined as the analysis, processing, and use of genomic signals to gain biological knowledge, and the translation of that knowledge into systems-based applications that can be used to diagnose and treat genetic diseases. Situated at the crossroads of engineering, biology, mathematics, statistics, and computer science, GSP requires the development of both nonlinear dynamical models that adequately represent genomic regulation, and diagnostic and therapeutic tools based on these models. This book facilitates these developments by providing rigorous mathema

  1. Center for Cancer Genomics | Office of Cancer Genomics

    Science.gov (United States)

    The Center for Cancer Genomics (CCG) was established to unify the National Cancer Institute's activities in cancer genomics, with the goal of advancing genomics research and translating findings into the clinic to improve the precise diagnosis and treatment of cancers. In addition to promoting genomic sequencing approach

  2. Parasitic nematodes - from genomes to control.

    Science.gov (United States)

    Mitreva, Makedonka; Zarlenga, Dante S; McCarter, James P; Jasmer, Douglas P

    2007-08-19

    The diseases caused by parasitic nematodes in domestic and companion animals are major factors that decrease production and quality of the agricultural products. Methods available for the control of the parasitic nematode infections are mainly based on chemical treatment, non-chemical management practices, immune modulation and biological control. However, even with integrated pest management that frequently combines these approaches, the effective and long-lasting control strategies are hampered by the persistent exposure of host animals to environmental stages of parasites, the incomplete protective response of the host and acquisition of anthelmintic resistance by an increasing number of parasitic nematodes. Therefore, the challenges to improve control of parasitic nematode infections are multi-fold and no single category of information will meet them all. However, new information, such as nematode genomics, functional genomics and proteomics, can strengthen basic and applied biological research aimed to develop improvements. In this review we will summarize existing control strategies of nematode infections and discuss ongoing developments in nematode genomics. Genomics approaches offer a growing and fundamental base of information, which when coupled with downstream functional genomics and proteomics can accelerate progress towards developing more efficient and sustainable control programs.

  3. Genomic libraries: I. Construction and screening of fosmid genomic libraries.

    Science.gov (United States)

    Quail, Mike A; Matthews, Lucy; Sims, Sarah; Lloyd, Christine; Beasley, Helen; Baxter, Simon W

    2011-01-01

    Large insert genome libraries have been a core resource required to sequence genomes, analyze haplotypes, and aid gene discovery. While next generation sequencing technologies are revolutionizing the field of genomics, traditional genome libraries will still be required for accurate genome assembly. Their utility is also being extended to functional studies for understanding DNA regulatory elements. Here, we present a detailed method for constructing genomic fosmid libraries, testing for common contaminants, gridding the library to nylon membranes, then hybridizing the library membranes with a radiolabeled probe to identify corresponding genomic clones. While this chapter focuses on fosmid libraries, many of these steps can also be applied to bacterial artificial chromosome libraries.

  4. Transposon domestication versus mutualism in ciliate genome rearrangements.

    Directory of Open Access Journals (Sweden)

    Alexander Vogt

    Full Text Available Ciliated protists rearrange their genomes dramatically during nuclear development via chromosome fragmentation and DNA deletion to produce a trimmer and highly reorganized somatic genome. The deleted portion of the genome includes potentially active transposons or transposon-like sequences that reside in the germline. Three independent studies recently showed that transposase proteins of the DDE/DDD superfamily are indispensable for DNA processing in three distantly related ciliates. In the spirotrich Oxytricha trifallax, high copy-number germline-limited transposons mediate their own excision from the somatic genome but also contribute to programmed genome rearrangement through a remarkable transposon mutualism with the host. By contrast, the genomes of two oligohymenophorean ciliates, Tetrahymena thermophila and Paramecium tetraurelia, encode homologous PiggyBac-like transposases as single-copy genes in both their germline and somatic genomes. These domesticated transposases are essential for deletion of thousands of different internal sequences in these species. This review contrasts the events underlying somatic genome reduction in three different ciliates and considers their evolutionary origins and the relationships among their distinct mechanisms for genome remodeling.

  5. Toward 959 nematode genomes

    National Research Council Canada - National Science Library

    Kumar, Sujai; Koutsovoulos, Georgios; Kaur, Gaganjot; Blaxter, Mark

    2012-01-01

    The sequencing of the complete genome of the nematode Caenorhabditis elegans was a landmark achievement and ushered in a new era of whole-organism, systems analyses of the biology of this powerful model organism...

  6. The genomics of adaptation.

    Science.gov (United States)

    Radwan, Jacek; Babik, Wiesław

    2012-12-22

    The amount and nature of genetic variation available to natural selection affect the rate, course and outcome of evolution. Consequently, the study of the genetic basis of adaptive evolutionary change has occupied biologists for decades, but progress has been hampered by the lack of resolution and the absence of a genome-level perspective. Technological advances in recent years should now allow us to answer many long-standing questions about the nature of adaptation. The data gathered so far are beginning to challenge some widespread views of the way in which natural selection operates at the genomic level. Papers in this Special Feature of Proceedings of the Royal Society B illustrate various aspects of the broad field of adaptation genomics. This introductory article sets up a context and, on the basis of a few selected examples, discusses how genomic data can advance our understanding of the process of adaptation.

  7. Mouse genome database 2016.

    Science.gov (United States)

    Bult, Carol J; Eppig, Janan T; Blake, Judith A; Kadin, James A; Richardson, Joel E

    2016-01-01

    The Mouse Genome Database (MGD; http://www.informatics.jax.org) is the primary community model organism database for the laboratory mouse and serves as the source for key biological reference data related to mouse genes, gene functions, phenotypes and disease models with a strong emphasis on the relationship of these data to human biology and disease. As the cost of genome-scale sequencing continues to decrease and new technologies for genome editing become widely adopted, the laboratory mouse is more important than ever as a model system for understanding the biological significance of human genetic variation and for advancing the basic research needed to support the emergence of genome-guided precision medicine. Recent enhancements to MGD include new graphical summaries of biological annotations for mouse genes, support for mobile access to the database, tools to support the annotation and analysis of sets of genes, and expanded support for comparative biology through the expansion of homology data.

  8. Yeast genome sequencing:

    DEFF Research Database (Denmark)

    Piskur, Jure; Langkjær, Rikke Breinhold

    2004-01-01

    For decades, unicellular yeasts have been general models to help understand the eukaryotic cell and also our own biology. Recently, over a dozen yeast genomes have been sequenced, providing the basis to resolve several complex biological questions. Analysis of the novel sequence data has shown...... of closely related species helps in gene annotation and to answer how many genes there really are within the genomes. Analysis of non-coding regions among closely related species has provided an example of how to determine novel gene regulatory sequences, which were previously difficult to analyse because...... they are short and degenerate and occupy different positions. Comparative genomics helps to understand the origin of yeasts and points out crucial molecular events in yeast evolutionary history, such as whole-genome duplication and horizontal gene transfer(s). In addition, the accumulating sequence data provide...

  9. The Lotus japonicus genome

    DEFF Research Database (Denmark)

    This book provides insights into some of the key achievements made in the study of Lotus japonicus (birdsfoot trefoil), as well as a timely overview of topics that are pertinent for future developments in legume genomics. Key topics covered include endosymbiosis, development, hormone regulation......, carbon/nitrogen and secondary metabolism, as well as advances made in high-throughput genomic and genetic approaches. Research focusing on model plants has underpinned the recent growth in plant genomics and genetics and provided a basis for investigations of major crop species. In the legume family...... Fabaceae, groundbreaking genetic and genomic research has established a significant body of knowledge on Lotus japonicus, which was adopted as a model species more than 20 years ago. The diverse nature of legumes means that such research has a wide potential and agricultural impact, for example...

  10. Lophotrochozoan mitochondrial genomes

    Energy Technology Data Exchange (ETDEWEB)

    Valles, Yvonne; Boore, Jeffrey L.

    2005-10-01

    Progress in both molecular techniques and phylogenetic methods has challenged many of the interpretations of traditional taxonomy. One example is in the recognition of the animal superphylum Lophotrochozoa (annelids, mollusks, echiurans, platyhelminthes, brachiopods, and other phyla), although the relationships within this group and the inclusion of some phyla remain uncertain. While much of this progress in phylogenetic reconstruction has been based on comparing single gene sequences, we are beginning to see the potential of comparing large-scale features of genomes, such as the relative order of genes. Even though tremendous progress is being made on the sequence determination of whole nuclear genomes, the dataset of choice for genome-level characters for many animals across a broad taxonomic range remains mitochondrial genomes. We review here what is known about mitochondrial genomes of the lophotrochozoans and discuss the promise that this dataset will enable insight into their relationships.

  11. Mouse Genome Informatics (MGI)

    Data.gov (United States)

    U.S. Department of Health & Human Services — MGI is the international database resource for the laboratory mouse, providing integrated genetic, genomic, and biological data to facilitate the study of human...

  12. Genetical Genomics for Evolutionary Studies

    NARCIS (Netherlands)

    Prins, J.C.P.; Smant, G.; Jansen, R.C.

    2012-01-01

    Genetical genomics combines acquired high-throughput genomic data with genetic analysis. In this chapter, we discuss the application of genetical genomics for evolutionary studies, where new high-throughput molecular technologies are combined with mapping quantitative trait loci (QTL) on the genome

  13. An Introduction to Genome Annotation.

    Science.gov (United States)

    Campbell, Michael S; Yandell, Mark

    2015-12-17

    Genome projects have evolved from large international undertakings to tractable endeavors for a single lab. Accurate genome annotation is critical for successful genomic, genetic, and molecular biology experiments. These annotations can be generated using a number of approaches and available software tools. This unit describes methods for genome annotation and a number of software tools commonly used in gene annotation.

  14. Human social genomics.

    Directory of Open Access Journals (Sweden)

    Steven W Cole

    2014-08-01

    Full Text Available A growing literature in human social genomics has begun to analyze how everyday life circumstances influence human gene expression. Social-environmental conditions such as urbanity, low socioeconomic status, social isolation, social threat, and low or unstable social status have been found to associate with differential expression of hundreds of gene transcripts in leukocytes and diseased tissues such as metastatic cancers. In leukocytes, diverse types of social adversity evoke a common conserved transcriptional response to adversity (CTRA) characterized by increased expression of proinflammatory genes and decreased expression of genes involved in innate antiviral responses and antibody synthesis. Mechanistic analyses have mapped the neural "social signal transduction" pathways that stimulate CTRA gene expression in response to social threat and may contribute to social gradients in health. Research has also begun to analyze the functional genomics of optimal health and thriving. Two emerging opportunities now stand to revolutionize our understanding of the everyday life of the human genome: network genomics analyses examining how systems-level capabilities emerge from groups of individual socially sensitive genomes and near-real-time transcriptional biofeedback to empirically optimize individual well-being in the context of the unique genetic, geographic, historical, developmental, and social contexts that jointly shape the transcriptional realization of our innate human genomic potential for thriving.

  15. How the genome folds

    Science.gov (United States)

    Lieberman Aiden, Erez

    2012-02-01

    I describe Hi-C, a novel technology for probing the three-dimensional architecture of whole genomes by coupling proximity-based ligation with massively parallel sequencing. Working with collaborators at the Broad Institute and UMass Medical School, we used Hi-C to construct spatial proximity maps of the human genome at a resolution of 1Mb. These maps confirm the presence of chromosome territories and the spatial proximity of small, gene-rich chromosomes. We identified an additional level of genome organization that is characterized by the spatial segregation of open and closed chromatin to form two genome-wide compartments. At the megabase scale, the chromatin conformation is consistent with a fractal globule, a knot-free conformation that enables maximally dense packing while preserving the ability to easily fold and unfold any genomic locus. The fractal globule is distinct from the more commonly used globular equilibrium model. Our results demonstrate the power of Hi-C to map the dynamic conformations of whole genomes.
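
    A spatial proximity map at 1 Mb resolution is, in essence, a count of ligation contacts between pairs of 1 Mb genomic bins. The sketch below shows only that binning step, assuming the contacts have already been reduced to (chromosome, position, chromosome, position) tuples. The input format and function name are hypothetical, and read alignment, filtering and normalization are not covered.

```python
from collections import Counter
from typing import Iterable, Tuple

Contact = Tuple[str, int, str, int]  # (chrom_a, pos_a, chrom_b, pos_b), hypothetical format
BIN_SIZE = 1_000_000                 # 1 Mb, the resolution quoted in the abstract


def bin_contacts(contacts: Iterable[Contact], bin_size: int = BIN_SIZE) -> Counter:
    """Count contacts between pairs of fixed-size genomic bins.

    Each key is an ordered pair of (chromosome, bin_index) tuples, so a
    contact and its mirror image are counted in the same cell.
    """
    counts: Counter = Counter()
    for chrom_a, pos_a, chrom_b, pos_b in contacts:
        a = (chrom_a, pos_a // bin_size)
        b = (chrom_b, pos_b // bin_size)
        key = (a, b) if a <= b else (b, a)
        counts[key] += 1
    return counts


if __name__ == "__main__":
    # Toy contacts for illustration only.
    toy = [
        ("chr1", 100, "chr1", 1_500_000),
        ("chr1", 1_900_000, "chr1", 200_000),
        ("chr2", 5, "chr1", 5),
    ]
    for (bin_a, bin_b), n in sorted(bin_contacts(toy).items()):
        print(bin_a, bin_b, n)
```

    A sparse counter is used here instead of a dense matrix because most bin pairs on different chromosomes receive few or no contacts.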

  16. An archaeal genomic signature

    Science.gov (United States)

    Graham, D. E.; Overbeek, R.; Olsen, G. J.; Woese, C. R.

    2000-01-01

    Comparisons of complete genome sequences allow the most objective and comprehensive descriptions possible of a lineage's evolution. This communication uses the completed genomes from four major euryarchaeal taxa to define a genomic signature for the Euryarchaeota and, by extension, the Archaea as a whole. The signature is defined in terms of the set of protein-encoding genes found in at least two diverse members of the euryarchaeal taxa that function uniquely within the Archaea; most signature proteins have no recognizable bacterial or eukaryal homologs. By this definition, 351 clusters of signature proteins have been identified. Functions of most proteins in this signature set are currently unknown. At least 70% of the clusters that contain proteins from all the euryarchaeal genomes also have crenarchaeal homologs. This conservative set, which appears refractory to horizontal gene transfer to the Bacteria or the Eukarya, would seem to reflect the significant innovations that were unique and fundamental to the archaeal "design fabric." Genomic protein signature analysis methods may be extended to characterize the evolution of any phylogenetically defined lineage. The complete set of protein clusters for the archaeal genomic signature is presented as supplementary material (see the PNAS web site, www.pnas.org).

  17. Genomics and personalized medicine.

    Science.gov (United States)

    Sadee, Wolfgang

    2011-08-30

    The role of genomics in personalized medicine continues to undergo profound changes, in step with dramatic technological advances. The ability to sequence the entire human genome with relative ease raises expectations that we can use an individual's complete genomic blueprint to understand disease risk and predict therapy outcomes, thereby optimizing drug therapy. Yet, doubts persist as to what extent genetic/genomic factors influence disease and treatment outcomes or whether robust predictive biomarker tests can be developed. Encompassing more than just DNA sequences, the definition of genomics now often is taken to include transcriptomics, proteomics, metabolomics, and epigenomics, with integration of genomic and environmental factors, in an area referred to as systems biology. While we can learn much about a cell's innermost workings, summation of these diverse areas is far from enabling the prediction of therapeutic outcomes. Typically, only a handful of specific biomarkers, genetic or otherwise, are 'actionable', i.e., they can be used to guide therapy. I will focus on pharmacogenetic biomarkers, highlighting current successes but also the main challenges that remain in optimizing individualized therapy. Copyright © 2011 Elsevier B.V. All rights reserved.

  18. GENOMEMASKER package for designing unique genomic PCR primers

    Directory of Open Access Journals (Sweden)

    Kaplinski Lauris

    2006-03-01

    Full Text Available Abstract Background: The design of oligonucleotides and PCR primers for studying large genomes is complicated by the redundancy of sequences. The eukaryotic genomes are particularly difficult to study due to abundant repeats. The speed of most existing primer evaluation programs is not sufficient for large-scale experiments. Results: In order to improve the efficiency and success rate of automatic primer/oligo design, we created a novel method which allows rapid masking of repeats in large sequence files, for example in eukaryotic genomes. It also allows the detection of all alternative binding sites of PCR primers and the prediction of PCR products. The new method was implemented in a collection of efficient programs, the GENOMEMASKER package. The performance of the programs was compared to other similar programs. We also modified the PRIMER3 program, to be able to design primers from lowercase-masked sequences. Conclusion: The GENOMEMASKER package is able to mask the entire human genome for non-unique primers within 6 hours and find locations of all binding sites for 10 000 designed primer pairs within 10 minutes. Additionally, it predicts all alternative PCR products from large genomes for given primer pairs.
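
    The abstract does not describe how GENOMEMASKER decides which positions to mask. Purely to illustrate what lowercase masking of non-unique sequence looks like, the sketch below lowercases every base covered by a k-mer that occurs more than once on the forward strand. The function name, the default `k` and the in-memory approach are assumptions, and this is not the package's algorithm or interface.

```python
from collections import Counter


def mask_repeated_kmers(seq: str, k: int = 16) -> str:
    """Lowercase every base covered by a k-mer that occurs more than once.

    Illustrative only: forward strand, exact matches, whole sequence in memory.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    seq = seq.upper()
    n_kmers = len(seq) - k + 1
    if n_kmers <= 0:
        return seq  # sequence shorter than k: nothing can repeat

    counts = Counter(seq[i:i + k] for i in range(n_kmers))
    masked = [False] * len(seq)
    for i in range(n_kmers):
        if counts[seq[i:i + k]] > 1:
            for j in range(i, i + k):
                masked[j] = True

    return "".join(base.lower() if m else base for base, m in zip(seq, masked))


if __name__ == "__main__":
    # Toy sequence: "ACG" and "TTT" each occur more than once when k = 3.
    print(mask_repeated_kmers("ACGTTTTTACGA", k=3))
```

    Lowercase masking keeps the sequence and its coordinates intact, so a primer design tool that understands the convention can simply avoid placing primers on lowercase bases.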

  19. Optimized Adaptor Polymerase Chain Reaction Method for Efficient Genomic Walking

    Institute of Scientific and Technical Information of China (English)

    Peng XU; Rui-Ying HU; Xiao-Yan DING

    2006-01-01

    Genomic walking is one of the most useful approaches in genome-related research. Three kinds of PCR-based methods are available for this purpose. However, none of them has been generally applied because they are either insensitive or inefficient. Here we present an efficient PCR protocol, an optimized adaptor PCR method for genomic walking. Using a combination of a touchdown PCR program and a special adaptor, the optimized adaptor PCR protocol achieves high sensitivity with low background noise. By applying this protocol, the insertion sites of a gene trap mouse line and two gene promoters from the incompletely sequenced Xenopus laevis genome were successfully identified with high efficiency. The general application of this protocol in genomic walking was promising.

  20. Cell death in genome evolution.

    Science.gov (United States)

    Teng, Xinchen; Hardwick, J Marie

    2015-03-01

    Inappropriate survival of abnormal cells underlies tumorigenesis. Most discoveries about programmed cell death have come from studying model organisms. Revisiting the experimental contexts that inspired these discoveries helps explain confounding biases that inevitably accompany such discoveries. Amending early biases has added a newcomer to the collection of cell death models. Analysis of gene-dependent death in yeast revealed the surprising influence of single gene mutations on subsequent eukaryotic genome evolution. Similar events may influence the selection for mutations during early tumorigenesis. The possibility that any early random mutation might drive the selection for a cancer driver mutation is conceivable but difficult to demonstrate. This was tested in yeast, revealing that mutation of almost any gene appears to specify the selection for a new second mutation. Some human tumors contain pairs of mutant genes homologous to co-occurring mutant genes in yeast. Here we consider how yeast again provide novel insights into tumorigenesis.

  1. Evolution of small prokaryotic genomes

    OpenAIRE

    Martínez-Cano, David J.; Reyes-Prieto, Mariana; Martínez-Romero, Esperanza; Partida-Martínez, Laila P.; Latorre, Amparo; Moya, Andrés; Delaye, Luis

    2015-01-01

    As revealed by genome sequencing, the biology of prokaryotes with reduced genomes is strikingly diverse. These include free-living prokaryotes with ∼800 genes as well as endosymbiotic bacteria with as few as ∼140 genes. Comparative genomics is revealing the evolutionary mechanisms that led to these small genomes. In the case of free-living prokaryotes, natural selection directly favored genome reduction, while in the case of endosymbiotic prokaryotes neutral processes played a more prominent ...

  2. Evolution of small prokaryotic genomes

    OpenAIRE

    David José Martínez-Cano; Mariana Reyes-Prieto; Esperanza Martinez-Romero; Laila Pamela Partida-Martinez; Amparo Latorre; Andres Moya; Luis Delaye

    2015-01-01

    As revealed by genome sequencing, the biology of prokaryotes with reduced genomes is strikingly diverse. These include free-living prokaryotes with ~800 genes as well as endosymbiotic bacteria with as few as ~140 genes. Comparative genomics is revealing the evolutionary mechanisms that led to these small genomes. In the case of free-living prokaryotes, natural selection directly favored genome reduction, while in the case of endosymbiotic prokaryotes neutral processes played a more prominent ...

  3. A comparison of virus genome sequences with their host silkworm, Bombyx mori.

    Science.gov (United States)

    Tang, Xu-Dong; Yue, Ya-Jie; Wang, Wei; Li, Nan; Shen, Zhong-Yuan

    2016-01-15

    With the recent availability of the genomes of many viruses and the silkworm, Bombyx mori, as well as a variety of Basic Local Alignment Search Tool (BLAST) programs, a new opportunity to gain insight into the interaction of viruses with the silkworm is possible. This study aims to determine the possible existence of sequence identities between the genomes of viruses and the silkworm and attempts to explain this phenomenon. BLAST searches of the genomes of viruses against the silkworm genome were performed using the resources of the National Center for Biotechnology Information. All studied viruses contained variable numbers of short regions with sequence identity to the genome of the silkworm. The short regions of sequence identity in the genome of the silkworm may be derived from the genomes of viruses in the long history of silkworm-virus interaction. This study is the first to compare these genomes, and may contribute to research on the interaction between viruses and the silkworm.

  4. New genomic resources for switchgrass: a BAC library and comparative analysis of homoeologous genomic regions harboring bioenergy traits

    Directory of Open Access Journals (Sweden)

    Feltus Frank A

    2011-07-01

    Background: Switchgrass, a C4 species and a warm-season grass native to the prairies of North America, has been targeted for development into an herbaceous biomass fuel crop. Genetic improvement of switchgrass feedstock traits through marker-assisted breeding and biotechnology approaches calls for genomic tools development. Establishment of integrated physical and genetic maps for switchgrass will accelerate mapping of value added traits useful to breeding programs and to isolate important target genes using map based cloning. The reported polyploidy series in switchgrass ranges from diploid (2X = 18) to duodecaploid (12X = 108). Like in other large, repeat-rich plant genomes, this genomic complexity will hinder whole genome sequencing efforts. An extensive physical map providing enough information to resolve the homoeologous genomes would provide the necessary framework for accurate assembly of the switchgrass genome. Results: A switchgrass BAC library constructed by partial digestion of nuclear DNA with EcoRI contains 147,456 clones covering the effective genome approximately 10 times based on a genome size of 3.2 Gigabases (~1.6 Gb effective). Restriction digestion and PFGE analysis of 234 randomly chosen BACs indicated that 95% of the clones contained inserts, ranging from 60 to 180 kb with an average of 120 kb. Comparative sequence analysis of two homoeologous genomic regions harboring orthologs of the rice OsBRI1 locus, a low-copy gene encoding a putative protein kinase and associated with biomass, revealed that orthologous clones from homoeologous chromosomes can be unambiguously distinguished from each other and correctly assembled to respective fingerprint contigs. Thus, the data obtained not only provide genomic resources for further analysis of switchgrass genome, but also improve efforts for an accurate genome sequencing strategy. Conclusions: The construction of the first switchgrass BAC library and comparative analysis of
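The coverage figure quoted in this abstract can be checked with a short calculation. The sketch below is not from the paper: it assumes that coverage is estimated as (clones with inserts × mean insert size) / effective genome size, and the function name is hypothetical. All input numbers are taken from the abstract.

```python
def library_coverage(n_clones, insert_fraction, mean_insert_bp, genome_bp):
    """Estimate genome equivalents held in a clone library.

    Assumed formula (not stated in the abstract):
    clones carrying inserts * mean insert size / genome size.
    """
    return n_clones * insert_fraction * mean_insert_bp / genome_bp


# Values reported in the abstract
coverage = library_coverage(
    n_clones=147_456,        # clones in the BAC library
    insert_fraction=0.95,    # 95% of sampled clones contained inserts
    mean_insert_bp=120_000,  # average insert size, 120 kb
    genome_bp=1.6e9,         # effective genome size, ~1.6 Gb
)
print(f"{coverage:.1f}x")
```

Under these assumptions the estimate is close to the "approximately 10 times" coverage stated in the abstract.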

  5. Genomic Prediction from Whole Genome Sequence in Livestock: The 1000 Bull Genomes Project

    DEFF Research Database (Denmark)

    Hayes, Benjamin J; MacLeod, Iona M; Daetwyler, Hans D

    Advantages of using whole genome sequence data to predict genomic estimated breeding values (GEBV) include better persistence of accuracy of GEBV across generations and more accurate GEBV across breeds. The 1000 Bull Genomes Project provides a database of whole genome sequenced key ancestor bulls...

  6. MicrobesOnline: an integrated portal for comparative functional genomics

    OpenAIRE

    Joachimiak, Marcin P.

    2014-01-01

    The Virtual Institute for Microbial Stress and Survival (VIMSS, http://vimss.lbl.gov/) funded by the Dept. of Energy's Genomics:GTL Program, is dedicated to using integrated environmental, functional genomic, and comparative sequence and phylogeny data to understand mechanisms by which microbes and microbial communities survive in uncertain environments while carrying out processes of interest for bioremediation and energy generation. To support this work, VIMSS has developed a Web portal, al...

  7. A biologist's guide to de novo genome assembly using next-generation sequence data: A test with fungal genomes.

    Science.gov (United States)

    Haridas, Sajeet; Breuill, Colette; Bohlmann, Joerg; Hsiang, Tom

    2011-09-01

    We offer a guide to de novo genome assembly using sequence data generated by the Illumina platform for biologists working with fungi or other organisms whose genomes are less than 100Mb in size. The guide requires no familiarity with sequencing assembly technology or associated computer programs. It defines commonly used terms in genome sequencing and assembly; provides examples of assembling short-read genome sequence data for four strains of the fungus Grosmannia clavigera using four assembly programs; gives examples of protocols and software; and presents a commented flowchart that extends from DNA preparation for submission to a sequencing center, through to processing and assembly of the raw sequence reads using freely available operating systems and software.

  8. Genome-wide discovery and verification of novel structured RNAs in Plasmodium falciparum

    DEFF Research Database (Denmark)

    Mourier, Tobias; Carret, Celine; Kyes, Sue;

    2008-01-01

    We undertook a genome-wide search for novel noncoding RNAs (ncRNA) in the malaria parasite Plasmodium falciparum. We used the RNAz program to predict structures in the noncoding regions of the P. falciparum 3D7 genome that were conserved with at least one of seven other Plasmodium spp. genome seq...

  9. Database of Periodic DNA Regions in Major Genomes

    Directory of Open Access Journals (Sweden)

    Felix E. Frenkel

    2017-01-01

    Summary. We analyzed several prokaryotic and eukaryotic genomes, searching for periodic sequences with a new mathematical method. The method uses random position weight matrices and dynamic programming. Insertions and deletions were allowed inside periodicities, which adds novelty to the results we obtained. The periodicity length, one of the key periodicity features, varied from 2 to 50 nt. In total, over 60,000 periodicity sequences were found in 15 genomes including some chromosomes of the H. sapiens (partial), C. elegans, D. melanogaster, and A. thaliana genomes.

  10. Database of Periodic DNA Regions in Major Genomes

    Science.gov (United States)

    2017-01-01

    Summary. We analyzed several prokaryotic and eukaryotic genomes, searching for periodic sequences with a new mathematical method. The method uses random position weight matrices and dynamic programming. Insertions and deletions were allowed inside periodicities, which adds novelty to the results we obtained. The periodicity length, one of the key periodicity features, varied from 2 to 50 nt. In total, over 60,000 periodicity sequences were found in 15 genomes including some chromosomes of the H. sapiens (partial), C. elegans, D. melanogaster, and A. thaliana genomes. PMID:28182099

  11. Complete genome sequence of Ferroglobus placidus AEDII12DO

    Energy Technology Data Exchange (ETDEWEB)

    Anderson, Iain [U.S. Department of Energy, Joint Genome Institute; Risso, Carla [University of Massachusetts, Amherst; Holmes, Dawn [University of Massachusetts, Amherst; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Copeland, A [U.S. Department of Energy, Joint Genome Institute; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute; Bruce, David [Los Alamos National Laboratory (LANL); Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Saunders, Elizabeth H [Los Alamos National Laboratory (LANL); Brettin, Thomas S [ORNL; Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute; Han, Cliff [Los Alamos National Laboratory (LANL); Tapia, Roxanne [Los Alamos National Laboratory (LANL); Larimer, Frank W [ORNL; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute; Lovley, Derek [University of Massachusetts, Amherst; Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Ivanova, N [U.S. Department of Energy, Joint Genome Institute

    2011-01-01

    Ferroglobus placidus belongs to the order Archaeoglobales within the archaeal phylum Euryarchaeota. Strain AEDII12DO is the type strain of the species and was isolated from a shallow marine hydrothermal system at Vulcano, Italy. It is a hyperthermophilic, anaerobic chemolithoautotroph, but it can also use a variety of aromatic compounds as electron donors. Here we describe the features of this organism together with the complete genome sequence and annotation. The 2,196,266 bp genome with its 2,567 protein-coding and 55 RNA genes was sequenced as part of a DOE Joint Genome Institute Laboratory Sequencing Program (LSP) project.

  12. Complete genome sequence of Serratia plymuthica strain AS12

    Energy Technology Data Exchange (ETDEWEB)

    Neupane, Saraswoti [Uppsala University, Uppsala, Sweden; Finlay, Roger D. [Uppsala University, Uppsala, Sweden; Alstrom, Sadhna [Uppsala University, Uppsala, Sweden; Goodwin, Lynne A. [Los Alamos National Laboratory (LANL); Kyrpides, Nikos C [U.S. Department of Energy, Joint Genome Institute; Lucas, Susan [U.S. Department of Energy, Joint Genome Institute; Lapidus, Alla L. [U.S. Department of Energy, Joint Genome Institute; Bruce, David [Los Alamos National Laboratory (LANL); Pitluck, Sam [U.S. Department of Energy, Joint Genome Institute; Peters, Lin [U.S. Department of Energy, Joint Genome Institute; Ovchinnikova, Galina [U.S. Department of Energy, Joint Genome Institute; Chertkov, Olga [Los Alamos National Laboratory (LANL); Han, James [U.S. Department of Energy, Joint Genome Institute; Han, Cliff [Los Alamos National Laboratory (LANL); Tapia, Roxanne [Los Alamos National Laboratory (LANL); Detter, J. Chris [U.S. Department of Energy, Joint Genome Institute; Land, Miriam L [ORNL; Hauser, Loren John [ORNL; Cheng, Jan-Fang [U.S. Department of Energy, Joint Genome Institute; Ivanova, N [U.S. Department of Energy, Joint Genome Institute; Pagani, Ioanna [U.S. Department of Energy, Joint Genome Institute; Klenk, Hans-Peter [DSMZ - German Collection of Microorganisms and Cell Cultures GmbH, Braunschweig, Germany; Woyke, Tanja [U.S. Department of Energy, Joint Genome Institute; Hogberg, Nils [Uppsala University, Uppsala, Sweden

    2012-01-01

    A plant associated member of the family Enterobacteriaceae, Serratia plymuthica strain AS12 was isolated from rapeseed roots. It is of scientific interest due to its plant growth promoting and plant pathogen inhibiting ability. The genome of S. plymuthica AS12 comprises a 5,443,009 bp long circular chromosome, which consists of 4,952 protein-coding genes, 87 tRNA genes and 7 rRNA operons. This genome was sequenced within the 2010 DOE-JGI Community Sequencing Program (CSP2010) as part of the project entitled 'Genomics of four rapeseed plant growth promoting bacteria with antagonistic effect on plant pathogens'.

  13. GapBlaster-A Graphical Gap Filler for Prokaryote Genomes.

    Science.gov (United States)

    de Sá, Pablo H C G; Miranda, Fábio; Veras, Adonney; de Melo, Diego Magalhães; Soares, Siomar; Pinheiro, Kenny; Guimarães, Luis; Azevedo, Vasco; Silva, Artur; Ramos, Rommel T J

    2016-01-01

    The advent of NGS (Next Generation Sequencing) technologies has resulted in an exponential increase in the number of complete genomes available in biological databases. This advance has allowed the development of several computational tools enabling analyses of large amounts of data in each of the various steps, from processing and quality filtering to gap filling and manual curation. The tools developed for gap closure are very useful as they result in more complete genomes, which will influence downstream analyses of genomic plasticity and comparative genomics. However, the gap filling step remains a challenge for genome assembly, often requiring manual intervention. Here, we present GapBlaster, a graphical application to evaluate and close gaps. GapBlaster was developed via Java programming language. The software uses contigs obtained in the assembly of the genome to perform an alignment against a draft of the genome/scaffold, using BLAST or Mummer to close gaps. Then, all identified alignments of contigs that extend through the gaps in the draft sequence are presented to the user for further evaluation via the GapBlaster graphical interface. GapBlaster presents significant results compared to other similar software and has the advantage of offering a graphical interface for manual curation of the gaps. GapBlaster program, the user guide and the test datasets are freely available at https://sourceforge.net/projects/gapblaster2015/. It requires Sun JDK 8 and Blast or Mummer.

  14. GapBlaster-A Graphical Gap Filler for Prokaryote Genomes.

    Directory of Open Access Journals (Sweden)

    Pablo H C G de Sá

    The advent of NGS (Next Generation Sequencing) technologies has resulted in an exponential increase in the number of complete genomes available in biological databases. This advance has allowed the development of several computational tools enabling analyses of large amounts of data in each of the various steps, from processing and quality filtering to gap filling and manual curation. The tools developed for gap closure are very useful as they result in more complete genomes, which will influence downstream analyses of genomic plasticity and comparative genomics. However, the gap filling step remains a challenge for genome assembly, often requiring manual intervention. Here, we present GapBlaster, a graphical application to evaluate and close gaps. GapBlaster was developed via Java programming language. The software uses contigs obtained in the assembly of the genome to perform an alignment against a draft of the genome/scaffold, using BLAST or Mummer to close gaps. Then, all identified alignments of contigs that extend through the gaps in the draft sequence are presented to the user for further evaluation via the GapBlaster graphical interface. GapBlaster presents significant results compared to other similar software and has the advantage of offering a graphical interface for manual curation of the gaps. GapBlaster program, the user guide and the test datasets are freely available at https://sourceforge.net/projects/gapblaster2015/. It requires Sun JDK 8 and Blast or Mummer.

  15. Annotation-based genome-wide SNP discovery in the large and complex Aegilops tauschii genome using next-generation sequencing without a reference genome sequence

    Directory of Open Access Journals (Sweden)

    Luo Ming-Cheng

    2011-01-01

    Background: Many plants have large and complex genomes with an abundance of repeated sequences. Many plants are also polyploid. Both of these attributes typify the genome architecture in the tribe Triticeae, whose members include economically important wheat, rye and barley. Large genome sizes, an abundance of repeated sequences, and polyploidy present challenges to genome-wide SNP discovery using next-generation sequencing (NGS) of total genomic DNA by making alignment and clustering of short reads generated by the NGS platforms difficult, particularly in the absence of a reference genome sequence. Results: An annotation-based, genome-wide SNP discovery pipeline is reported using NGS data for large and complex genomes without a reference genome sequence. Roche 454 shotgun reads with low genome coverage of one genotype are annotated in order to distinguish single-copy sequences and repeat junctions from repetitive sequences and sequences shared by paralogous genes. Multiple genome equivalents of shotgun reads of another genotype generated with SOLiD or Solexa are then mapped to the annotated Roche 454 reads to identify putative SNPs. A pipeline program package, AGSNP, was developed and used for genome-wide SNP discovery in Aegilops tauschii, the diploid source of the wheat D genome, which has a genome size of 4.02 Gb, of which 90% is repetitive sequences. Genomic DNA of Ae. tauschii accession AL8/78 was sequenced with the Roche 454 NGS platform. Genomic DNA and cDNA of Ae. tauschii accession AS75 was sequenced primarily with SOLiD, although some Solexa and Roche 454 genomic sequences were also generated. A total of 195,631 putative SNPs were discovered in gene sequences, 155,580 putative SNPs were discovered in uncharacterized single-copy regions, and another 145,907 putative SNPs were discovered in repeat junctions. These SNPs were dispersed across the entire Ae. tauschii genome. To assess the false positive SNP discovery rate, DNA

  16. Matching curated genome databases: a non trivial task

    Directory of Open Access Journals (Sweden)

    Labedan Bernard

    2008-10-01

    Background: Curated databases of completely sequenced genomes have been designed independently at the NCBI (RefSeq) and EBI (Genome Reviews) to cope with non-standard annotation found in the version of the sequenced genome that has been published by databanks GenBank/EMBL/DDBJ. These curation attempts were expected to review the annotations and to improve their pertinence when using them to annotate newly released genome sequences by homology to previously annotated genomes. However, we observed that such an uncoordinated effort has two unwanted consequences. First, it is not trivial to map the protein identifiers of the same sequence in both databases. Secondly, the two reannotated versions of the same genome differ at the level of their structural annotation. Results: Here, we propose CorBank, a program devised to provide cross-referencing protein identifiers no matter what the level of identity is found between their matching sequences. Approximately 98% of the 1,983,258 amino acid sequences are matching, allowing instantaneous retrieval of their respective cross-references. CorBank further allows detecting any differences between the independently curated versions of the same genome. We found that the RefSeq and Genome Reviews versions are perfectly matching for only 50 of the 641 complete genomes we have analyzed. In all other cases there are differences occurring at the level of the coding sequence (CDS), and/or in the total number of CDS in the respective version of the same genome. CorBank is freely accessible at http://www.corbank.u-psud.fr. The CorBank site contains also updated publication of the exhaustive results obtained by comparing RefSeq and Genome Reviews versions of each genome. Accordingly, this web site allows easy search of cross-references between RefSeq, Genome Reviews, and UniProt, for either a single CDS or a whole replicon. Conclusion: CorBank is very efficient in rapid detection of the numerous differences existing

  17. Genome Project Standards in a New Era of Sequencing

    Energy Technology Data Exchange (ETDEWEB)

    GSC Consortia; HMP Jumpstart Consortia; Chain, P. S. G.; Grafham, D. V.; Fulton, R. S.; FitzGerald, M. G.; Hostetler, J.; Muzny, D.; Detter, J. C.; Ali, J.; Birren, B.; Bruce, D. C.; Buhay, C.; Cole, J. R.; Ding, Y.; Dugan, S.; Field, D.; Garrity, G. M.; Gibbs, R.; Graves, T.; Han, C. S.; Harrison, S. H.; Highlander, S.; Hugenholtz, P.; Khouri, H. M.; Kodira, C. D.; Kolker, E.; Kyrpides, N. C.; Lang, D.; Lapidus, A.; Malfatti, S. A.; Markowitz, V.; Metha, T.; Nelson, K. E.; Parkhill, J.; Pitluck, S.; Qin, X.; Read, T. D.; Schmutz, J.; Sozhamannan, S.; Strausberg, R.; Sutton, G.; Thomson, N. R.; Tiedje, J. M.; Weinstock, G.; Wollam, A.

    2009-06-01

    For over a decade, genome sequences have adhered to only two standards that are relied on for purposes of sequence analysis by interested third parties (1, 2). However, ongoing developments in revolutionary sequencing technologies have resulted in a redefinition of traditional whole genome sequencing that requires a careful reevaluation of such standards. With commercially available 454 pyrosequencing (followed by Illumina, SOLiD, and now Helicos), there has been an explosion of genomes sequenced under the moniker 'draft', however these can be very poor quality genomes (due to inherent errors in the sequencing technologies, and the inability of assembly programs to fully address these errors). Further, one can only infer that such draft genomes may be of poor quality by navigating through the databases to find the number and type of reads deposited in sequence trace repositories (and not all genomes have this available), or to identify the number of contigs or genome fragments deposited to the database. The difficulty in assessing the quality of such deposited genomes has created some havoc for genome analysis pipelines and contributed to many wasted hours of (mis)interpretation. These same novel sequencing technologies have also brought an exponential leap in raw sequencing capability, and at greatly reduced prices that have further skewed the time- and cost-ratios of draft data generation versus the painstaking process of improving and finishing a genome. The resulting effect is an ever-widening gap between drafted and finished genomes that only promises to continue (Figure 1), hence there is an urgent need to distinguish good and poor datasets. The sequencing institutes in the authorship, along with the NIH's Human Microbiome Project Jumpstart Consortium (3), strongly believe that a new set of standards is required for genome sequences. The following represents a set of six community-defined categories of genome sequence standards that better

  18. Genomes to Proteomes

    Energy Technology Data Exchange (ETDEWEB)

    Panisko, Ellen A. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Grigoriev, Igor [USDOE Joint Genome Inst., Walnut Creek, CA (United States); Daly, Don S. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Webb-Robertson, Bobbie-Jo [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Baker, Scott E. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States)

    2009-03-01

    Biologists are awash with genomic sequence data. In large part, this is due to the rapid acceleration in the generation of DNA sequence that occurred as public and private research institutes raced to sequence the human genome. In parallel with the large human genome effort, mostly smaller genomes of other important model organisms were sequenced. Projects following on these initial efforts have made use of technological advances and the DNA sequencing infrastructure that was built for the human and other organism genome projects. As a result, the genome sequences of many organisms are available in high quality draft form. While in many ways this is good news, there are limitations to the biological insights that can be gleaned from DNA sequences alone; genome sequences offer only a bird's eye view of the biological processes endemic to an organism or community. Fortunately, the genome sequences now being produced at such a high rate can serve as the foundation for other global experimental platforms such as proteomics. Proteomic methods offer a snapshot of the proteins present at a point in time for a given biological sample. Current global proteomics methods combine enzymatic digestion, separations, mass spectrometry and database searching for peptide identification. One key aspect of proteomics is the prediction of peptide sequences from mass spectrometry data. Global proteomic analysis uses computational matching of experimental mass spectra with predicted spectra based on databases of gene models that are often generated computationally. Thus, the quality of gene models predicted from a genome sequence is crucial in the generation of high quality peptide identifications. Once peptides are identified they can be assigned to their parent protein. Proteins identified as expressed in a given experiment are most useful when compared to other expressed proteins in a larger biological context or biochemical pathway. In this chapter we will discuss the automatic

  19. Translational genomics for plant breeding with the genome sequence explosion.

    Science.gov (United States)

    Kang, Yang Jae; Lee, Taeyoung; Lee, Jayern; Shim, Sangrea; Jeong, Haneul; Satyawan, Dani; Kim, Moon Young; Lee, Suk-Ha

    2016-04-01

    The use of next-generation sequencers and advanced genotyping technologies has propelled the field of plant genomics in model crops and plants and enhanced the discovery of hidden bridges between genotypes and phenotypes. The newly generated reference sequences of unstudied minor plants can be annotated by the knowledge of model plants via translational genomics approaches. Here, we reviewed the strategies of translational genomics and suggested perspectives on the current databases of genomic resources and the database structures of translated information on the new genome. As a draft picture of phenotypic annotation, translational genomics on newly sequenced plants will provide valuable assistance for breeders and researchers who are interested in genetic studies.

  20. BLAST Ring Image Generator (BRIG): simple prokaryote genome comparisons

    Directory of Open Access Journals (Sweden)

    Beatson Scott A

    2011-08-01

    Background: Visualisation of genome comparisons is invaluable for helping to determine genotypic differences between closely related prokaryotes. New visualisation and abstraction methods are required in order to improve the validation, interpretation and communication of genome sequence information; especially with the increasing amount of data arising from next-generation sequencing projects. Visualising a prokaryote genome as a circular image has become a powerful means of displaying informative comparisons of one genome to a number of others. Several programs, imaging libraries and internet resources already exist for this purpose, however, most are either limited in the number of comparisons they can show, are unable to adequately utilise draft genome sequence data, or require a knowledge of command-line scripting for implementation. Currently, there is no freely available desktop application that enables users to rapidly visualise comparisons between hundreds of draft or complete genomes in a single image. Results: BLAST Ring Image Generator (BRIG) can generate images that show multiple prokaryote genome comparisons, without an arbitrary limit on the number of genomes compared. The output image shows similarity between a central reference sequence and other sequences as a set of concentric rings, where BLAST matches are coloured on a sliding scale indicating a defined percentage identity. Images can also include draft genome assembly information to show read coverage, assembly breakpoints and collapsed repeats. In addition, BRIG supports the mapping of unassembled sequencing reads against one or more central reference sequences. Many types of custom data and annotations can be shown using BRIG, making it a versatile approach for visualising a range of genomic comparison data. BRIG is readily accessible to any user, as it assumes no specialist computational knowledge and will perform all required file parsing and BLAST comparisons

  1. Exploration of plant genomes in the FLAGdb++ environment

    Directory of Open Access Journals (Sweden)

    Leplé Jean-Charles

    2011-03-01

    Background: In the contexts of genomics, post-genomics and systems biology approaches, data integration presents a major concern. Databases provide crucial solutions: they store, organize and allow information to be queried, they enhance the visibility of newly produced data by comparing them with previously published results, and facilitate the exploration and development of both existing hypotheses and new ideas. Results: The FLAGdb++ information system was developed with the aim of using whole plant genomes as physical references in order to gather and merge available genomic data from in silico or experimental approaches. Available through a JAVA application, original interfaces and tools assist the functional study of plant genes by considering them in their specific context: chromosome, gene family, orthology group, co-expression cluster and functional network. FLAGdb++ is mainly dedicated to the exploration of large gene groups in order to decipher functional connections, to highlight shared or specific structural or functional features, and to facilitate translational tasks between plant species (Arabidopsis thaliana, Oryza sativa, Populus trichocarpa and Vitis vinifera). Conclusion: Combining original data with the output of experts and graphical displays that differ from classical plant genome browsers, FLAGdb++ presents a powerful complementary tool for exploring plant genomes and exploiting structural and functional resources, without the need for computer programming knowledge. First launched in 2002, a 15th version of FLAGdb++ is now available and comprises four model plant genomes and over eight million genomic features.

  2. Genomic homeology between Pennisetum purpureum and Pennisetum glaucum (Poaceae)

    Directory of Open Access Journals (Sweden)

    Gabriela Barreto dos Reis

    2014-08-01

    The genus Pennisetum (Richard, 1805) includes two economically important tropical forage plants: Pennisetum purpureum (Schumacher, 1827) (elephant grass), with 2n = 4x = 28 chromosomes and genomes A'A'BB, and Pennisetum glaucum (Linnaeus, 1753) (pearl millet), with 2n = 2x = 14 chromosomes and genomes AA. The genetic proximity between them allows obtaining hybrids (2n = 3x = 21) that yield forage of higher quality than the parents. The study of genomic relationships supports knowledge of phylogenetic relations and evolution, and is useful in breeding programs seeking gene introgression. Concerning elephant grass and pearl millet, the homeology between the genomes A and A', and between these and the genome B, has been reported by conventional cytogenetic techniques. The objective of the present study was to demonstrate the degree of homeology between these genomes by means of genomic in situ hybridization (GISH). The results confirmed the homeology between the genomes A of pearl millet and A'B of elephant grass, and showed that there are differences in the distribution and proportion of homologous regions after hybridization. Discussion regarding the evolutionary origin of P. purpureum and P. glaucum was also included.
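The chromosome numbers in this abstract are mutually consistent with a base number of x = 7. The sketch below works through that arithmetic. It assumes, as an illustration not stated in the abstract, that each parent contributes a reduced gamete carrying half its somatic chromosome number; the function and variable names are hypothetical.

```python
BASE_NUMBER = 7  # x, inferred from 2n = 2x = 14 for pearl millet


def somatic_count(ploidy, x=BASE_NUMBER):
    """Somatic (2n) chromosome number for a given ploidy level."""
    return ploidy * x


elephant_grass = somatic_count(4)  # tetraploid, 2n = 4x
pearl_millet = somatic_count(2)    # diploid, 2n = 2x

# Assumption: each parent contributes a reduced gamete (half of 2n).
hybrid = elephant_grass // 2 + pearl_millet // 2

print(elephant_grass, pearl_millet, hybrid)
```

With these inputs the hybrid count equals 3x, matching the triploid hybrids (2n = 3x = 21) described in the abstract.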

  3. Recent updates and developments to plant genome size databases

    Science.gov (United States)

    Garcia, Sònia; Leitch, Ilia J.; Anadon-Rosell, Alba; Canela, Miguel Á.; Gálvez, Francisco; Garnatje, Teresa; Gras, Airy; Hidalgo, Oriane; Johnston, Emmeline; Mas de Xaxars, Gemma; Pellicer, Jaume; Siljak-Yakovlev, Sonja; Vallès, Joan; Vitales, Daniel; Bennett, Michael D.

    2014-01-01

    Two plant genome size databases have been recently updated and/or extended: the Plant DNA C-values database (http://data.kew.org/cvalues), and GSAD, the Genome Size in Asteraceae database (http://www.asteraceaegenomesize.com). While the first provides information on nuclear DNA contents across land plants and some algal groups, the second is focused on one of the largest and most economically important angiosperm families, Asteraceae. Genome size data have numerous applications: they can be used in comparative studies on genome evolution, or as a tool to appraise the cost of whole-genome sequencing programs. The growing interest in genome size and increasing rate of data accumulation has necessitated the continued update of these databases. Currently, the Plant DNA C-values database (Release 6.0, Dec. 2012) contains data for 8510 species, while GSAD has 1219 species (Release 2.0, June 2013), representing increases of 17 and 51%, respectively, in the number of species with genome size data, compared with previous releases. Here we provide overviews of the most recent releases of each database, and outline new features of GSAD. The latter include (i) a tool to visually compare genome size data between species, (ii) the option to export data and (iii) a webpage containing information about flow cytometry protocols. PMID:24288377

  4. Translational Genomics in Low- and Middle-Income Countries: Opportunities and Challenges.

    Science.gov (United States)

    Tekola-Ayele, Fasil; Rotimi, Charles N

    2015-01-01

    Translation of genomic discoveries into patient care is slowly becoming a reality in developed economies around the world. In contrast, low- and middle-income countries (LMIC) have participated minimally in genomic research for several reasons including the lack of coherent national policies, the limited number of well-trained genomic scientists, poor research infrastructure, and local economic and cultural challenges. Recent initiatives such as the Human Heredity and Health in Africa (H3Africa), the Qatar Genome Project, and the Mexico National Institute of Genomic Medicine (INMEGEN) that aim to address these problems through capacity building and empowerment of local researchers have sparked a paradigm shift. In this short communication, we describe experiences of small-scale medical genetics and translational genomic research programs in LMIC. The lessons drawn from these programs drive home the importance of addressing resource, policy, and sociocultural dynamics to realize the promise of precision medicine driven by genomic science globally. By echoing lessons from a bench-to-community translational genomic research, we advocate that large-scale genomic research projects can be successfully linked with health care programs. To harness the benefits of genomics-led health care, LMIC governments should begin to develop national genomics policies that will address human and technology capacity development within the context of their national economic and sociocultural uniqueness. These policies should encourage international collaboration and promote the link between the public health program and genomics researchers. Finally, we highlight the potential catalytic roles of the global community to foster translational genomics in LMIC.

  5. Domestication and plant genomes.

    Science.gov (United States)

    Tang, Haibao; Sezen, Uzay; Paterson, Andrew H

    2010-04-01

    The techniques of plant improvement have been evolving with the advancement of technology, progressing from crop domestication by Neolithic humans to scientific plant breeding, and now including DNA-based genotyping and genetic engineering. Archeological findings have shown that early human ancestors often unintentionally selected for and finally fixed a few major domestication traits over time. Recent advancement of molecular and genomic tools has enabled scientists to pinpoint changes to specific chromosomal regions and genetic loci that are responsible for dramatic morphological and other transitions that distinguish crops from their wild progenitors. Extensive studies in a multitude of additional crop species, facilitated by rapid progress in sequencing and resequencing(s) of crop genomes, will further our understanding of the genomic impact from both the unusual population history of cultivated plants and millennia of human selection.

  6. Genomics of Preterm Birth

    Science.gov (United States)

    Swaggart, Kayleigh A.; Pavlicev, Mihaela; Muglia, Louis J.

    2015-01-01

    The molecular mechanisms controlling human birth timing at term, or resulting in preterm birth, have been the focus of considerable investigation, but limited insights have been gained over the past 50 years. In part, these processes have remained elusive because of divergence in reproductive strategies and physiology shown by model organisms, making extrapolation to humans uncertain. Here, we summarize the evolution of progesterone signaling and variation in pregnancy maintenance and termination. We use this comparative physiology to support the hypothesis that selective pressure on genomic loci involved in the timing of parturition has shaped human birth timing, and that these loci can be identified with comparative genomic strategies. Previous limitations imposed by divergence of mechanisms provide an important new opportunity to elucidate fundamental pathways of parturition control through increasing availability of sequenced genomes and associated reproductive physiology characteristics across diverse organisms. PMID:25646385

  7. Genomic Imprinting in Mammals

    Science.gov (United States)

    Barlow, Denise P.

    2014-01-01

    Genomic imprinting affects a subset of genes in mammals and results in a monoallelic, parental-specific expression pattern. Most of these genes are located in clusters that are regulated through the use of insulators or long noncoding RNAs (lncRNAs). To distinguish the parental alleles, imprinted genes are epigenetically marked in gametes at imprinting control elements through the use of DNA methylation at the very least. Imprinted gene expression is subsequently conferred through lncRNAs, histone modifications, insulators, and higher-order chromatin structure. Such imprints are maintained after fertilization through these mechanisms despite extensive reprogramming of the mammalian genome. Genomic imprinting is an excellent model for understanding mammalian epigenetic regulation. PMID:24492710

  8. Genomic dairy cattle breeding

    DEFF Research Database (Denmark)

    Mark, Thomas; Sandøe, Peter

    2010-01-01

    The aim of this paper is to discuss the potential consequences of modern dairy cattle breeding for the welfare of dairy cows. The paper focuses on so-called genomic selection, which deploys thousands of genetic markers to estimate breeding values. The discussion should help to structure...... the thoughts of breeders and other stakeholders on how to best make use of genomic breeding in the future. Intensive breeding has played a major role in securing dramatic increases in milk yield since the Second World War. Until recently, the main focus in dairy cattle breeding was on production traits......, unfavourable genetic trends for metabolic, reproductive, claw and leg diseases indicate that these attempts have been insufficient. Today, novel genome-wide sequencing techniques are revolutionising dairy cattle breeding; these enable genetic changes to occur at least twice as rapidly as previously. While...

  9. Genomic dairy cattle breeding

    DEFF Research Database (Denmark)

    Mark, Thomas; Sandøe, Peter

    2010-01-01

    The aim of this paper is to discuss the potential consequences of modern dairy cattle breeding for the welfare of dairy cows. The paper focuses on so-called genomic selection, which deploys thousands of genetic markers to estimate breeding values. The discussion should help to structure...... the thoughts of breeders and other stakeholders on how to best make use of genomic breeding in the future. Intensive breeding has played a major role in securing dramatic increases in milk yield since the Second World War. Until recently, the main focus in dairy cattle breeding was on production traits......, unfavourable genetic trends for metabolic, reproductive, claw and leg diseases indicate that these attempts have been insufficient. Today, novel genome-wide sequencing techniques are revolutionising dairy cattle breeding; these enable genetic changes to occur at least twice as rapidly as previously. While...

  10. Genomics and drug discovery.

    Science.gov (United States)

    Haseltine, W A

    2001-09-01

    Genomics, the systematic study of all the genes of an organism, offers a new and much-needed source of systematic productivity for the pharmaceutical industry. The isolation of the majority of human genes in their most useful form is leading to the creation of new drugs based on human proteins, antibodies, peptides, and genes. Human Genome Sciences, Inc, was the first company to use the systematic, genomics approach to discovering drugs, and we have placed 4 of these in clinical trials. Two are described: repifermin (keratinocyte growth factor-2, KGF-2) for wound healing and treatment of mucositis caused by cancer therapy, and B lymphocyte stimulator (BLyS) for stimulation of the immune system. An anti-BLyS antibody drug is in advanced preclinical development for treatment of autoimmune diseases.

  11. Genomics of Salmonella Species

    Science.gov (United States)

    Canals, Rocio; McClelland, Michael; Santiviago, Carlos A.; Andrews-Polymenis, Helene

    Progress in the study of Salmonella survival, colonization, and virulence has increased rapidly with the advent of complete genome sequencing and higher capacity assays for transcriptomic and proteomic analysis. Although many of these techniques have yet to be used to directly assay Salmonella growth on foods, these assays are currently in use to determine Salmonella factors necessary for growth in animal models including livestock animals and in in vitro conditions that mimic many different environments. As sequencing of the Salmonella genome and microarray analysis have revolutionized genomics and transcriptomics of salmonellae over the last decade, so are new high-throughput sequencing technologies currently accelerating the pace of our studies and allowing us to approach complex problems that were not previously experimentally tractable.

  12. Ebolavirus comparative genomics

    Science.gov (United States)

    Jun, Se-Ran; Leuze, Michael R.; Nookaew, Intawat; Uberbacher, Edward C.; Land, Miriam; Zhang, Qian; Wanchai, Visanu; Chai, Juanjuan; Nielsen, Morten; Trolle, Thomas; Lund, Ole; Buzard, Gregory S.; Pedersen, Thomas D.; Wassenaar, Trudy M.; Ussery, David W.

    2015-01-01

    The 2014 Ebola outbreak in West Africa is the largest documented for this virus. To examine the dynamics of this genome, we compare more than 100 currently available ebolavirus genomes to each other and to other viral genomes. Based on oligomer frequency analysis, the family Filoviridae forms a distinct group from all other sequenced viral genomes. All filovirus genomes sequenced to date encode proteins with similar functions and gene order, although there is considerable divergence in sequences between the three genera Ebolavirus, Cuevavirus and Marburgvirus within the family Filoviridae. Whereas all ebolavirus genomes are quite similar (multiple sequences of the same strain are often identical), variation is most common in the intergenic regions and within specific areas of the genes encoding the glycoprotein (GP), nucleoprotein (NP) and polymerase (L). We predict regions that could contain epitope-binding sites, which might be good vaccine targets. This information, combined with glycosylation sites and experimentally determined epitopes, can identify the most promising regions for the development of therapeutic strategies. PMID:26175035
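
The oligomer frequency analysis mentioned in this record can be illustrated with a minimal sketch. It is not the authors' pipeline: the oligomer length `k`, the Euclidean distance, and the toy sequences are all assumptions chosen for illustration, and genomes are assumed to be written in the DNA alphabet as they appear in sequence databases.

```python
from collections import Counter
from itertools import product

def kmer_frequencies(seq, k=4):
    """Relative frequency of every A/C/G/T k-mer over overlapping windows.
    Windows containing other symbols (for example N) are ignored."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    total = sum(counts[m] for m in kmers)
    return {m: (counts[m] / total if total else 0.0) for m in kmers}

def euclidean_distance(freq_a, freq_b):
    """Distance between two frequency profiles built with the same k."""
    return sum((freq_a[m] - freq_b[m]) ** 2 for m in freq_a) ** 0.5

# Toy sequences, invented for illustration only.
profile_a = kmer_frequencies("ACGTACGTACGTTTGA", k=2)
profile_b = kmer_frequencies("ACGTACGTACGATTGA", k=2)
profile_c = kmer_frequencies("GGGGCCCCGGGGCCCC", k=2)

print(euclidean_distance(profile_a, profile_b))
print(euclidean_distance(profile_a, profile_c))
```

Profiles like these can be fed to any clustering method; genomes with similar oligomer usage end up close together, which is the kind of grouping the abstract reports for Filoviridae.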

  13. Precision genome editing

    DEFF Research Database (Denmark)

    Steentoft, Catharina; Bennett, Eric P; Schjoldager, Katrine Ter-Borch Gram

    2014-01-01

    of glycobiology, primarily due to their low efficiencies, with resultant failure to impose substantial phenotypic consequences upon the final glycosylation products. Here, we review novel nuclease-based precision genome editing techniques enabling efficient and stable gene editing, including gene disruption...... by introducing single or double-stranded breaks at a defined genomic sequence. We here compare and contrast the different techniques and summarize their current applications, highlighting cases from the field of glycobiology as well as pointing to future opportunities. The emerging potential of precision gene...

  14. Methanococcus jannaschii genome: revisited

    Science.gov (United States)

    Kyrpides, N. C.; Olsen, G. J.; Klenk, H. P.; White, O.; Woese, C. R.

    1996-01-01

    Analysis of genomic sequences is necessarily an ongoing process. Initial gene assignments tend (wisely) to be on the conservative side (Venter, 1996). The analysis of the genome then grows in an iterative fashion as additional data and more sophisticated algorithms are brought to bear on the data. The present report is an emendation of the original gene list of Methanococcus jannaschii (Bult et al., 1996). By using a somewhat more updated database and more relaxed (and operator-intensive) pattern matching methods, we were able to add significantly to, and in a few cases amend, the gene identification table originally published by Bult et al. (1996).

  15. The genome editing revolution

    DEFF Research Database (Denmark)

    Stella, Stefano; Montoya, Guillermo

    2016-01-01

    In the last 10 years, we have witnessed a blooming of targeted genome editing systems and applications. The area was revolutionized by the discovery and characterization of the transcription activator-like effector proteins, which are easier to engineer to target new DNA sequences than the previo...

  16. Comparative Genomics Reveals High Genomic Diversity in the Genus Photobacterium

    DEFF Research Database (Denmark)

    Machado, Henrique; Gram, Lone

    2017-01-01

    Vibrionaceae is a large marine bacterial family, which can constitute up to 50% of the prokaryotic population in marine waters. Photobacterium is the second largest genus in the family and we used comparative genomics on 35 strains representing 16 of the 28 species described so far, to understand...... the genomic diversity present in the Photobacterium genus. Such understanding is important for ecophysiology studies of the genus. We used whole genome sequences to evaluate phylogenetic relationships using several analyses (16S rRNA, MLSA, fur, amino-acid usage, ANI), which allowed us to identify two...... misidentified strains. Genome analyses also revealed occurrence of higher and lower GC content clades, correlating with phylogenetic clusters. Pan-and core-genome analysis revealed the conservation of 25% of the genome throughout the genus, with a large and open pan-genome. The major source of genomic diversity...

  17. Playing with heart and soul…and genomes: sports implications and applications of personal genomics

    Directory of Open Access Journals (Sweden)

    Jennifer K. Wagner

    2013-08-01

    Full Text Available Whether the integration of genetic/omic technologies in sports contexts will facilitate player success, promote player safety, or spur genetic discrimination depends largely upon the game rules established by those currently designing genomic sports medicine programs. The integration has already begun, but there is not yet a playbook for best practices. Thus far discussions have focused largely on whether the integration would occur and how to prevent the integration from occurring, rather than how it could occur in such a way that maximizes benefits, minimizes risks, and avoids the exacerbation of racial disparities. Previous empirical research has identified members of the personal genomics industry offering sports-related DNA tests, and previous legal research has explored the impact of collective bargaining in professional sports as it relates to the employment protections of the Genetic Information Nondiscrimination Act (GINA). Building upon that research and upon participant observations with specific sports-related DNA tests purchased from four direct-to-consumer companies in 2011 and broader personal genomics (PGx) services, this anthropological, legal, and ethical (ALE) discussion highlights fundamental issues that must be addressed by those developing personal genomic sports medicine programs, either independently or through collaborations with commercial providers. For example, the vulnerability of student-athletes creates a number of issues that require careful, deliberate consideration. More broadly, however, this ALE discussion highlights potential sports-related implications (that ultimately might mitigate or, conversely, exacerbate racial disparities among athletes) of whole exome/genome sequencing conducted by biomedical researchers and clinicians for non-sports purposes. For example, the possibility that exome/genome sequencing of individuals who are considered to be non-patients, asymptomatic, normal, etc. will reveal the presence

  18. Playing with heart and soul…and genomes: sports implications and applications of personal genomics.

    Science.gov (United States)

    Wagner, Jennifer K

    2013-01-01

    Whether the integration of genetic/omic technologies in sports contexts will facilitate player success, promote player safety, or spur genetic discrimination depends largely upon the game rules established by those currently designing genomic sports medicine programs. The integration has already begun, but there is not yet a playbook for best practices. Thus far discussions have focused largely on whether the integration would occur and how to prevent the integration from occurring, rather than how it could occur in such a way that maximizes benefits, minimizes risks, and avoids the exacerbation of racial disparities. Previous empirical research has identified members of the personal genomics industry offering sports-related DNA tests, and previous legal research has explored the impact of collective bargaining in professional sports as it relates to the employment protections of the Genetic Information Nondiscrimination Act (GINA). Building upon that research and upon participant observations with specific sports-related DNA tests purchased from four direct-to-consumer companies in 2011 and broader personal genomics (PGx) services, this anthropological, legal, and ethical (ALE) discussion highlights fundamental issues that must be addressed by those developing personal genomic sports medicine programs, either independently or through collaborations with commercial providers. For example, the vulnerability of student-athletes creates a number of issues that require careful, deliberate consideration. More broadly, however, this ALE discussion highlights potential sports-related implications (that ultimately might mitigate or, conversely, exacerbate racial disparities among athletes) of whole exome/genome sequencing conducted by biomedical researchers and clinicians for non-sports purposes. For example, the possibility that exome/genome sequencing of individuals who are considered to be non-patients, asymptomatic, normal, etc. will reveal the presence of variants of

  19. The Cassava Genome: Current Progress, Future Directions.

    Science.gov (United States)

    Prochnik, Simon; Marri, Pradeep Reddy; Desany, Brian; Rabinowicz, Pablo D; Kodira, Chinnappa; Mohiuddin, Mohammed; Rodriguez, Fausto; Fauquet, Claude; Tohme, Joseph; Harkins, Timothy; Rokhsar, Daniel S; Rounsley, Steve

    2012-03-01

    The starchy swollen roots of cassava provide an essential food source for nearly a billion people, as well as possibilities for bioenergy, yet improvements to nutritional content and resistance to threatening diseases are currently impeded. A 454-based whole genome shotgun sequence has been assembled, which covers 69% of the predicted genome size and 96% of protein-coding gene space, with genome finishing underway. The predicted 30,666 genes and 3,485 alternate splice forms are supported by 1.4 M expressed sequence tags (ESTs). Maps based on simple sequence repeat (SSR)-, and EST-derived single nucleotide polymorphisms (SNPs) already exist. Thanks to the genome sequence, a high-density linkage map is currently being developed from a cross between two diverse cassava cultivars: one susceptible to cassava brown streak disease; the other resistant. An efficient genotyping-by-sequencing (GBS) approach is being developed to catalog SNPs both within the mapping population and among diverse African farmer-preferred varieties of cassava. These resources will accelerate marker-assisted breeding programs, allowing improvements in disease-resistance and nutrition, and will help us understand the genetic basis for disease resistance.

  20. Comparative Genomics Reveals High Genomic Diversity in the Genus Photobacterium.

    Science.gov (United States)

    Machado, Henrique; Gram, Lone

    2017-01-01

    Vibrionaceae is a large marine bacterial family, which can constitute up to 50% of the prokaryotic population in marine waters. Photobacterium is the second largest genus in the family and we used comparative genomics on 35 strains representing 16 of the 28 species described so far, to understand the genomic diversity present in the Photobacterium genus. Such understanding is important for ecophysiology studies of the genus. We used whole genome sequences to evaluate phylogenetic relationships using several analyses (16S rRNA, MLSA, fur, amino-acid usage, ANI), which allowed us to identify two misidentified strains. Genome analyses also revealed occurrence of higher and lower GC content clades, correlating with phylogenetic clusters. Pan- and core-genome analysis revealed the conservation of 25% of the genome throughout the genus, with a large and open pan-genome. The major source of genomic diversity could be traced to the smaller chromosome and plasmids. Several of the physiological traits studied in the genus did not correlate with phylogenetic data. Since horizontal gene transfer (HGT) is often suggested as a source of genetic diversity and a potential driver of genomic evolution in bacterial species, we looked into evidence of such in Photobacterium genomes. Genomic islands were the source of genomic differences between strains of the same species. Also, we found transposase genes and CRISPR arrays that suggest multiple encounters with foreign DNA. Presence of genomic exchange traits was widespread and abundant in the genus, suggesting a role in genomic evolution. The high genetic variability and indications of genetic exchange make it difficult to elucidate genome evolutionary paths and raise the awareness of the roles of foreign DNA in the genomic evolution of environmental organisms.
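
The pan- and core-genome idea in this record reduces to set operations once each strain is described by the gene families it carries. The sketch below is a simplified illustration, not the authors' method: the strain names and family identifiers are invented, and real analyses first have to cluster genes into families, which is not shown here.

```python
def pan_and_core(families_by_strain):
    """Return (pan, core): the union and the intersection of the
    gene-family sets of all strains."""
    sets = [set(families) for families in families_by_strain.values()]
    pan = set().union(*sets)
    core = set.intersection(*sets) if sets else set()
    return pan, core

# Hypothetical strains and gene families, for illustration only.
toy_strains = {
    "strain_A": {"fam1", "fam2", "fam3", "fam4"},
    "strain_B": {"fam1", "fam2", "fam5"},
    "strain_C": {"fam1", "fam2", "fam3", "fam6"},
}

pan, core = pan_and_core(toy_strains)
print(len(pan), len(core))
print(len(core) / len(pan))  # share of the pan-genome that is core
```

A pan-genome is usually called open when the union keeps growing as strains are added. How the 25% conservation figure in the abstract was normalised is not stated there, so the ratio printed above is only one possible way to express it.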

  1. Exploring functional elements and genomic variation in the noncoding genome

    NARCIS (Netherlands)

    van Heesch, S.A.A.C.

    2014-01-01

    Gene expression regulation is a delicate process that depends on multiple aspects including genome structure and transcription factor binding to DNA elements. The majority of our genome consists of noncoding DNA, which was shown to be crucial in providing the correct context for genome function. Alt

  2. Improving pan-genome annotation using whole genome multiple alignment

    Directory of Open Access Journals (Sweden)

    Salzberg Steven L

    2011-06-01

    Full Text Available Abstract Background Rapid annotation and comparisons of genomes from multiple isolates (pan-genomes) is becoming commonplace due to advances in sequencing technology. Genome annotations can contain inconsistencies and errors that hinder comparative analysis even within a single species. Tools are needed to compare and improve annotation quality across sets of closely related genomes. Results We introduce a new tool, Mugsy-Annotator, that identifies orthologs and evaluates annotation quality in prokaryotic genomes using whole genome multiple alignment. Mugsy-Annotator identifies anomalies in annotated gene structures, including inconsistently located translation initiation sites and disrupted genes due to draft genome sequencing or pseudogenes. An evaluation of species pan-genomes using the tool indicates that such anomalies are common, especially at translation initiation sites. Mugsy-Annotator reports alternate annotations that improve consistency and are candidates for further review. Conclusions Whole genome multiple alignment can be used to efficiently identify orthologs and annotation problem areas in a bacterial pan-genome. Comparisons of annotated gene structures within a species may show more variation than is actually present in the genome, indicating errors in genome annotation. Our new tool Mugsy-Annotator assists re-annotation efforts by highlighting edits that improve annotation consistency.
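
One of the anomalies named in this record, inconsistently located translation initiation sites, can be illustrated once orthologous genes are placed in a shared alignment coordinate system. The sketch below is a hypothetical simplification and not Mugsy-Annotator's algorithm: it assumes the alignment column of each annotated start codon is already known and simply flags genomes that disagree with the most common column.

```python
from collections import Counter

def flag_inconsistent_starts(start_columns):
    """start_columns maps genome name -> alignment column of the annotated
    start codon for one ortholog group. Returns the most common column and
    the sorted list of genomes whose annotation differs from it."""
    counts = Counter(start_columns.values())
    consensus, _ = counts.most_common(1)[0]
    outliers = sorted(g for g, col in start_columns.items() if col != consensus)
    return consensus, outliers

# Hypothetical ortholog group: four genomes, one with a shifted start.
group = {"isolate_1": 120, "isolate_2": 120, "isolate_3": 153, "isolate_4": 120}
consensus, outliers = flag_inconsistent_starts(group)
print(consensus, outliers)
```

Flagged genomes are only candidates for review. A differing start may reflect real biology rather than an annotation error, which is why the abstract describes the reported alternatives as candidates for further review.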

  3. Exploring functional elements and genomic variation in the noncoding genome

    NARCIS (Netherlands)

    van Heesch, S.A.A.C.

    2014-01-01

    Gene expression regulation is a delicate process that depends on multiple aspects including genome structure and transcription factor binding to DNA elements. The majority of our genome consists of noncoding DNA, which was shown to be crucial in providing the correct context for genome function. Alt

  4. Genome-wide profiling of structural genomic variations in Korean HapMap individuals.

    Directory of Open Access Journals (Sweden)

    Joon Seol Bae

    Full Text Available BACKGROUND: Studies of structural genomic variation, together with advances in microarray technology, have provided many genomic resources on the architecture of the human genome and have shown that human genome structure is considerably more complicated than previously thought. METHODOLOGY/PRINCIPAL FINDINGS: In the International HapMap Project, Epstein-Barr virus immortalized cell lines were used in preference to blood in order to obtain a larger amount of genomic DNA. However, genomic aberrations stemming from the immortalization process, biased representation of the donor tissue, and the culture process may influence the accuracy of SNP genotypes. In order to identify chromosome aberrations including loss of heterozygosity (LOH) and large-scale and small-scale copy number variations, we used the Illumina HumanHap500 BeadChip (555,352 markers) on Korean HapMap individuals (n = 90) to obtain Log R ratio and B allele frequency information, and then analyzed the data with various programs including Illumina ChromoZone, cnvPartition and PennCNV. As a result, we identified 28 LOHs (>3 Mb) and 35 large-scale CNVs (>1 Mb), with 4 samples having a completely duplicated chromosome. In addition, after checking the sample quality (standard deviation of log R ratio <0.30), we selected 79 samples and used both signal intensity and B allele frequency simultaneously for identification of small-scale CNVs (<1 Mb), discovering 4,989 small-scale CNVs. CNVs identified in this study were successfully validated using visual examination of the genoplot images, overlap analysis with previously reported CNVs in DGV, and quantitative PCR. CONCLUSION/SIGNIFICANCE: In this study, we describe the chromosome aberrations identified in Korean HapMap individuals, and expect that these findings will provide more meaningful information on the human genome.
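
The sample quality check and the size classes described in this record can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the sample standard deviation is used because the abstract does not say which estimator was applied, and a CNV of exactly 1 Mb is placed in the small-scale class because the abstract leaves that boundary undefined.

```python
from statistics import stdev

def passes_lrr_qc(log_r_ratios, max_sd=0.30):
    """Keep a sample only if the standard deviation of its Log R ratio
    values is below the threshold quoted in the abstract (0.30)."""
    return stdev(log_r_ratios) < max_sd

def classify_cnv(length_bp, threshold_bp=1_000_000):
    """Label a CNV by length: above 1 Mb is large-scale, otherwise small-scale."""
    return "large-scale" if length_bp > threshold_bp else "small-scale"

# Invented Log R ratio values for two hypothetical samples.
quiet_sample = [0.00, 0.10, -0.10, 0.05, -0.05]
noisy_sample = [1.0, -1.0, 1.0, -1.0]

print(passes_lrr_qc(quiet_sample), passes_lrr_qc(noisy_sample))
print(classify_cnv(2_500_000), classify_cnv(50_000))
```

Noisy intensity data inflate the number of spurious small CNV calls, which is presumably why the filter is applied before the small-scale analysis; the abstract reports that 79 of the 90 samples passed.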

  5. GRAbB: Selective Assembly of Genomic Regions, a New Niche for Genomic Research.

    Science.gov (United States)

    Brankovics, Balázs; Zhang, Hao; van Diepeningen, Anne D; van der Lee, Theo A J; Waalwijk, Cees; de Hoog, G Sybren

    2016-06-01

    GRAbB (Genomic Region Assembly by Baiting) is a new program that is dedicated to assemble specific genomic regions from NGS data. This approach is especially useful when dealing with multi copy regions, such as mitochondrial genome and the rDNA repeat region, parts of the genome that are often neglected or poorly assembled, although they contain interesting information from phylogenetic or epidemiologic perspectives, but also single copy regions can be assembled. The program is capable of targeting multiple regions within a single run. Furthermore, GRAbB can be used to extract specific loci from NGS data, based on homology, like sequences that are used for barcoding. To make the assembly specific, a known part of the region, such as the sequence of a PCR amplicon or a homologous sequence from a related species must be specified. By assembling only the region of interest, the assembly process is computationally much less demanding and may lead to assemblies of better quality. In this study the different applications and functionalities of the program are demonstrated such as: exhaustive assembly (rDNA region and mitochondrial genome), extracting homologous regions or genes (IGS, RPB1, RPB2 and TEF1a), as well as extracting multiple regions within a single run. The program is also compared with MITObim, which is meant for the exhaustive assembly of a single target based on a similar query sequence. GRAbB is shown to be more efficient than MITObim in terms of speed, memory and disk usage. The other functionalities (handling multiple targets simultaneously and extracting homologous regions) of the new program are not matched by other programs. The program is available with explanatory documentation at https://github.com/b-brankovics/grabb. GRAbB has been tested on Ubuntu (12.04 and 14.04), Fedora (23), CentOS (7.1.1503) and Mac OS X (10.7). Furthermore, GRAbB is available as a docker repository: brankovics/grabb (https://hub.docker.com/r/brankovics/grabb/).
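
The baiting idea described in this record, starting from a known piece of the target region and repeatedly pulling in reads that overlap what has been collected so far, can be sketched as below. This is a conceptual illustration only and not GRAbB's implementation: the function names, the exact-match k-mer criterion, and the toy reads are assumptions, and reverse-complement matches and the assembly step itself are omitted.

```python
def kmers(seq, k):
    """Set of all overlapping k-mers in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def bait_reads(reads, bait, k=4, max_rounds=10):
    """Iteratively select reads that share at least one k-mer with the bait
    or with reads selected in earlier rounds."""
    known = kmers(bait, k)
    selected = set()
    for _ in range(max_rounds):
        new = {i for i, read in enumerate(reads)
               if i not in selected and kmers(read, k) & known}
        if not new:
            break  # no further reads can be recruited
        selected |= new
        for i in new:
            known |= kmers(reads[i], k)  # extend the bait with new reads
    return [reads[i] for i in sorted(selected)]

# Toy data: the second read only becomes reachable through the first one.
toy_reads = ["CGGATTTCA", "TTTCAGGCC", "GGGGGGGGG"]
print(bait_reads(toy_reads, bait="ACGTACGGA", k=4))
```

Because only recruited reads would be passed on to an assembler, the working set stays small compared with a whole-genome assembly, which matches the abstract's point that targeted assembly is computationally much less demanding.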

  6. GRAbB: Selective Assembly of Genomic Regions, a New Niche for Genomic Research.

    Directory of Open Access Journals (Sweden)

    Balázs Brankovics

    2016-06-01

    GRAbB (Genomic Region Assembly by Baiting) is a new program dedicated to assembling specific genomic regions from NGS data. This approach is especially useful when dealing with multi-copy regions, such as the mitochondrial genome and the rDNA repeat region, parts of the genome that are often neglected or poorly assembled although they contain interesting information from phylogenetic or epidemiologic perspectives; single-copy regions can also be assembled. The program is capable of targeting multiple regions within a single run. Furthermore, GRAbB can be used to extract specific loci from NGS data based on homology, such as sequences that are used for barcoding. To make the assembly specific, a known part of the region, such as the sequence of a PCR amplicon or a homologous sequence from a related species, must be specified. By assembling only the region of interest, the assembly process is computationally much less demanding and may lead to assemblies of better quality. In this study the different applications and functionalities of the program are demonstrated, such as exhaustive assembly (rDNA region and mitochondrial genome), extracting homologous regions or genes (IGS, RPB1, RPB2 and TEF1a), as well as extracting multiple regions within a single run. The program is also compared with MITObim, which is meant for the exhaustive assembly of a single target based on a similar query sequence. GRAbB is shown to be more efficient than MITObim in terms of speed, memory and disk usage. The other functionalities (handling multiple targets simultaneously and extracting homologous regions) of the new program are not matched by other programs. The program is available with explanatory documentation at https://github.com/b-brankovics/grabb. GRAbB has been tested on Ubuntu (12.04 and 14.04), Fedora (23), CentOS (7.1.1503) and Mac OS X (10.7). Furthermore, GRAbB is available as a docker repository: brankovics/grabb (https://hub.docker.com/r/brankovics/grabb/).
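
    The baiting idea in the abstract above (start from a known part of the region and pull in only the reads that belong to it) can be illustrated with a small sketch. This is not GRAbB's actual implementation: the k-mer matching rule, the function names `kmers` and `bait_reads`, and the parameters `k` and `max_rounds` are all assumptions made for illustration.

```python
def kmers(seq, k):
    """Return the set of k-mers found in seq and in its reverse complement."""
    complement = str.maketrans("ACGT", "TGCA")
    revcomp = seq.translate(complement)[::-1]
    return {s[i:i + k] for s in (seq, revcomp) for i in range(len(s) - k + 1)}


def bait_reads(reads, bait, k=8, max_rounds=10):
    """Iteratively select reads that share at least one k-mer with the bait.

    After each round the selected reads are added to the bait k-mer set,
    so the selection can grow outward from the known region.
    (Hypothetical helper; an assumption, not the GRAbB algorithm itself.)
    """
    bait_kmers = kmers(bait, k)
    selected = set()
    for _ in range(max_rounds):
        new = {r for r in reads
               if r not in selected and kmers(r, k) & bait_kmers}
        if not new:          # nothing new was baited: the selection is stable
            break
        selected |= new
        for read in new:     # extend the bait with the newly recruited reads
            bait_kmers |= kmers(read, k)
    return selected


if __name__ == "__main__":
    toy_reads = ["ACGTACGTAGCT", "GTAGCTTTGACC", "GGGGGGGGGGGG"]
    print(sorted(bait_reads(toy_reads, "ACGTACGT", k=6)))
```

    In this toy run the second read shares no k-mer with the original bait and is recruited only after the first read has extended the bait set, which is the reason for iterating; the selected reads would then be handed to an assembler of choice.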

  7. Genetic Transformation and Genomic Resources for Next-Generation Precise Genome Engineering in Vegetable Crops

    Science.gov (United States)

    Cardi, Teodoro; D’Agostino, Nunzio; Tripodi, Pasquale

    2017-01-01

    In the context of modern agriculture facing predicted population growth and general environmental change, securing high-quality food remains a major challenge. Vegetable crops include a large number of species, characterized by multiple geographical origins, large genetic variability and diverse reproductive features. Due to their nutritional value, they have an important place in the human diet. In recent years, many crop genomes have been sequenced, permitting the identification of genes and superior alleles associated with desirable traits. Furthermore, innovative biotechnological approaches allow a step forward towards the development of new improved cultivars harboring precise genome modifications. Sequence-based knowledge coupled with advanced biotechnologies is supporting the widespread application of new plant breeding techniques to enhance the success in modification and transfer of useful alleles into target varieties. Clustered regularly interspaced short palindromic repeats (CRISPR)/CRISPR-associated protein 9 system, zinc-finger nucleases, and transcription activator-like effector nucleases represent the main methods available for plant genome engineering through targeted modifications. Such technologies, however, require efficient transformation protocols as well as extensive genomic resources and accurate knowledge before they can be efficiently exploited in practical breeding programs. In this review, we review the state of the art in relation to the availability of such scientific and technological resources in various groups of vegetables, describe genome editing results obtained so far and discuss the implications for future applications. PMID:28275380

  8. Pairagon: a highly accurate, HMM-based cDNA-to-genome aligner

    DEFF Research Database (Denmark)

    Lu, David V; Brown, Randall H; Arumugam, Manimozhiyan

    2009-01-01

    MOTIVATION: The most accurate way to determine the intron-exon structures in a genome is to align spliced cDNA sequences to the genome. Thus, cDNA-to-genome alignment programs are a key component of most annotation pipelines. The scoring system used to choose the best alignment is a primary......' simulated cDNA sequences by splicing the sequences of exons in the reference genome sequences of fly and human. The complete reference genome sequences were then mutated to various degrees using a realistic mutation simulator and the perfect cDNAs were aligned to them using Pairagon and 12 other aligners...... heuristics. RESULTS: We present Pairagon, a pair hidden Markov model based cDNA-to-genome alignment program, as the most accurate aligner for sequences with high- and low-identity levels. We conducted a series of experiments testing alignment accuracy with varying sequence identity. We first created 'perfect...

  9. Whole-genome prokaryotic phylogeny

    National Research Council Canada - National Science Library

    Henz, Stefan R; Huson, Daniel H; Auch, Alexander F; Nieselt-Struwe, Kay; Schuster, Stephan C

    2005-01-01

    .... We introduce a new strategy, GBDP, 'genome blast distance phylogeny', and show that different variants of this approach robustly produce phylogenies that are biologically sound, when applied to 91 prokaryotic genomes...

  10. Illuminating the Druggable Genome (IDG)

    Data.gov (United States)

    Federal Laboratory Consortium — Results from the Human Genome Project revealed that the human genome contains 20,000 to 25,000 genes. A gene contains (encodes) the information that each cell uses...

  11. On genomics, kin, and privacy.

    Science.gov (United States)

    Telenti, Amalio; Ayday, Erman; Hubaux, Jean Pierre

    2014-01-01

    The storage of greater numbers of exomes or genomes raises the question of loss of privacy for the individual and for families if genomic data are not properly protected. Access to genome data may result from a personal decision to disclose, or from gaps in protection. In either case, revealing genome data has consequences beyond the individual, as it compromises the privacy of family members. Increasing availability of genome data linked or linkable to metadata through online social networks and services adds one additional layer of complexity to the protection of genome privacy. The field of computer science and information technology offers solutions to secure genomic data so that individuals, medical personnel or researchers can access only the subset of genomic information required for healthcare or dedicated studies.

  12. Better chocolate through genomics

    Science.gov (United States)

    Theobroma cacao, the cacao or chocolate tree, is a tropical understory tree whose seeds are used to make chocolate. And like any important crop, cacao is the subject of much research. On September 15, 2010, scientists publicly released a preliminary sequence of the cacao genome--which contains all o...

  13. Genetics, genomics and fertility

    Science.gov (United States)

    In order to enhance the sustainability of dairy businesses, new management tools are needed to increase the fertility of dairy cattle. Genomic selection has been successfully used by AI studs to screen potential sires and significantly decrease the generation interval of bulls. Buoyed by the success...

  14. The Genomic Standards Consortium

    DEFF Research Database (Denmark)

    Field, Dawn; Amaral-Zettler, Linda; Cochrane, Guy;

    2011-01-01

    A vast and rich body of information has grown up as a result of the world's enthusiasm for 'omics technologies. Finding ways to describe and make available this information that maximise its usefulness has become a major effort across the 'omics world. At the heart of this effort is the Genomic S...

  15. The Nostoc punctiforme Genome

    Energy Technology Data Exchange (ETDEWEB)

    John C. Meeks

    2001-12-31

    Nostoc punctiforme is a filamentous cyanobacterium with extensive phenotypic characteristics and a relatively large genome, approaching 10 Mb. The phenotypic characteristics include a photoautotrophic, diazotrophic mode of growth, but N. punctiforme is also facultatively heterotrophic; its vegetative cells have multiple development alternatives, including terminal differentiation into nitrogen-fixing heterocysts and transient differentiation into spore-like akinetes or motile filaments called hormogonia; and N. punctiforme has broad symbiotic competence with fungi and terrestrial plants, including bryophytes, gymnosperms and an angiosperm. The shotgun-sequencing phase of the N. punctiforme strain ATCC 29133 genome has been completed by the Joint Genome Institute. Annotation of an 8.9 Mb database yielded 7432 open reading frames, 45% of which encode proteins with known or probable known function and 29% of which are unique to N. punctiforme. Comparative analysis of the sequence indicates a genome that is highly plastic and in a state of flux, with numerous insertion sequences and multilocus repeats, as well as genes encoding transposases and DNA modification enzymes. The sequence also reveals the presence of genes encoding putative proteins that collectively define almost all characteristics of cyanobacteria as a group. N. punctiforme has an extensive potential to sense and respond to environmental signals as reflected by the presence of more than 400 genes encoding sensor protein kinases, response regulators and other transcriptional factors. The signal transduction systems and any of the large number of unique genes may play essential roles in the cell differentiation and symbiotic interaction properties of N. punctiforme.

  16. Comparative genomics of Eukaryotes

    NARCIS (Netherlands)

    Noort, Vera van

    2007-01-01

    This thesis focuses on developing comparative genomics methods in eukaryotes, with an emphasis on applications for gene function prediction and regulatory element detection. In the past, methods have been developed to predict functional associations between gene pairs in prokaryotes. The challenge

  17. Poster: the macaque genome.

    Science.gov (United States)

    2007-04-13

    The rhesus macaque (Macaca mulatta) facilitates an extraordinary range of biomedical and basic research, and the publication of the genome only makes it a more powerful model for studies of human disease; moreover, the macaque's position relative to humans and chimpanzees affords the opportunity to learn about the processes that have shaped the last 25 million years of primate evolution. To allow users to explore these themes of the macaque genome, Science has created a special interactive version of the poster published in the print edition of the 13 April 2007 issue. The interactive version includes additional text and exploration, as well as embedded video featuring seven scientists discussing the importance of the macaque and its genome sequence in studies of biomedicine and evolution. We have also created an accompanying teaching resource, including a lesson plan aimed at teachers of advanced high school life science students, for exploring what a comparison of the macaque and human genomes can tell us about human biology and evolution. These items are free to all site visitors.

  18. RIKEN mouse genome encyclopedia.

    Science.gov (United States)

    Hayashizaki, Yoshihide

    2003-01-01

    We have been working to establish a comprehensive mouse full-length cDNA collection and sequence database covering as many genes as we can, named the RIKEN mouse genome encyclopedia. We are currently constructing higher-level annotation (Functional ANnoTation Of Mouse cDNA; FANTOM) based not only on homology searches but also on expression profiles, mapping information and protein-protein databases. More than 1,000,000 clones prepared from 163 tissues were end-sequenced and classified into 159,789 clusters, and 60,770 representative clones were fully sequenced. Of the 60,770 sequences, 33,409 were unique. The next generation of life science is clearly based on all of the genome information and resources. Based on our cDNA clones we developed additional systems to explore gene function: a cDNA microarray system to print all of these cDNA clones, a protein-protein interaction screening system, a protein-DNA interaction screening system and so on. The integrated database of all the information is very useful not only for analysis of gene transcriptional networks but also for connecting genes to phenotypes to facilitate the positional candidate approach. In this talk, the prospects for applying these genome resources will be discussed. More information is available at the web page: http://genome.gsc.riken.go.jp/.

  19. The tomato genome

    Science.gov (United States)

    The tomato genome sequence was undertaken at a time when state-of-the-art sequencing methodologies were undergoing a transition to so-called next generation methodologies. The result was an international consortium undertaking a strategy merging both old and new approaches. Because biologists were...

  20. Statistical Methods in Integrative Genomics

    OpenAIRE

    Richardson, Sylvia; Tseng, George C.; Sun, Wei

    2016-01-01

    Statistical methods in integrative genomics aim to answer important biology questions by jointly analyzing multiple types of genomic data (vertical integration) or aggregating the same type of data across multiple studies (horizontal integration). In this article, we introduce different types of genomic data and data resources, and then review statistical methods of integrative genomics, with emphasis on the motivation and rationale of these methods. We conclude with some summary points and f...

  1. Pan-genome sequence analysis using Panseq: an online tool for the rapid analysis of core and accessory genomic regions

    Directory of Open Access Journals (Sweden)

    Villegas Andre

    2010-09-01

    Background: The pan-genome of a bacterial species consists of a core and an accessory gene pool. The accessory genome is thought to be an important source of genetic variability in bacterial populations and is gained through lateral gene transfer, allowing subpopulations of bacteria to better adapt to specific niches. Low-cost and high-throughput sequencing platforms have created an exponential increase in genome sequence data and an opportunity to study the pan-genomes of many bacterial species. In this study, we describe a new online pan-genome sequence analysis program, Panseq. Results: Panseq was used to identify Escherichia coli O157:H7 and E. coli K-12 genomic islands. Within a population of 60 E. coli O157:H7 strains, the existence of 65 accessory genomic regions identified by Panseq analysis was confirmed by PCR. The accessory genome and binary presence/absence data, and core genome and single nucleotide polymorphisms (SNPs) of six L. monocytogenes strains were extracted with Panseq and hierarchically clustered and visualized. The nucleotide core and binary accessory data were also used to construct maximum parsimony (MP) trees, which were compared to the MP tree generated by multi-locus sequence typing (MLST). The topology of the accessory and core trees was identical but differed from the tree produced using seven MLST loci. The Loci Selector module found the most variable and discriminatory combinations of four loci within a 100 loci set among 10 strains in 1 s, compared to the 449 s required to exhaustively search for all possible combinations; it also found the most discriminatory 20 loci from a 96 loci E. coli O157:H7 SNP dataset. Conclusion: Panseq determines the core and accessory regions among a collection of genomic sequences based on user-defined parameters. It readily extracts regions unique to a genome or group of genomes, identifies SNPs within shared core genomic regions, constructs files for use in phylogeny programs
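
    The core/accessory split and the binary presence/absence data mentioned in the abstract above can be sketched as follows. This is a simplified illustration and not Panseq's code: the input layout (region name mapped to the set of genomes that contain it), the function names, and the `core_fraction` parameter standing in for a "user-defined parameter" are assumptions.

```python
def split_pan_genome(presence, n_genomes, core_fraction=1.0):
    """Split regions into core and accessory lists.

    presence: dict mapping a region name to the set of genomes containing it.
    A region is called core when it occurs in at least core_fraction of the
    n_genomes genomes (hypothetical threshold parameter).
    """
    core, accessory = [], []
    for region, genomes in sorted(presence.items()):
        if len(genomes) / n_genomes >= core_fraction:
            core.append(region)
        else:
            accessory.append(region)
    return core, accessory


def binary_matrix(presence, regions, genomes):
    """Return one 0/1 row per region, with one column per genome."""
    return [[int(g in presence[r]) for g in genomes] for r in regions]


if __name__ == "__main__":
    genomes = ["strain1", "strain2", "strain3"]
    presence = {
        "regionA": {"strain1", "strain2", "strain3"},
        "regionB": {"strain1"},
        "regionC": {"strain2", "strain3"},
    }
    core, accessory = split_pan_genome(presence, len(genomes))
    print(core, accessory)
    print(binary_matrix(presence, accessory, genomes))
```

    The binary rows for the accessory regions are the kind of table that could be passed to a clustering or parsimony program.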

  2. Methods for identifying and mapping recent segmental and gene duplications in eukaryotic genomes.

    Science.gov (United States)

    Khaja, Razi; MacDonald, Jeffrey R; Zhang, Junjun; Scherer, Stephen W

    2006-01-01

    The aim of this chapter is to provide instruction for analyzing and mapping recent segmental and gene duplications in eukaryotic genomes. We describe a bioinformatics-based approach utilizing computational tools to manage eukaryotic genome sequences to characterize and understand the evolutionary fates and trajectories of duplicated genes. An introduction to bioinformatics tools and programs such as BLAST, Perl, BioPerl, and the GFF specification provides the necessary background to complete this analysis for any eukaryotic genome of interest.
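
    As background to the GFF specification named in the abstract above, here is a minimal sketch of reading one GFF record. GFF records are tab-separated with nine columns (sequence id, source, feature type, start, end, score, strand, phase, attributes). The chapter itself works with Perl and BioPerl; Python is used here only for illustration, and the names `GffFeature`, `parse_gff_line` and `overlaps` are hypothetical.

```python
from typing import NamedTuple, Optional


class GffFeature(NamedTuple):
    seqid: str
    source: str
    type: str
    start: int       # 1-based, inclusive
    end: int         # 1-based, inclusive
    strand: str
    attributes: str


def parse_gff_line(line: str) -> Optional[GffFeature]:
    """Parse one nine-column GFF record; return None for comments and blanks."""
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    cols = line.split("\t")
    if len(cols) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(cols)}")
    seqid, source, ftype, start, end, _score, strand, _phase, attrs = cols
    return GffFeature(seqid, source, ftype, int(start), int(end), strand, attrs)


def overlaps(a: GffFeature, b: GffFeature) -> bool:
    """True when two features lie on the same sequence and share a base."""
    return a.seqid == b.seqid and a.start <= b.end and b.start <= a.end


if __name__ == "__main__":
    record = "chr1\tblast\tgene\t100\t200\t.\t+\t.\tID=geneA"
    print(parse_gff_line(record))
```

    A helper such as `overlaps` is one way annotated genes could be intersected with candidate duplicated intervals once both are expressed as features.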

  3. Genomic prediction for Nordic Red Cattle using one-step and selection index blending

    DEFF Research Database (Denmark)

    Guosheng, Su; Madsen, Per; Nielsen, Ulrik Sander

    2012-01-01

    This study investigated the accuracy of direct genomic breeding values (DGV) using a genomic BLUP model, genomic enhanced breeding values (GEBV) using a one-step blending approach, and GEBV using a selection index blending approach for 15 traits of Nordic Red Cattle. The data comprised 6,631 bull......-step blending approach is a good alternative to predict GEBV in practical genetic evaluation program....

  4. Complete genome sequence of "Thioalkalivibrio sulfidophilus" HL-EbGr7.

    Science.gov (United States)

    Muyzer, Gerard; Sorokin, Dimitry Yu; Mavromatis, Konstantinos; Lapidus, Alla; Clum, Alicia; Ivanova, Natalia; Pati, Amrita; d'Haeseleer, Patrick; Woyke, Tanja; Kyrpides, Nikos C

    2011-02-14

    "Thioalkalivibrio sulfidophilus" HL-EbGr7 is an obligately chemolithoautotrophic, haloalkaliphilic sulfur-oxidizing bacterium (SOB) belonging to the Gammaproteobacteria. The strain was found to predominate a full-scale bioreactor, removing sulfide from biogas. Here we report the complete genome sequence of strain HL-EbGr7 and its annotation. The genome was sequenced within the Joint Genome Institute Community Sequencing Program, because of its relevance to the sustainable removal of sulfide from bio- and industrial waste gases.

  5. Genome stability in Caenorhabditis elegans

    NARCIS (Netherlands)

    Haaften, G.W. van

    2006-01-01

    Genome stability is closely linked to cancer. Most, if not all, tumor cells show some form of genome instability; mutations can range from single point mutations to gross chromosomal rearrangements and aneuploidy. Genome instability is believed to be the driving force behind tumorigenesis. In order t

  7. The UCSC genome browser database

    DEFF Research Database (Denmark)

    Kuhn, R M; Karolchik, D; Zweig, A S

    2007-01-01

    The University of California, Santa Cruz Genome Browser Database contains, as of September 2006, sequence and annotation data for the genomes of 13 vertebrate and 19 invertebrate species. The Genome Browser displays a wide variety of annotations at all scales from the single nucleotide level up t...

  8. The UCSC Genome Browser Database

    DEFF Research Database (Denmark)

    Hinrichs, A S; Karolchik, D; Baertsch, R

    2006-01-01

    The University of California Santa Cruz Genome Browser Database (GBD) contains sequence and annotation data for the genomes of about a dozen vertebrate species and several major model organisms. Genome annotations typically include assembly data, sequence composition, genes and gene predictions, ...

  9. Targeted Large-Scale Deletion of Bacterial Genomes Using CRISPR-Nickases.

    Science.gov (United States)

    Standage-Beier, Kylie; Zhang, Qi; Wang, Xiao

    2015-11-20

    Programmable CRISPR-Cas systems have augmented our ability to produce precise genome manipulations. Here we demonstrate and characterize the ability of CRISPR-Cas derived nickases to direct targeted recombination of both small and large genomic regions flanked by repetitive elements in Escherichia coli. While CRISPR directed double-stranded DNA breaks are highly lethal in many bacteria, we show that CRISPR-guided nickase systems can be programmed to make precise, nonlethal, single-stranded incisions in targeted genomic regions. This induces recombination events and leads to targeted deletion. We demonstrate that dual-targeted nicking enables deletion of 36 and 97 Kb of the genome. Furthermore, multiplex targeting enables deletion of 133 Kb, accounting for approximately 3% of the entire E. coli genome. This technology provides a framework for methods to manipulate bacterial genomes using CRISPR-nickase systems. We envision this system working synergistically with preexisting bacterial genome engineering methods.

  10. Implications of the Human Genome Project

    Energy Technology Data Exchange (ETDEWEB)

    Kitcher, P.

    1998-11-01

    The Human Genome Project (HGP), launched in 1991, aims to map and sequence the human genome by 2006. During the fifteen-year life of the project, it is projected that $3 billion in federal funds will be allocated to it. The ultimate aims of spending this money are to analyze the structure of human DNA, to identify all human genes, to recognize the functions of those genes, and to prepare for the biology and medicine of the twenty-first century. The following summary examines some of the implications of the program, concentrating on its scientific import and on the ethical and social problems that it raises. Its aim is to expose principles that might be used in applying the information which the HGP will generate. There is no attempt here to translate the principles into detailed proposals for legislation. Arguments and discussion can be found in the full report, but, like this summary, that report does not contain any legislative proposals.

  11. Programming languages for synthetic biology.

    Science.gov (United States)

    Umesh, P; Naveen, F; Rao, Chanchala Uma Maheswara; Nair, Achuthsankar S

    2010-12-01

    Against the backdrop of accelerated efforts to create synthetic organisms, the nature and scope of an ideal programming language for scripting synthetic organisms in silico has been receiving increasing attention. A few programming languages for synthetic biology capable of defining, constructing, networking, editing and delivering genome-scale models of cellular processes have recently been attempted. All these represent important points in a spectrum of possibilities. This paper introduces Kera, a state-of-the-art programming language for synthetic biology which is arguably ahead of similar languages or tools such as GEC, Antimony and GenoCAD. Kera is a full-fledged object-oriented programming language which is tempered by a biopart rule library named Samhita, which captures knowledge regarding the interaction of genome components and catalytic molecules. Prominent features of the language are demonstrated through a toy example, and the road map for the future development of Kera is also presented.

  12. Structural genomics of eukaryotic targets at a laboratory scale.

    Science.gov (United States)

    Busso, Didier; Poussin-Courmontagne, Pierre; Rosé, David; Ripp, Raymond; Litt, Alain; Thierry, Jean-Claude; Moras, Dino

    2005-01-01

    Structural genomics programs are distributed worldwide and funded by large institutions such as the NIH in the United States, RIKEN in Japan or the European Commission through the SPINE network in Europe. Such initiatives, essentially managed by large consortia, led to technology and method developments at the different steps required to produce biological samples compatible with structural studies. Besides specific applications, method developments centered mainly on miniaturization and parallelization. The challenge that academic laboratories face in pursuing structural genomics programs is to produce protein samples at a higher rate. The Structural Biology and Genomics Department (IGBMC - Illkirch - France) is involved in a structural genomics program of higher eukaryotes whose goal is solving crystal structures of proteins and their complexes (including large complexes) related to human health and biotechnology. To achieve such a challenging goal, the Department has established a medium-throughput pipeline for producing protein samples suitable for structural biology studies. Here, we describe the setting up of our initiative from cloning to crystallization and we demonstrate that structural genomics may be manageable by academic laboratories through strategic investments in robotics and by adapting classical bench protocols and new developments, in particular in the field of protein expression, to parallelization.

  13. Using Genomics for Natural Product Structure Elucidation.

    Science.gov (United States)

    Tietz, Jonathan I; Mitchell, Douglas A

    2016-01-01

    Natural products (NPs) are the most historically bountiful source of chemical matter for drug development, especially for anti-infectives. With insights gleaned from genome mining, interest in natural product discovery has been reinvigorated. An essential stage in NP discovery is structural elucidation, which sheds light not only on the chemical composition of a molecule but also its novelty, properties, and derivatization potential. The history of structure elucidation is replete with technique-based revolutions: combustion analysis, crystallography, UV, IR, MS, and NMR have each provided game-changing advances; the latest such advance is genomics. All natural products have a genetic basis, and the ability to obtain and interpret genomic information for structure elucidation is increasingly available at low cost to non-specialists. In this review, we describe the value of genomics as a structural elucidation technique, especially from the perspective of the natural product chemist approaching an unknown metabolite. Herein we first introduce the databases and programs of interest to the natural products chemist, with an emphasis on those currently most suited for general usability. We describe strategies for linking observed natural product-linked phenotypes to their corresponding gene clusters. We then discuss techniques for extracting structural information from genes, illustrated with numerous case examples. We also provide an analysis of the biases and limitations of the field with recommendations for future development. Our overview is not only aimed at biologically-oriented researchers already at ease with bioinformatic techniques, but also, in particular, at natural product, organic, and/or medicinal chemists not previously familiar with genomic techniques.

  14. wFleaBase: the Daphnia genome database

    Directory of Open Access Journals (Sweden)

    Singan Vasanth R

    2005-03-01

    Background: wFleaBase is a database with the necessary infrastructure to curate, archive and share genetic, molecular and functional genomic data and protocols for an emerging model organism, the microcrustacean Daphnia. Commonly known as the water-flea, Daphnia's ecological merit is unequaled among metazoans, largely because of its sentinel role within freshwater ecosystems and over 200 years of biological investigations. As a consequence, the Daphnia Genomics Consortium (DGC) has launched an interdisciplinary research program to create the resources needed to study genes that affect ecological and evolutionary success in natural environments. Discussion: These tools include the genome database wFleaBase, which currently contains functions to search and extract information from expressed sequence tags, genome survey sequences and full genome sequencing projects. This new database is built primarily from core components of the Generic Model Organism Database project, and related bioinformatics tools. Summary: Over the coming year, preliminary genetic maps and the nearly complete genomic sequence of Daphnia pulex will be integrated into wFleaBase, including gene predictions and ortholog assignments based on sequence similarities with eukaryote genes of known function. wFleaBase aims to serve a large ecological and evolutionary research community. Our challenge is to rapidly expand its content and to ultimately integrate genetic and functional genomic information with population-level responses to environmental challenges. URL: http://wfleabase.org/.

  15. Exploiting linkage disequilibrium in statistical modelling in quantitative genomics

    DEFF Research Database (Denmark)

    Wang, Lei

    Alleles at two loci are said to be in linkage disequilibrium (LD) when they are correlated or statistically dependent. Genomic prediction and gene mapping rely on the existence of LD between genetic markers and causal variants of complex traits. In the first part of the thesis, a novel method...... to quantify and visualize local variation in LD along chromosomes is described, and applied to characterize LD patterns at the local and genome-wide scale in three Danish pig breeds. In the second part, different ways of taking LD into account in genomic prediction models are studied. One approach is to use...... the recently proposed antedependence models, which treat neighbouring marker effects as correlated; another approach involves use of haplotype block information derived using the program Beagle. The overall conclusion is that taking LD information into account in genomic prediction models potentially improves...
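
    The definition of LD given in the abstract above (alleles at two loci being correlated) is commonly quantified with the coefficient D = p_AB - p_A * p_B and the squared correlation r^2 = D^2 / (p_A (1 - p_A) p_B (1 - p_B)). The sketch below computes both from phased haplotypes. These are the standard textbook measures, not the novel method of the thesis, and the function name `ld_stats` and the 0/1 allele coding are assumptions.

```python
def ld_stats(haplotypes):
    """Return (D, r_squared) for two biallelic loci.

    haplotypes: list of (a, b) pairs, where a and b are the alleles
    (coded 0 or 1) carried by one haplotype at locus A and locus B.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n           # frequency of allele 1 at A
    p_b = sum(b for _, b in haplotypes) / n           # frequency of allele 1 at B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                              # deviation from independence
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r_squared = d * d / denom if denom > 0 else float("nan")
    return d, r_squared


if __name__ == "__main__":
    # Alleles always travel together: complete LD.
    print(ld_stats([(1, 1), (1, 1), (0, 0), (0, 0)]))
    # Every allele combination equally frequent: no LD.
    print(ld_stats([(1, 1), (1, 0), (0, 1), (0, 0)]))
```

    When either locus is monomorphic the denominator is zero and r^2 is undefined, which the sketch reports as NaN.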

  16. A report from the Sixth International Mouse Genome Conference

    Energy Technology Data Exchange (ETDEWEB)

    Brown, S. [Saint Mary`s Hospital Medical School, London (United Kingdom). Dept. of Biochemistry and Molecular Genetics

    1992-12-31

    The Sixth Annual Mouse Genome Conference was held in October, 1992 at Buffalo, USA. The mouse is one of the primary model organisms in the Human Genome Project. Through the use of gene targeting studies the mouse has become a powerful biological model for the study of gene function and, in addition, the comparison of the many homologous mutations identified in human and mouse have widened our understanding of the biology of these two organisms. A primary goal in the mouse genome program has been to create a genetic map of STSs of high resolution (<1cM) that would form the basis for the physical mapping of the whole mouse genome. Buffalo saw substantial new progress towards the goal of a very high density genetic map and the beginnings of substantive efforts towards physical mapping in chromosome regions with a high density of genetic markers.

  17. Herpesvirus Genome Integration into Telomeric Repeats of Host Cell Chromosomes.

    Science.gov (United States)

    Osterrieder, Nikolaus; Wallaschek, Nina; Kaufer, Benedikt B

    2014-11-01

    It is well known that numerous viruses integrate their genetic material into host cell chromosomes. Human herpesvirus 6 (HHV-6) and oncogenic Marek's disease virus (MDV) have been shown to integrate their genomes into host telomeres of latently infected cells. This is unusual for herpesviruses as most maintain their genomes as circular episomes during the quiescent stage of infection. The genomic DNA of HHV-6, MDV, and several other herpesviruses harbors telomeric repeats (TMRs) that are identical to host telomere sequences (TTAGGG). At least in the case of MDV, viral TMRs facilitate integration into host telomeres. Integration of HHV-6 occurs not only in lymphocytes but also in the germline of some individuals, allowing vertical virus transmission. Although the molecular mechanism of telomere integration is poorly understood, the presence of TMRs in a number of herpesviruses suggests it is their default program for genome maintenance during latency and also allows efficient reactivation.

  18. Assigning protein functions by comparative genome analysis protein phylogenetic profiles

    Science.gov (United States)

    Pellegrini, Matteo; Marcotte, Edward M.; Thompson, Michael J.; Eisenberg, David; Grothe, Robert; Yeates, Todd O.

    2003-05-13

    A computational method, system, and computer program are provided for inferring functional links from genome sequences. One method is based on the observation that some pairs of proteins A' and B' have homologs in another organism fused into a single protein chain AB. A trans-genome comparison of sequences can reveal these AB sequences, which are Rosetta Stone sequences because they decipher an interaction between A' and B'. Another method compares the genomic sequence of two or more organisms to create a phylogenetic profile for each protein indicating its presence or absence across all the genomes. The profile provides information regarding functional links between different families of proteins. In yet another method a combination of the above two methods is used to predict functional links.
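
    The phylogenetic profile method described in the abstract above can be sketched in a few lines. The sketch assumes that homolog detection has already been done, so each genome is represented simply as a set of protein family names; the function names and the rule of linking families with identical profiles are illustrative assumptions, not the patented procedure.

```python
from collections import defaultdict


def phylogenetic_profiles(families_by_genome, genomes):
    """Map each protein family to a tuple of 0/1 flags, one per genome."""
    families = set().union(*families_by_genome.values())
    return {
        fam: tuple(int(fam in families_by_genome[g]) for g in genomes)
        for fam in sorted(families)
    }


def linked_by_profile(profiles):
    """Group families that share exactly the same presence/absence profile."""
    groups = defaultdict(list)
    for fam, profile in profiles.items():
        groups[profile].append(fam)
    return [fams for fams in groups.values() if len(fams) > 1]


if __name__ == "__main__":
    genomes = ["g1", "g2", "g3", "g4"]
    families_by_genome = {
        "g1": {"P1", "P2", "P3"},
        "g2": {"P1", "P2"},
        "g3": {"P3"},
        "g4": {"P1", "P2", "P3"},
    }
    profiles = phylogenetic_profiles(families_by_genome, genomes)
    print(profiles)
    print(linked_by_profile(profiles))
```

    Families that are always present or absent together across the genomes end up in the same group and become candidates for a functional link.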

  19. Determining protein function and interaction from genome analysis

    Science.gov (United States)

    Eisenberg, David; Marcotte, Edward M.; Thompson, Michael J.; Pellegrini, Matteo; Yeates, Todd O.

    2004-08-03

    A computational method, system, and computer program are provided for inferring functional links from genome sequences. One method is based on the observation that some pairs of proteins A' and B' have homologs in another organism fused into a single protein chain AB. A trans-genome comparison of sequences can reveal these AB sequences, which are Rosetta Stone sequences because they decipher an interaction between A' and B'. Another method compares the genomic sequence of two or more organisms to create a phylogenetic profile for each protein indicating its presence or absence across all the genomes. The profile provides information regarding functional links between different families of proteins. In yet another method a combination of the above two methods is used to predict functional links.
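
    The Rosetta Stone idea in the abstract above (two separate proteins in one organism whose homologs are fused into a single chain elsewhere) can be illustrated as follows. The sketch assumes homology has already been resolved into family labels; the function name `rosetta_stone_links`, the input layout and the family names are hypothetical.

```python
def rosetta_stone_links(query_families, fused_chains):
    """Infer links between separate proteins from fused chains elsewhere.

    query_families: set of family names that occur as separate proteins
        in the query organism.
    fused_chains: dict mapping a chain name in another organism to the
        tuple of families that make up that single chain.
    Returns sorted (family_a, family_b, chain) triples.
    """
    links = set()
    for chain, families in fused_chains.items():
        present = sorted(set(families) & query_families)
        for i, fam_a in enumerate(present):
            for fam_b in present[i + 1:]:
                links.add((fam_a, fam_b, chain))
    return sorted(links)


if __name__ == "__main__":
    query = {"famA", "famB", "famC"}
    chains = {
        "chainAB": ("famB", "famA"),   # fused homologs of A' and B'
        "chainC": ("famC",),           # single-family chain: no link
    }
    print(rosetta_stone_links(query, chains))
```

    Each triple names the fused chain that acts as the Rosetta Stone for the pair, so a predicted link can be traced back to its evidence.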

  20. Nongenetic functions of the genome.

    Science.gov (United States)

    Bustin, Michael; Misteli, Tom

    2016-05-01

    The primary function of the genome is to store, propagate, and express the genetic information that gives rise to a cell's architectural and functional machinery. However, the genome is also a major structural component of the cell. Besides its genetic roles, the genome affects cellular functions by nongenetic means through its physical and structural properties, particularly by exerting mechanical forces and by serving as a scaffold for binding of cellular components. Major cellular processes affected by nongenetic functions of the genome include establishment of nuclear structure, signal transduction, mechanoresponses, cell migration, and vision in nocturnal animals. We discuss the concept, mechanisms, and implications of nongenetic functions of the genome.

  1. Microbial Genomics Research in China

    Institute of Scientific and Technical Information of China (English)

    ZHAO Guo-ping

    2004-01-01

    Microorganisms, including phage/virus, were initial targets and tools for developing DNA sequencing technology. Microbial genomic study was started as a model system for the Human Genome Project (HGP) and it successfully supported the HGP, particularly with respect to BAC contig construction and large-scale shotgun sequencing and assembly. Microbial genomics has become the fastest-developing genomics discipline alongside the HGP, taking advantage of the organisms' highly diversified physiology, extremely long history of evolution, close relationship with humans and the environment, as well as relatively small genome sizes and simple systems for functional analysis.

  2. Microbial Genomics Research in China

    Institute of Scientific and Technical Information of China (English)

    ZHAO Guo-ping

    2004-01-01

    Microorganisms, including phage/virus, were initial targets and tools for developing DNA sequencing technology. Microbial genomic study was started as a model system for the Human Genome Project (HGP) and it successfully supported the HGP, particularly with respect to BAC contig construction and large-scale shotgun sequencing and assembly. Microbial genomics has become the fastest-developing genomics discipline alongside the HGP, taking advantage of the organisms' highly diversified physiology, extremely long history of evolution, close relationship with humans and the environment, as well as relatively small genome sizes and simple systems for functional analysis.

  3. Genomic Databases for Crop Improvement

    Directory of Open Access Journals (Sweden)

    David Edwards

    2012-03-01

    Genomics is playing an increasing role in plant breeding and this is accelerating with the rapid advances in genome technology. Translating the vast abundance of data being produced by genome technologies requires the development of custom bioinformatics tools and advanced databases. These range from large generic databases which hold specific data types for a broad range of species, to carefully integrated and curated databases which act as a resource for the improvement of specific crops. In this review, we outline some of the features of plant genome databases, identify specific resources for the improvement of individual crops and comment on the potential future direction of crop genome databases.

  4. Genomics and the immune system.

    Science.gov (United States)

    Pipkin, Matthew E; Monticelli, Silvia

    2008-05-01

    While the hereditary information encoded in the Watson-Crick base pairing of genomes is largely static within a given individual, access to this information is controlled by dynamic mechanisms. The human genome is pervasively transcribed, but the roles played by the majority of the non-protein-coding genome sequences are still largely unknown. In this review we focus on insights to gene transcriptional regulation by placing special emphasis on genome-wide approaches, and on how non-coding RNAs, which derive from global transcription of the genome, in turn control gene expression. We review recent progress in the field with highlights on the immune system.

  5. The perennial ryegrass GenomeZipper: targeted use of genome resources for comparative grass genomics.

    Science.gov (United States)

    Pfeifer, Matthias; Martis, Mihaela; Asp, Torben; Mayer, Klaus F X; Lübberstedt, Thomas; Byrne, Stephen; Frei, Ursula; Studer, Bruno

    2013-02-01

    Whole-genome sequences established for model and major crop species constitute a key resource for advanced genomic research. For outbreeding forage and turf grass species like ryegrasses (Lolium spp.), such resources have yet to be developed. Here, we present a model of the perennial ryegrass (Lolium perenne) genome on the basis of conserved synteny to barley (Hordeum vulgare) and the model grass genome Brachypodium (Brachypodium distachyon) as well as rice (Oryza sativa) and sorghum (Sorghum bicolor). A transcriptome-based genetic linkage map of perennial ryegrass served as a scaffold to establish the chromosomal arrangement of syntenic genes from model grass species. This scaffold revealed a high degree of synteny and macrocollinearity and was then utilized to anchor a collection of perennial ryegrass genes in silico to their predicted genome positions. This resulted in the unambiguous assignment of 3,315 out of 8,876 previously unmapped genes to the respective chromosomes. In total, the GenomeZipper incorporates 4,035 conserved grass gene loci, which were used for the first genome-wide sequence divergence analysis between perennial ryegrass, barley, Brachypodium, rice, and sorghum. The perennial ryegrass GenomeZipper is an ordered, information-rich genome scaffold, facilitating map-based cloning and genome assembly in perennial ryegrass and closely related Poaceae species. It also represents a milestone in describing synteny between perennial ryegrass and fully sequenced model grass genomes, thereby increasing our understanding of genome organization and evolution in the most important temperate forage and turf grass species.

  6. Implementing genomics and pharmacogenomics in the clinic: The National Human Genome Research Institute's genomic medicine portfolio.

    Science.gov (United States)

    Manolio, Teri A

    2016-10-01

    Increasing knowledge about the influence of genetic variation on human health and growing availability of reliable, cost-effective genetic testing have spurred the implementation of genomic medicine in the clinic. As defined by the National Human Genome Research Institute (NHGRI), genomic medicine uses an individual's genetic information in his or her clinical care, and has begun to be applied effectively in areas such as cancer genomics, pharmacogenomics, and rare and undiagnosed diseases. In 2011 NHGRI published its strategic vision for the future of genomic research, including an ambitious research agenda to facilitate and promote the implementation of genomic medicine. To realize this agenda, NHGRI is consulting and facilitating collaborations with the external research community through a series of "Genomic Medicine Meetings," under the guidance and leadership of the National Advisory Council on Human Genome Research. These meetings have identified and begun to address significant obstacles to implementation, such as lack of evidence of efficacy, limited availability of genomics expertise and testing, lack of standards, and difficulties in integrating genomic results into electronic medical records. The six research and dissemination initiatives comprising NHGRI's genomic research portfolio are designed to speed the evaluation and incorporation, where appropriate, of genomic technologies and findings into routine clinical care. Actual adoption of successful approaches in clinical care will depend upon the willingness, interest, and energy of professional societies, practitioners, patients, and payers to promote their responsible use and share their experiences in doing so.

  7. Genome Engineering of Drosophila with the CRISPR RNA-Guided Cas9 Nuclease

    OpenAIRE

    Gratz, Scott J.; Cummings, Alexander M.; Nguyen, Jennifer N.; Hamm, Danielle C.; Donohue, Laura K.; Harrison, Melissa M.; Wildonger, Jill; O’Connor-Giles, Kate M.

    2013-01-01

    We have adapted a bacterial CRISPR RNA/Cas9 system to precisely engineer the Drosophila genome and report that Cas9-mediated genomic modifications are efficiently transmitted through the germline. This RNA-guided Cas9 system can be rapidly programmed to generate targeted alleles for probing gene function in Drosophila.

  8. 76 FR 63932 - National Human Genome Research Institute; Notice of Closed Meeting

    Science.gov (United States)

    2011-10-14

    ... HUMAN SERVICES National Institutes of Health National Human Genome Research Institute; Notice of Closed... of Committee: National Human Genome Research Institute Special Emphasis Panel, ENCODE Technology RFA...- 4280, mckenneyk@mail.nih.gov . (Catalogue of Federal Domestic Assistance Program Nos. 93.172,...

  9. 75 FR 53703 - National Human Genome Research Institute; Notice of Closed Meeting

    Science.gov (United States)

    2010-09-01

    ... National Human Genome Research Institute; Notice of Closed Meeting Pursuant to section 10(d) of the Federal... Review Officer, Scientific Review Branch, National Human Genome Research Institute, National Institutes... review and funding cycle. (Catalogue of Federal Domestic Assistance Program Nos. 93.172, Human...

  10. 75 FR 26762 - National Human Genome Research Institute; Notice of Closed Meeting

    Science.gov (United States)

    2010-05-12

    ... clearly unwarranted invasion of personal privacy. Name of Committee: National Human Genome Research....nih.gov . (Catalogue of Federal Domestic Assistance Program Nos. 93.172, Human Genome Research... No: 2010-11051] DEPARTMENT OF HEALTH AND HUMAN SERVICES National Institutes of Health National...

  11. Complete genome of Nitrosospira briensis C-128, an ammonia-oxidizing bacterium from agricultural soil

    NARCIS (Netherlands)

    Rice, Marlen C.; Norton, Jeanette M.; Valois, Frederica; Bollmann, Annette; Bottomley, Peter J.; Klotz, Martin G.; Laanbroek, Hendrikus J.; Suwa, Yuichi; Stein, Lisa Y.; Sayavedra-Soto, Luis; Woyke, Tanja; Shapiro, Nicole; Goodwin, Lynne A.; Huntemann, Marcel; Clum, Alicia; Pillay, Manoj; Kyrpides, Nikos; Varghese, Neha; Mikhailova, Natalia; Markowitz, Victor; Palaniappan, Krishna; Ivanova, Natalia; Stamatis, Dimitrios; Reddy, T. B. K.; Ngan, Chew Yee; Daum, Chris

    2016-01-01

    Nitrosospira briensis C-128 is an ammonia-oxidizing bacterium isolated from an acid agricultural soil. N. briensis C-128 was sequenced with PacBio RS technologies at the DOE-Joint Genome Institute through their Community Science Program (2010). The high-quality finished genome contains one chromosom

  12. Keeping the genome in shape : a role for protein and RNA

    NARCIS (Netherlands)

    Splinter, E.C.

    2011-01-01

    Over 200 cell types exist within the human body, each being different in morphology and function, yet all containing the same genome. Regulation programs acting on the approximately 25,000 genes found in a typical mammalian genome drive the specification of cells during development. Also during late

  13. Genomics in Public Health: Perspective from the Office of Public Health Genomics at the Centers for Disease Control and Prevention (CDC)

    Directory of Open Access Journals (Sweden)

    Ridgely Fisk Green

    2015-09-01

    The national effort to use genomic knowledge to save lives is gaining momentum, as illustrated by the inclusion of genomics in key public health initiatives, including Healthy People 2020, and the recent launch of the precision medicine initiative. The Office of Public Health Genomics (OPHG) at the Centers for Disease Control and Prevention (CDC) partners with state public health departments and others to advance the translation of genome-based discoveries into disease prevention and population health. To do this, OPHG has adopted an “identify, inform, and integrate” model: identify evidence-based genomic applications ready for implementation, inform stakeholders about these applications, and integrate these applications into public health at the local, state, and national level. This paper addresses current and future work at OPHG for integrating genomics into public health programs.

  14. Evolution of small prokaryotic genomes

    Directory of Open Access Journals (Sweden)

    David José Martínez-Cano

    2015-01-01

    As revealed by genome sequencing, the biology of prokaryotes with reduced genomes is strikingly diverse. These include free-living prokaryotes with ~800 genes as well as endosymbiotic bacteria with as few as ~140 genes. Comparative genomics is revealing the evolutionary mechanisms that led to these small genomes. In the case of free-living prokaryotes, natural selection directly favored genome reduction, while in the case of endosymbiotic prokaryotes neutral processes played a more prominent role. However, new experimental data suggest that selective processes may be in operation as well for endosymbiotic prokaryotes, at least during the first stages of genome reduction. Endosymbiotic prokaryotes have evolved diverse strategies for living with reduced gene sets inside a host-defined medium. These include utilization of host-encoded functions (some of them coded by genes acquired by gene transfer from the endosymbiont and/or other bacteria); metabolic complementation between co-symbionts; and forming consortiums with other bacteria within the host. Recent genome sequencing projects of intracellular mutualistic bacteria showed that previously believed universal evolutionary trends like reduced G+C content and conservation of genome synteny are not always present in highly reduced genomes. Finally, the simplified molecular machinery of some of these organisms with small genomes may be used to aid in the design of artificial minimal cells. Here we review recent genomic discoveries of the biology of prokaryotes endowed with small gene sets and discuss the evolutionary mechanisms that have been proposed to explain their peculiar nature.

  15. Informational laws of genome structures

    Science.gov (United States)

    Bonnici, Vincenzo; Manca, Vincenzo

    2016-06-01

    In recent years, the analysis of genomes by means of strings of length k occurring in the genomes, called k-mers, has provided important insights into the basic mechanisms and design principles of genome structures. In the present study, we focus on the proper choice of the value of k for applying information theoretic concepts that express intrinsic aspects of genomes. The value k = log2(n), where n is the genome length, is determined to be the best choice in the definition of some genomic informational indexes that are studied and computed for seventy genomes. These indexes, which are based on information entropies and on suitable comparisons with random genomes, suggest five informational laws, which all of the considered genomes obey. Moreover, an informational genome complexity measure is proposed, which is a generalized logistic map that balances entropic and anti-entropic components of genomes and is related to their evolutionary dynamics. Finally, applications to computational synthetic biology are briefly outlined.
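
    As a rough illustration of the quantities named in this abstract, the sketch below picks k from the genome length and computes the empirical Shannon entropy of the k-mer distribution. It is a sketch under stated assumptions, not the authors' definitions: rounding the base-2 logarithm to the nearest integer and the function names `choose_k` and `kmer_entropy` are assumptions, and the comparison with random genomes mentioned in the abstract is not implemented.

```python
import math
from collections import Counter


def choose_k(genome_length):
    """Pick k as the base-2 logarithm of the genome length.

    Rounding to the nearest integer is an assumption made for this sketch.
    """
    return max(1, round(math.log2(genome_length)))


def kmer_entropy(sequence, k):
    """Empirical Shannon entropy (in bits) of the overlapping k-mers of a sequence."""
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Toy sequence of length 16, so k = log2(16) = 4.
seq = "ACGT" * 4
k = choose_k(len(seq))
h = kmer_entropy(seq, k)
print(k, h)
```

    The entropy is bounded above by the base-2 logarithm of the number of distinct k-mers observed, which gives a quick sanity check on the output.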

  16. Toward genome-enabled mycology.

    Science.gov (United States)

    Hibbett, David S; Stajich, Jason E; Spatafora, Joseph W

    2013-01-01

    Genome-enabled mycology is a rapidly expanding field that is characterized by the pervasive use of genome-scale data and associated computational tools in all aspects of fungal biology. Genome-enabled mycology is integrative and often requires teams of researchers with diverse skills in organismal mycology, bioinformatics and molecular biology. This issue of Mycologia presents the first complete fungal genomes in the history of the journal, reflecting the ongoing transformation of mycology into a genome-enabled science. Here, we consider the prospects for genome-enabled mycology and the technical and social challenges that will need to be overcome to grow the database of complete fungal genomes and enable all fungal biologists to make use of the new data.

  17. Comparative genomics of Lactobacillus and other LAB

    DEFF Research Database (Denmark)

    Wassenaar, Trudy M.; Lukjancenko, Oksana

    2014-01-01

    The genomes of 66 LABs, belonging to five different genera, were compared for genome size and gene content. The analyzed genomes included 37 Lactobacillus genomes of 17 species, six Lactococcus lactis genomes, four Leuconostoc genomes of three species, six Streptococcus genomes of two species......, twelve Enterococcus genomes of four species and a single Weissella genome. Genomes of pathogenic strains or species were not included. Since the gene density in these genomes is relatively constant, genome size is a measure of gene content. The genomes of Enterococcus were significantly larger than...... that of the others, with the two Streptococcus species having the shortest genomes. The widest distribution in genome content was observed for Lactobacillus. The number of tRNA and rRNA gene copies varied considerably, with exceptional high numbers observed for Lb. delbrueckii, while these numbers were relatively...

  18. Marine Bacterial Genomics

    DEFF Research Database (Denmark)

    Machado, Henrique

    microorganisms to be used as cell factories for production. Therefore exploitation of new microbial niches and use of different strategies is an opportunity to boost discoveries. Even though scientists have started to explore several habitats other than the terrestrial ones, the marine environment stands out...... as a hitherto under-explored niche. This thesis work uses high-throughput sequencing technologies on a collection of marine bacteria established during the Galathea 3 expedition, with the purpose of unraveling new biodiversity and new bioactivities. Several tools were used for genomic analysis in order...... to better understand the potential harbored in marine bacteria. The work presented makes use of whole genome sequencing of marine bacteria to prove that the genetic repertoire for secondary metabolite production harbored in these bacteria is far larger than anticipated; to identify and develop a new...

  19. The Genomic Standards Consortium.

    Directory of Open Access Journals (Sweden)

    Dawn Field

    2011-06-01

    A vast and rich body of information has grown up as a result of the world's enthusiasm for 'omics technologies. Finding ways to describe and make available this information that maximise its usefulness has become a major effort across the 'omics world. At the heart of this effort is the Genomic Standards Consortium (GSC), an open-membership organization that drives community-based standardization activities. Here we provide a short history of the GSC, provide an overview of its range of current activities, and make a call for the scientific community to join forces to improve the quality and quantity of contextual information about our public collections of genomes, metagenomes, and marker gene sequences.

  20. Harnessing genomics to improve health in the Eastern Mediterranean Region - an executive course in genomics policy.

    Science.gov (United States)

    Acharya, Tara; Rab, Mohammed Abdur; Singer, Peter A; Daar, Abdallah S

    2005-01-21

    BACKGROUND: While innovations in medicine, science and technology have resulted in improved health and quality of life for many people, the benefits of modern medicine continue to elude millions of people in many parts of the world. To assess the potential of genomics to address health needs in EMR, the World Health Organization's Eastern Mediterranean Regional Office and the University of Toronto Joint Centre for Bioethics jointly organized a Genomics and Public Health Policy Executive Course, held September 20th-23rd, 2003, in Muscat, Oman. The 4-day course was sponsored by WHO-EMRO with additional support from the Canadian Program in Genomics and Global Health. The overall objective of the course was to collectively explore how to best harness genomics to improve health in the region. This article presents the course findings and recommendations for genomics policy in EMR. METHODS: The course brought together senior representatives from academia, biotechnology companies, regulatory bodies, media, voluntary, and legal organizations to engage in discussion. Topics covered included scientific advances in genomics, followed by innovations in business models, public sector perspectives, ethics, legal issues and national innovation systems. 
    RESULTS: A set of recommendations, summarized below, was formulated for the Regional Office, the Member States and for individuals.
    * Advocacy for genomics and biotechnology for political leadership;
    * Networking between member states to share information, expertise, training, and regional cooperation in biotechnology; coordination of national surveys for assessment of health biotechnology innovation systems, science capacity, government policies, legislation and regulations, intellectual property policies, private sector activity;
    * Creation in each member country of an effective National Body on genomics, biotechnology and health to:
      - formulate national biotechnology strategies
      - raise biotechnology awareness
      - encourage teaching and

  1. Are we Genomic Mosaics? Variations of the Genome of Somatic Cells can Contribute to Diversify our Phenotypes.

    Science.gov (United States)

    Astolfi, P A; Salamini, F; Sgaramella, V

    2010-09-01

    Theoretical and experimental evidence supports the hypothesis that the genomes and the epigenomes may be different in the somatic cells of complex organisms. In the genome, the differences range from single base substitutions to chromosome number; in the epigenome, they entail multiple postsynthetic modifications of the chromatin. Somatic genome variations (SGV) may accumulate during development in response both to genetic programs, which may differ from tissue to tissue, and to environmental stimuli, which are often undetected and generally irreproducible. SGV may jeopardize physiological cellular functions, but also create novel coding and regulatory sequences, to be exposed to intraorganismal Darwinian selection. Genomes acknowledged as comparatively poor in genes, such as humans', could thus increase their pristine informational endowment. A better understanding of SGV will contribute to basic issues such as the "nature vs nurture" dualism and the inheritance of acquired characters. On the applied side, they may explain the low yield of cloning via somatic cell nuclear transfer, provide clues to some of the problems associated with transdifferentiation, and interfere with individual DNA analysis. SGV may be unique in the different cell types and in the different developmental stages, and thus explain the several hundred gaps persisting in the human genomes "completed" so far. They may compound the variations associated with our epigenomes and make each of us an "(epi)genomic" mosaic. An ensuing paradigm is the possibility that a single genome (the ephemeral one assembled at fertilization) has the capacity to generate several different brains in response to different environments.

  2. Genome size analyses of Pucciniales reveal the largest fungal genomes

    Directory of Open Access Journals (Sweden)

    Silvia Tavares

    2014-08-01

    Rust fungi (Basidiomycota, Pucciniales) are biotrophic plant pathogens which exhibit diverse complexities in their life cycles and host ranges. The completion of genome sequencing of a few rust fungi has revealed the occurrence of large genomes. Sequencing efforts for other rust fungi have been hampered by uncertainty concerning their genome sizes. Flow cytometry was recently applied to estimate the genome size of a few rust fungi, and confirmed the occurrence of large genomes in this order (averaging 151.5 Mbp, while the average for Basidiomycota was 49.9 Mbp and was 37.7 Mbp for all fungi). In this work, we have used an innovative and simple approach to simultaneously isolate nuclei from the rust and its host plant in order to estimate the genome size of 30 rust species by flow cytometry. Genome sizes varied over 10-fold, from 70 to 893 Mbp, with an average genome size value of 380.2 Mbp. Compared to the genome sizes of over 1,800 fungi, Gymnosporangium confusum possesses the largest fungal genome ever reported (893.2 Mbp). Moreover, even the smallest rust genome determined in this study is larger than the vast majority of fungal genomes (94%). The average genome size of the Pucciniales is now 305.5 Mbp, while the average Basidiomycota genome size has shifted to 70.4 Mbp and the average for all fungi reached 44.2 Mbp. Despite the fact that no correlation could be drawn between the genome sizes, the phylogenomics or the life cycle of rust fungi, it is interesting to note that rusts with Fabaceae hosts present genomes clearly larger than those with Poaceae hosts. Although this study comprises only a small fraction of the more than 7,000 rust species described, it seems already evident that the Pucciniales represent a group where genome size expansion could be a common characteristic. This is in sharp contrast to sister taxa, placing this order in a relevant position in fungal genomics research.

  3. Genomic Feature Models

    DEFF Research Database (Denmark)

    Sørensen, Peter; Edwards, Stefan McKinnon; Rohde, Palle Duun

    Whole-genome sequences and multiple trait phenotypes from large numbers of individuals will soon be available in many populations. Well established statistical modeling approaches enable the genetic analyses of complex trait phenotypes while accounting for a variety of additive and non-additive g...... regions and gene ontologies) that provide better model fit and increase predictive ability of the statistical model for this trait....

  4. Malaria Genome Sequencing Project

    Science.gov (United States)

    2004-01-01

    million cases and up to 2.7 million deaths from malaria each year. The mortality levels are greatest in sub-Saharan Africa...A whole chromosome shotgun sequencing strategy was used to determine the genome sequence of P. falciparum clone 3D7. This...aminolevulinic acid dehydratase. Curr. Genet. 40, 391-398 (2002). 15. Lasonder, E. et al. Analysis of the Plasmodium falciparum proteome by high-accuracy mass

  5. Genomic landscape of liposarcoma.

    Science.gov (United States)

    Kanojia, Deepika; Nagata, Yasunobu; Garg, Manoj; Lee, Dhong Hyun; Sato, Aiko; Yoshida, Kenichi; Sato, Yusuke; Sanada, Masashi; Mayakonda, Anand; Bartenhagen, Christoph; Klein, Hans-Ulrich; Doan, Ngan B; Said, Jonathan W; Mohith, S; Gunasekar, Swetha; Shiraishi, Yuichi; Chiba, Kenichi; Tanaka, Hiroko; Miyano, Satoru; Myklebost, Ola; Yang, Henry; Dugas, Martin; Meza-Zepeda, Leonardo A; Silberman, Allan W; Forscher, Charles; Tyner, Jeffrey W; Ogawa, Seishi; Koeffler, H Phillip

    2015-12-15

    Liposarcoma (LPS) is the most common type of soft tissue sarcoma, accounting for 20% of all adult sarcomas. Due to the absence of clinically effective treatment options in inoperable situations and resistance to chemotherapeutics, a critical need exists to identify novel therapeutic targets. We analyzed the LPS genomic landscape using SNP arrays, whole exome sequencing and targeted exome sequencing to uncover the genomic information for development of specific anti-cancer targets. SNP array analysis indicated known amplified genes (MDM2, CDK4, HMGA2) and important novel genes (UAP1, MIR557, LAMA4, CPM, IGF2, ERBB3, IGF1R). Carboxypeptidase M (CPM), a recurrently amplified gene in well-differentiated/de-differentiated LPS, was noted as a putative oncogene involved in the EGFR pathway. Notable deletions were found at chromosome 1p (RUNX3, ARID1A), chromosome 11q (ATM, CHEK1) and chromosome 13q14.2 (MIR15A, MIR16-1). Significantly and recurrently mutated genes (false discovery rate < 0.05) included PLEC (27%), MXRA5 (21%), FAT3 (24%), NF1 (20%), MDC1 (10%), TP53 (7%) and CHEK2 (6%). Further, in vitro and in vivo functional studies provided evidence for the tumor suppressor role of the Neurofibromin 1 (NF1) gene in different subtypes of LPS. Pathway analysis of recurrent mutations demonstrated that signaling through the MAPK, JAK-STAT, Wnt, ErbB, axon guidance, apoptosis, DNA damage repair and cell cycle pathways was involved in liposarcomagenesis. Interestingly, we also found mutational and copy number heterogeneity within a primary LPS tumor, signifying the importance of multi-region sequencing for cancer-genome guided therapy. In summary, these findings provide insight into the genomic complexity of LPS and highlight potential druggable pathways for a targeted therapeutic approach.

  6. The Giardia lamblia genome.

    Science.gov (United States)

    Adam, R D

    2000-04-10

    Giardia lamblia is a protozoan parasite of humans and other mammals that is thought to be one of the most primitive extant eukaryotic organisms. Although distinctly eukaryotic, it is notable for its lack of mitochondria, nucleoli, and peroxisomes. It has been suggested that Giardia spp. are pre-mitochondriate organisms, but the identification of genes in G. lamblia thought to be of mitochondrial origin has generated controversy regarding that designation. Giardia lamblia trophozoites have two nuclei that are identical in all ways that have been studied. They are polyploid with at least four, and perhaps eight or more, copies of each of five chromosomes per organism and have an estimated genome complexity of 1.2 x 10^7 bp of DNA and a GC content of 46%. There is evidence for recombination at the telomeres of some of the chromosomes, and multiple size variants of single chromosomes have been identified within cloned isolates. However, the internal regions of the chromosomes demonstrate no evidence of recombination. For example, there is no evidence for control of vsp gene expression by DNA recombination, and no evidence for rapid mutation in the vsp genes. Single pass sequences of approximately 9% of the G. lamblia genome have already been obtained. An ongoing genome project plans to obtain approximately 95% of the genome by a random approach, as well as a complete physical map using a bacterial artificial chromosome library. The results will facilitate a better understanding of the biology of Giardia spp. as well as their phylogenetic relationship to other primitive organisms.

  7. Microarray Genomic Systems Development

    Science.gov (United States)

    2008-06-01

    D Canada Contract Report DRDC Suffield CR 2009-145 June 2008 V. Lam, M. Crichton, T. Dickinson Laing, and D.C. Mah Canada West Biosciences Inc...Genomic Systems Development V. Lam, M. Crichton, T. Dickinson Laing, and D.C. Mah Canada West Biosciences Inc. Canada West Biosciences Inc. 5429... Crichton, M.; Dickinson Laing, T.; Mah, D.C.; DRDC Suffield CR 2009-145; Defence R&D Canada – Suffield; June 2008. Introduction: Conventional

  8. Conditioned genome reconstruction: how to avoid choosing the conditioning genome.

    Science.gov (United States)

    Spencer, Matthew; Bryant, David; Susko, Edward

    2007-02-01

    Genome phylogenies can be inferred from data on the presence and absence of genes across taxa. Logdet distances may be a good method, because they allow expected genome size to vary across the tree. Recently, Lake and Rivera proposed conditioned genome reconstruction (calculation of logdet distances using only those genes present in a conditioning genome) to deal with unobservable genes that are absent from every taxon of interest. We prove that their method can consistently estimate the topology for almost any choice of conditioning genome. Nevertheless, the choice of conditioning genome is important for small samples. For real bacterial genome data, different choices of conditioning genome can result in strong bootstrap support for different tree topologies. To overcome this problem, we developed supertree methods that combine information from all choices of conditioning genome. One of these methods, based on the BIONJ algorithm, performs well on simulated data and may have applications to other supertree problems. However, an analysis of 40 bacterial genomes using this method supports an incorrect clade of parasites. This is a common feature of model-based gene content methods and is due to parallel gene loss.
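
    The conditioning step described in this abstract can be sketched for a single pair of genomes. This is a minimal sketch, not the authors' estimator: it uses one common normalization of the logdet (paralinear) distance for two-state presence/absence data, and the function name, input format, and handling of degenerate cases are assumptions. Only genes present in the conditioning genome contribute to the 2x2 table, which makes the "absent from both" cell observable.

```python
import math


def conditioned_logdet_distance(x, y, conditioning):
    """Logdet-style distance between genomes x and y from gene presence flags.

    x, y, conditioning: equal-length sequences of 0/1 flags, one per gene.
    Genes absent from the conditioning genome are ignored.
    """
    counts = [[0, 0], [0, 0]]
    for xi, yi, ci in zip(x, y, conditioning):
        if ci:  # keep only genes present in the conditioning genome
            counts[xi][yi] += 1

    total = sum(map(sum, counts))
    if total == 0:
        return math.inf
    f = [[c / total for c in row] for row in counts]

    det_f = f[0][0] * f[1][1] - f[0][1] * f[1][0]
    row = [f[0][0] + f[0][1], f[1][0] + f[1][1]]  # marginals for x
    col = [f[0][0] + f[1][0], f[0][1] + f[1][1]]  # marginals for y
    norm = math.sqrt(row[0] * row[1] * col[0] * col[1])
    if det_f <= 0 or norm == 0:
        return math.inf  # distance undefined for this table (assumed convention)
    return -math.log(det_f / norm)


# Hypothetical presence/absence vectors over eight genes.
cond = [1, 1, 1, 1, 1, 1, 1, 1]
gx = [1, 1, 0, 0, 1, 0, 1, 1]
gy = [1, 0, 0, 0, 1, 0, 1, 1]
d_xy = conditioned_logdet_distance(gx, gy, cond)
print(d_xy)
```

    A full analysis would compute such distances for all genome pairs and pass the matrix to a distance-based tree method; the supertree step that combines results across conditioning genomes is not shown.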

  9. Personalized genomic medicine with a patchwork, partially owned genome.

    Science.gov (United States)

    Mason, Christopher E; Seringhaus, Michael R; Sattler de Sousa e Brito, Clara

    2007-12-01

    "His book was known as the Book of Sand, because neither the book nor the sand have any beginning or end." - Jorge Luis Borges. The human genome is a three billion-letter recipe for the genesis of a human being, directing development from a single-celled embryo to the trillions of adult cells. Since the sequencing of the human genome was announced in 2001, researchers have an increased ability to discern the genetic basis for diseases. This reference genome has opened the door to genomic medicine, aimed at detecting and understanding all genetic variations of the human genome that contribute to the manifestation and progression of disease. The overarching vision of genomic (or "personalized") medicine is to custom-tailor each treatment for maximum effectiveness in an individual patient. Detecting the variation in a patient's deoxyribonucleic acid (DNA), ribonucleic acid (RNA), and protein structures is no longer an insurmountable hurdle. Today, the challenge for genomic medicine lies in contextualizing those myriad genetic variations in terms of their functional consequences for a person's health and development throughout life and in terms of that patient's susceptibility to disease and differential clinical responses to medication. Additionally, several recent developments have complicated our understanding of the nominal human genome and, thereby, altered the progression of genomic medicine. In this brief review, we shall focus on these developments and examine how they are changing our understanding of our genome.

  10. Personalized Genomic Medicine with a Patchwork, Partially Owned Genome

    Science.gov (United States)

    Mason, Christopher E.; Seringhaus, Michael R.; Sattler de Sousa e Brito, Clara

    2008-01-01

    “His book was known as the Book of Sand, because neither the book nor the sand have any beginning or end.” — Jorge Luis Borges The human genome is a three billion-letter recipe for the genesis of a human being, directing development from a single-celled embryo to the trillions of adult cells. Since the sequencing of the human genome was announced in 2001, researchers have an increased ability to discern the genetic basis for diseases. This reference genome has opened the door to genomic medicine, aimed at detecting and understanding all genetic variations of the human genome that contribute to the manifestation and progression of disease. The overarching vision of genomic (or “personalized”) medicine is to custom-tailor each treatment for maximum effectiveness in an individual patient. Detecting the variation in a patient’s deoxyribonucleic acid (DNA), ribonucleic acid (RNA), and protein structures is no longer an insurmountable hurdle. Today, the challenge for genomic medicine lies in contextualizing those myriad genetic variations in terms of their functional consequences for a person’s health and development throughout life and in terms of that patient’s susceptibility to disease and differential clinical responses to medication. Additionally, several recent developments have complicated our understanding of the nominal human genome and, thereby, altered the progression of genomic medicine. In this brief review, we shall focus on these developments and examine how they are changing our understanding of our genome. PMID:18449389

  11. eGenomics: Cataloguing Our Complete Genome Collection III

    Directory of Open Access Journals (Sweden)

    Dawn Field

    2007-01-01

    This meeting report summarizes the proceedings of the “eGenomics: Cataloguing our Complete Genome Collection III” workshop held September 11–13, 2006, at the National Institute for Environmental eScience (NIEeS), Cambridge, United Kingdom. This 3rd workshop of the Genomic Standards Consortium was divided into two parts. The first half of the three-day workshop was dedicated to reviewing the genomic diversity of our current and future genome and metagenome collection, and exploring linkages to a series of existing projects through formal presentations. The second half was dedicated to strategic discussions. Outcomes of the workshop include a revised “Minimum Information about a Genome Sequence” (MIGS) specification (v1.1), consensus on a variety of features to be added to the Genome Catalogue (GCat), agreement by several researchers to adopt MIGS for imminent genome publications, and an agreement by the EBI and NCBI to input their genome collections into GCat for the purpose of quantifying the amount of optional data already available (e.g., for geographic location coordinates) and working towards a single, global list of all public genomes and metagenomes.

  12. Mapping the human genome

    Energy Technology Data Exchange (ETDEWEB)

    Cantor, Charles R.

    1989-06-01

    The following pages aim to lay a foundation for understanding the excitement surrounding the "human genome project," as well as to convey a flavor of the ongoing efforts and plans at the Human Genome Center at the Lawrence Berkeley Laboratory. Our own work, of course, is only part of a broad international effort that will dramatically enhance our understanding of human molecular genetics before the end of this century. In this country, the bulk of the effort will be carried out under the auspices of the Department of Energy and the National Institutes of Health, but significant contributions have already been made both by nonprofit private foundations and by private corporations. The respective roles of the DOE and the NIH are being coordinated by an inter-agency committee, the aims of which are to emphasize the strengths of each agency, to facilitate cooperation, and to avoid unnecessary duplication of effort. The NIH, for example, will continue its crucial work in medical genetics and in mapping the genomes of nonhuman species. The DOE, on the other hand, has unique experience in managing large projects, and its national laboratories are repositories of expertise in physics, engineering, and computer science, as well as the life sciences. The tools and techniques the project will ultimately rely on are thus likely to be developed in multidisciplinary efforts at laboratories like LBL. Accordingly, we at LBL take great pride in this enterprise -- an enterprise that will eventually transform our understanding of ourselves.

  13. Family genome browser: visualizing genomes with pedigree information.

    Science.gov (United States)

    Juan, Liran; Liu, Yongzhuang; Wang, Yongtian; Teng, Mingxiang; Zang, Tianyi; Wang, Yadong

    2015-07-15

    Families with inherited diseases are widely used in Mendelian/complex disease studies. Owing to the advances in high-throughput sequencing technologies, family genome sequencing is becoming more and more prevalent. Visualizing family genomes can greatly facilitate human genetics studies and personalized medicine. However, due to the complex genetic relationships and high similarities among genomes of consanguineous family members, family genomes are difficult to visualize in traditional genome visualization frameworks. How to visualize the family genome variants and their functions with integrated pedigree information remains a critical challenge. We developed the Family Genome Browser (FGB) to provide comprehensive analysis and visualization for family genomes. The FGB can visualize family genomes effectively at both the individual level and the variant level by integrating genome data with pedigree information. Family genome analysis, including determination of parental origin of the variants, detection of de novo mutations, identification of potential recombination events and identical-by-descent segments, etc., can be performed flexibly. Diverse annotations for the family genome variants, such as dbSNP memberships, linkage disequilibriums, genes, variant effects, potential phenotypes, etc., are illustrated as well. Moreover, the FGB can automatically search de novo mutations and compound heterozygous variants for a selected individual, and guide investigators to find high-risk genes with flexible navigation options. These features enable users to investigate and understand family genomes intuitively and systematically. The FGB is available at http://mlg.hit.edu.cn/FGB/. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  14. Complete Genome Sequences of Bordetella pertussis Vaccine Reference Strains 134 and 10536

    Science.gov (United States)

    Peng, Yanhui; Loparev, Vladimir; Batra, Dhwani; Burroughs, Mark; Johnson, Taccara; Juieng, Phalasy; Rowe, Lori; Tondella, M. Lucia; Williams, Margaret M.

    2016-01-01

    Vaccine formulations and vaccination programs against whooping cough (pertussis) vary worldwide. Here, we report the complete genome sequences of two divergent Bordetella pertussis reference strains used in the production of pertussis vaccines. PMID:27635001

  15. Complete Genome Sequences of Bordetella pertussis Vaccine Reference Strains 134 and 10536.

    Science.gov (United States)

    Weigand, Michael R; Peng, Yanhui; Loparev, Vladimir; Batra, Dhwani; Burroughs, Mark; Johnson, Taccara; Juieng, Phalasy; Rowe, Lori; Tondella, M Lucia; Williams, Margaret M

    2016-09-15

    Vaccine formulations and vaccination programs against whooping cough (pertussis) vary worldwide. Here, we report the complete genome sequences of two divergent Bordetella pertussis reference strains used in the production of pertussis vaccines.

  16. Score-based prediction of genomic islands in prokaryotic genomes using hidden Markov models

    Directory of Open Access Journals (Sweden)

    Surovcik Katharina

    2006-03-01

    Background: Horizontal gene transfer (HGT) is considered a strong evolutionary force shaping the content of microbial genomes in a substantial manner. It is the difference in speed enabling the rapid adaptation to changing environmental demands that distinguishes HGT from gene genesis, duplications or mutations. For a precise characterization, algorithms are needed that identify transfer events with high reliability. Frequently, the transferred pieces of DNA have a considerable length, comprise several genes and are called genomic islands (GIs) or more specifically pathogenicity or symbiotic islands. Results: We have implemented the program SIGI-HMM that predicts GIs and the putative donor of each individual alien gene. It is based on the analysis of codon usage (CU) of each individual gene of a genome under study. CU of each gene is compared against a carefully selected set of CU tables representing microbial donors or highly expressed genes. Multiple tests are used to identify putatively alien genes, to predict putative donors and to mask putatively highly expressed genes. Thus, we determine the states and emission probabilities of an inhomogeneous hidden Markov model working on gene level. For the transition probabilities, we draw upon classical test theory with the intention of integrating a sensitivity controller in a consistent manner. SIGI-HMM was written in JAVA and is publicly available. It accepts as input any file created according to the EMBL format. It generates output in the common GFF format readable by genome browsers. Benchmark tests showed that the output of SIGI-HMM is in agreement with known findings. Its predictions were consistent both with annotated GIs and with predictions generated by different methods. Conclusion: SIGI-HMM is a sensitive tool for the identification of GIs in microbial genomes. It allows genomes to be analyzed interactively in detail and hypotheses to be generated or tested about the origin of acquired
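
    As a rough illustration of the codon usage comparison described in the abstract above, the sketch below computes per-gene codon frequencies and a log-odds score of a gene against two codon usage tables. The function names, the pseudo-frequency, and the toy tables are assumptions made for this example; the actual tests and hidden Markov model used by SIGI-HMM are not reproduced here.

```python
import math
from collections import Counter

def split_codons(seq):
    """Split a coding sequence into codons, ignoring a trailing partial codon."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]

def codon_usage(seq):
    """Relative frequency of each codon observed in one gene."""
    counts = Counter(split_codons(seq))
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}

def log_odds(seq, table_a, table_b, pseudo=1e-6):
    """Sum of log(p_a / p_b) over the gene's codons.

    Positive values mean the gene resembles table_a more than table_b.
    `pseudo` (a hypothetical choice) avoids log(0) for codons missing from a table.
    """
    score = 0.0
    for codon in split_codons(seq):
        p_a = table_a.get(codon, pseudo)
        p_b = table_b.get(codon, pseudo)
        score += math.log(p_a / p_b)
    return score
```

    For example, with a toy host table `{"GCT": 0.9, "GCC": 0.1}` and a toy donor table `{"GCT": 0.1, "GCC": 0.9}`, a gene made only of GCT codons scores positive against the host table and negative against the donor table.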

  17. Genomic analysis of the rainbow trout response to crowding

    Science.gov (United States)

    Genomic analyses have the potential to impact selective breeding programs by identifying markers as proxies for traits which are expensive or difficult to measure. One such set of traits is the physiological responses of rainbow trout to the stresses of the aquaculture environment. Typical stresso...

  18. Asthma and atopy - a total genome scan for susceptibility genes

    DEFF Research Database (Denmark)

    Haagerup, A; Bjerke, T; Schiøtz, P O;

    2002-01-01

    atopy, allergic asthma and increased total IgE. We performed a total genome scan using 446 microsatellite markers and obtained nonparametric linkage results from the MAPMAKER/SIBS computer program. RESULTS: Our study revealed four candidate regions (MLS > 2) on chromosome 1p36, 3q21-q22, 5q31 and 6p24-p...

  19. Whole-genome sequencing for comparative genomics and de novo genome assembly.

    Science.gov (United States)

    Benjak, Andrej; Sala, Claudia; Hartkoorn, Ruben C

    2015-01-01

    Next-generation sequencing technologies for whole-genome sequencing of mycobacteria are rapidly becoming an attractive alternative to more traditional sequencing methods. In particular, this technology is proving useful for genome-wide identification of mutations in mycobacteria (comparative genomics) as well as for de novo assembly of whole genomes. Next-generation sequencing, however, generates a vast quantity of data that can only be transformed into a usable and comprehensible form using bioinformatics. Here we describe the methodology one would use to prepare libraries for whole-genome sequencing, and the basic bioinformatics to identify mutations in a genome following Illumina HiSeq or MiSeq sequencing, as well as de novo genome assembly following sequencing using Pacific Biosciences (PacBio).

  20. High-Throughput Genomics Enhances Tomato Breeding Efficiency

    Science.gov (United States)

    Barone, A; Di Matteo, A; Carputo, D; Frusciante, L

    2009-01-01

    Tomato (Solanum lycopersicum) is considered a model plant species for a group of economically important crops, such as potato, pepper, and eggplant, since it exhibits a reduced genomic size (950 Mb), a short generation time, and routine transformation technologies. Moreover, it shares with the other Solanaceous plants the same haploid chromosome number and a high level of conserved genomic organization. Finally, many genomic and genetic resources are currently available for tomato, and the sequencing of its genome is in progress. These features make tomato an ideal species for theoretical studies and practical applications in the genomics field. The present review describes how structural genomics assists the selection of new varieties resistant to pathogens that cause damage to this crop. Many molecular markers highly linked to resistance genes and cloned resistance genes are available and could be used for a high-throughput screening of multiresistant varieties. Moreover, a new genomics-assisted breeding approach for improving fruit quality is presented and discussed. It relies on the identification of genetic mechanisms controlling the trait of interest through functional genomics tools. Following this approach, polymorphisms in major gene sequences responsible for variability in the expression of the trait under study are then exploited for simultaneously tracking favourable allele combinations in breeding programs using high-throughput genomic technologies. This aims at pyramiding, in the genetic background of commercial cultivars, alleles that increase their performance. In conclusion, tomato breeding strategies supported by advanced technologies are expected to target increased productivity and lower costs of improved genotypes even for complex traits. PMID:19721805

  1. Genomic prediction of traits related to canine hip dysplasia

    Directory of Open Access Journals (Sweden)

    Enrique Sanchez-Molano

    2015-03-01

    Increased concern for the welfare of pedigree dogs has led to development of selection programs against inherited diseases. An example is canine hip dysplasia (CHD), which has a moderate heritability and a high prevalence in some large-sized breeds. To date, selection using phenotypes has led to only modest improvement, and alternative strategies such as genomic selection may prove more effective. The primary aims of this study were to compare the performance of pedigree- and genomic-based breeding against CHD in the UK Labrador retriever population and to evaluate the performance of different genomic selection methods. A sample of 1179 Labrador Retrievers evaluated for CHD according to the UK scoring method (hip score, HS) was genotyped with the Illumina CanineHD BeadChip. Twelve functions of HS and its component traits were analyzed using different statistical methods (GBLUP, Bayes C and Single-Step methods), and results were compared with a pedigree-based approach (BLUP) using cross-validation. Genomic methods resulted in similar or higher accuracies than pedigree-based methods with training sets of 944 individuals for all but the untransformed HS, suggesting that genomic selection is an effective strategy. GBLUP and Bayes C gave similar prediction accuracies for HS and related traits, indicating a polygenic architecture. This conclusion was also supported by the low accuracies obtained in additional GBLUP analyses performed using only the SNPs with highest test statistics, also indicating that marker-assisted selection would not be as effective as genomic selection. A Single-Step method that combines genomic and pedigree information also showed higher accuracy than GBLUP and Bayes C for the log-transformed HS, which is currently used for pedigree-based evaluations in the UK. In conclusion, genomic selection is a promising alternative to pedigree-based selection against CHD, requiring more phenotypes with genomic data to improve further the accuracy
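
    GBLUP, mentioned in the abstract above, replaces the pedigree relationship matrix of BLUP with a genomic relationship matrix built from SNP genotypes. The sketch below shows one widely used construction (allele counts centred by twice the allele frequency and scaled by 2 Σ p(1 − p)). The abstract does not state which construction the study used, so the formula choice and the function name are assumptions for illustration only.

```python
def genomic_relationship_matrix(genotypes):
    """Build a genomic relationship matrix G from SNP allele counts.

    genotypes: list of individuals, each a list of allele counts (0, 1 or 2) per SNP.
    Returns G as a list of lists, with G = Z Z' / (2 * sum_j p_j * (1 - p_j)),
    where Z holds allele counts centred by twice the allele frequency p_j.
    Illustrative helper; not taken from the study described above.
    """
    n = len(genotypes)
    m = len(genotypes[0])

    # Allele frequency of the counted allele at each SNP.
    p = [sum(ind[j] for ind in genotypes) / (2.0 * n) for j in range(m)]

    denom = 2.0 * sum(pj * (1.0 - pj) for pj in p)
    if denom == 0:
        raise ValueError("all SNPs are monomorphic; G is undefined")

    # Centre each genotype by its expected value 2p.
    Z = [[ind[j] - 2.0 * p[j] for j in range(m)] for ind in genotypes]

    return [
        [sum(Z[i][k] * Z[j][k] for k in range(m)) / denom for j in range(n)]
        for i in range(n)
    ]
```

    In a GBLUP analysis this matrix would take the place of the pedigree relationship matrix in the mixed-model equations; solving those equations is outside the scope of this sketch.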

  2. Genome update: the 1000th genome - a cautionary tale

    DEFF Research Database (Denmark)

    Lagesen, Karin; Ussery, David; Wassenaar, Gertrude Maria

    2010-01-01

    There are now more than 1000 sequenced prokaryotic genomes deposited in public databases and available for analysis. Currently, although the sequence databases GenBank, DNA Database of Japan and EMBL are synchronized continually, there are slight differences in content at the genome level for a variety of logistical reasons, including differences in format and loading errors, such as those caused by file transfer protocol interruptions. This means that the 1000th genome will be different in the various databases. Some of the data on the highly accessed web pages are inaccurate, leading to false conclusions, for example about the largest bacterial genome sequenced. Biological diversity is far greater than many have thought. For example, analysis of multiple Escherichia coli genomes has led to an estimate of around 45 000 gene families, more genes than are recognized in the human genome. Moreover...

  3. Transgenerational developmental programming.

    Science.gov (United States)

    Aiken, Catherine E; Ozanne, Susan E

    2014-01-01

    The concept of developmental programming suggests that the early life environment influences offspring characteristics in later life, including the propensity to develop diseases such as the metabolic syndrome. There is now growing evidence that the effects of developmental programming may also manifest in further generations without further suboptimal exposure. This review considers the evidence, primarily from rodent models, for effects persisting to subsequent generations, and evaluates the mechanisms by which developmental programming may be transmitted to further generations. In particular, we focus on the potential role of the intrauterine environment in contributing to a developmentally programmed phenotype in subsequent generations. The literature was systematically searched at http://pubmed.org and http://scholar.google.com to identify published findings regarding transgenerational (F2 and beyond) developmental programming effects in human populations and animal models. Transmission of programming effects is often viewed as a form of epigenetic inheritance, either via the maternal or paternal line. Evidence exists for both germline and somatic inheritance of epigenetic modifications which may be responsible for phenotypic changes in further generations. However, there is increasing evidence for the role of both extra-genomic components of the zygote and the interaction of the developing conceptus with the intrauterine environment in propagating programming effects. The contribution of a suboptimal reproductive tract environment or maternal adaptations to pregnancy may be critical to inheritance of programming effects via the maternal line. As the effects of age exacerbate the programmed metabolic phenotype, advancing maternal age may increase the likelihood of developmental programming effects being transmitted to further generations. We suggest that developmental programming effects could be propagated through the maternal line de novo in generations

  4. GO4genome: A Prokaryotic Phylogeny Based on Genome Organization

    OpenAIRE

    Merkl, Rainer; Wiezer, Arnim

    2009-01-01

    Determining the phylogeny of closely related prokaryotes may fail in an analysis of rRNA or a small set of sequences. Whole-genome phylogeny utilizes the maximally available sample space. For a precise determination of genome similarity, two aspects have to be considered when developing an algorithm of whole-genome phylogeny: (1) gene order conservation is a more precise signal than gene content; and (2) when using sequence similarity, failures in identifying orthologues or the in situ replac...

  5. Integrated genome browser: visual analytics platform for genomics

    OpenAIRE

    2016-01-01

    Motivation: Genome browsers that support fast navigation through vast datasets and provide interactive visual analytics functions can help scientists achieve deeper insight into biological systems. Toward this end, we developed Integrated Genome Browser (IGB), a highly configurable, interactive and fast open source desktop genome browser. Results: Here we describe multiple updates to IGB, including all-new capabilities to display and interact with data from high-throughput sequencing experime...

  6. Comparative Genomics Reveals High Genomic Diversity in the Genus Photobacterium

    OpenAIRE

    Henrique Machado; Lone Gram

    2017-01-01

    Vibrionaceae is a large marine bacterial family, which can constitute up to 50% of the prokaryotic population in marine waters. Photobacterium is the second largest genus in the family and we used comparative genomics on 35 strains representing 16 of the 28 species described so far, to understand the genomic diversity present in the Photobacterium genus. Such understanding is important for ecophysiology studies of the genus. We used whole genome sequences to evaluate phylogenetic relationship...

  7. Connecting Genomic Alterations to Cancer Biology with Proteomics: The NCI Clinical Proteomic Tumor Analysis Consortium

    Energy Technology Data Exchange (ETDEWEB)

    Ellis, Matthew; Gillette, Michael; Carr, Steven A.; Paulovich, Amanda G.; Smith, Richard D.; Rodland, Karin D.; Townsend, Reid; Kinsinger, Christopher; Mesri, Mehdi; Rodriguez, Henry; Liebler, Daniel

    2013-10-03

    The National Cancer Institute (NCI) Clinical Proteomic Tumor Analysis Consortium is applying the latest generation of proteomic technologies to genomically annotated tumors from The Cancer Genome Atlas (TCGA) program, a joint initiative of the NCI and the National Human Genome Research Institute. By providing a fully integrated accounting of DNA, RNA, and protein abnormalities in individual tumors, these datasets will illuminate the complex relationship between genomic abnormalities and cancer phenotypes, thus producing biologic insights as well as a wave of novel candidate biomarkers and therapeutic targets amenable to verification using targeted mass spectrometry methods.

  8. Complete genome sequence of the Antarctic Halorubrum lacusprofundi type strain ACAM 34.

    Science.gov (United States)

    Anderson, Iain J; DasSarma, Priya; Lucas, Susan; Copeland, Alex; Lapidus, Alla; Del Rio, Tijana Glavina; Tice, Hope; Dalin, Eileen; Bruce, David C; Goodwin, Lynne; Pitluck, Sam; Sims, David; Brettin, Thomas S; Detter, John C; Han, Cliff S; Larimer, Frank; Hauser, Loren; Land, Miriam; Ivanova, Natalia; Richardson, Paul; Cavicchioli, Ricardo; DasSarma, Shiladitya; Woese, Carl R; Kyrpides, Nikos C

    2016-01-01

    Halorubrum lacusprofundi is an extreme halophile within the archaeal phylum Euryarchaeota. The type strain ACAM 34 was isolated from Deep Lake, Antarctica. H. lacusprofundi is of phylogenetic interest because it is distantly related to the haloarchaea that have previously been sequenced. It is also of interest because of its psychrotolerance. We report here the complete genome sequence of H. lacusprofundi type strain ACAM 34 and its annotation. This genome is part of a 2006 Joint Genome Institute Community Sequencing Program project to sequence genomes of diverse Archaea.

  9. Genomic Data Commons and Genomic Cloud Pilots - Google Hangout

    Science.gov (United States)

    Join us for a live, moderated discussion about two NCI efforts to expand access to cancer genomics data: the Genomic Data Commons and Genomic Cloud Pilots. NCI subject matter experts will include Louis M. Staudt, M.D., Ph.D., Director, Center for Cancer Genomics, and Warren Kibbe, Ph.D., Director, NCI Center for Biomedical Informatics and Information Technology; the discussion will be moderated by Anthony Kerlavage, Ph.D., Chief, Cancer Informatics Branch, Center for Biomedical Informatics and Information Technology. We welcome your questions before and during the Hangout on Twitter using the hashtag #AskNCI.

  10. The coffee genome hub: a resource for coffee genomes.

    Science.gov (United States)

    Dereeper, Alexis; Bocs, Stéphanie; Rouard, Mathieu; Guignon, Valentin; Ravel, Sébastien; Tranchant-Dubreuil, Christine; Poncet, Valérie; Garsmeur, Olivier; Lashermes, Philippe; Droc, Gaëtan

    2015-01-01

    The whole genome sequence of Coffea canephora, the perennial diploid species known as Robusta, has been recently released. In the context of the C. canephora genome sequencing project and to support post-genomics efforts, we developed the Coffee Genome Hub (http://coffee-genome.org/), an integrative genome information system that allows centralized access to genomics and genetics data and analysis tools to facilitate translational and applied research in coffee. We provide the complete genome sequence of C. canephora along with gene structure, gene product information, metabolism, gene families, transcriptomics, syntenic blocks, genetic markers and genetic maps. The hub relies on generic software (e.g. GMOD tools) for easy querying, visualizing and downloading research data. It includes a Genome Browser enhanced by a Community Annotation System, enabling the improvement of automatic gene annotation through an annotation editor. In addition, the hub aims at developing interoperability among other existing South Green tools managing coffee data (phylogenomics resources, SNPs) and/or supporting data analyses with the Galaxy workflow manager. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.

  11. Genomic Data Commons and Genomic Cloud Pilots - Google Hangout

    Science.gov (United States)

    Join us for a live, moderated discussion about two NCI efforts to expand access to cancer genomics data: the Genomic Data Commons and Genomic Cloud Pilots. NCI subject matter experts will include Louis M. Staudt, M.D., Ph.D., Director, Center for Cancer Genomics, and Warren Kibbe, Ph.D., Director, NCI Center for Biomedical Informatics and Information Technology; the discussion will be moderated by Anthony Kerlavage, Ph.D., Chief, Cancer Informatics Branch, Center for Biomedical Informatics and Information Technology. We welcome your questions before and during the Hangout on Twitter using the hashtag #AskNCI.

  12. Genome Update: alignment of bacterial chromosomes

    DEFF Research Database (Denmark)

    Ussery, David; Jensen, Mette; Poulsen, Tine Rugh

    2004-01-01

    There are four new microbial genomes listed in this month's Genome Update, three belonging to Gram-positive bacteria and one belonging to an archaeon that lives at pH 0; all of these genomes are listed in Table 1. The method of genome comparison this month is that of genome alignment and, as an example, an alignment of seven Staphylococcus aureus genomes and one Staphylococcus epidermidis genome is presented.

  13. Potential assessment of genome-wide association study and genomic selection in Japanese pear Pyrus pyrifolia.

    Science.gov (United States)

    Iwata, Hiroyoshi; Hayashi, Takeshi; Terakami, Shingo; Takada, Norio; Sawamura, Yutaka; Yamamoto, Toshiya

    2013-03-01

    Although the potential of marker-assisted selection (MAS) in fruit tree breeding has been reported, bi-parental QTL mapping before MAS has hindered the introduction of MAS to fruit tree breeding programs. Genome-wide association studies (GWAS) are an alternative to bi-parental QTL mapping in long-lived perennials. Selection based on genomic predictions of breeding values (genomic selection: GS) is another alternative for MAS. This study examined the potential of GWAS and GS in pear breeding with 76 Japanese pear cultivars to detect significant associations of 162 markers with nine agronomic traits. We applied multilocus Bayesian models accounting for ordinal categorical phenotypes for GWAS and GS model training. Significant associations were detected at harvest time, black spot resistance and the number of spurs and two of the associations were closely linked to known loci. Genome-wide predictions for GS were accurate at the highest level (0.75) in harvest time, at medium levels (0.38-0.61) in resistance to black spot, firmness of flesh, fruit shape in longitudinal section, fruit size, acid content and number of spurs and at low levels (pear.

  14. Genomic MRI - a Public Resource for Studying Sequence Patterns within Genomic DNA

    Science.gov (United States)

    Prakash, Ashwin; Bechtel, Jason; Fedorov, Alexei

    2011-01-01

    Non-coding genomic regions in complex eukaryotes, including intergenic areas, introns, and untranslated segments of exons, are profoundly non-random in their nucleotide composition and consist of a complex mosaic of sequence patterns. These patterns include so-called Mid-Range Inhomogeneity (MRI) regions -- sequences 30-10000 nucleotides in length that are enriched by a particular base or combination of bases (e.g. (G+T)-rich, purine-rich, etc.). MRI regions are associated with unusual (non-B-form) DNA structures that are often involved in regulation of gene expression, recombination, and other genetic processes (Fedorova & Fedorov 2010). The existence of a strong fixation bias within MRI regions against mutations that tend to reduce their sequence inhomogeneity additionally supports the functionality and importance of these genomic sequences (Prakash et al. 2009). Here we demonstrate a freely available Internet resource -- the Genomic MRI program package -- designed for computational analysis of genomic sequences in order to find and characterize various MRI patterns within them (Bechtel et al. 2008). This package also allows generation of randomized sequences with various properties and level of correspondence to the natural input DNA sequences. The main goal of this resource is to facilitate examination of vast regions of non-coding DNA that are still scarcely investigated and await thorough exploration and recognition. PMID:21610667
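
    The idea of a region "enriched by a particular base or combination of bases" can be made concrete with a sliding-window scan. The sketch below reports every window whose content of a chosen base set reaches a threshold. The window length, the threshold, and the function name are assumptions chosen for this example, and the scan is not the algorithm of the Genomic MRI package itself.

```python
def enriched_windows(seq, bases="GT", window=30, threshold=0.8):
    """Find windows enriched in a set of bases.

    Returns a list of (start, fraction) for every window of length `window`
    whose fraction of characters in `bases` is at least `threshold`.
    Hypothetical helper; parameter defaults are illustrative only.
    """
    seq = seq.upper()
    target = set(bases.upper())
    if len(seq) < window:
        return []

    hits = []
    # Count target bases in the first window, then update incrementally.
    count = sum(1 for ch in seq[:window] if ch in target)
    for start in range(len(seq) - window + 1):
        if start > 0:
            count -= int(seq[start - 1] in target)           # base leaving the window
            count += int(seq[start + window - 1] in target)  # base entering the window
        fraction = count / window
        if fraction >= threshold:
            hits.append((start, fraction))
    return hits
```

    Overlapping hits would normally be merged into a single region before reporting; that step is left out to keep the sketch short.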

  15. Savant Genome Browser 2: visualization and analysis for population-scale genomics.

    Science.gov (United States)

    Fiume, Marc; Smith, Eric J M; Brook, Andrew; Strbenac, Dario; Turner, Brian; Mezlini, Aziz M; Robinson, Mark D; Wodak, Shoshana J; Brudno, Michael

    2012-07-01

    High-throughput sequencing (HTS) technologies are providing an unprecedented capacity for data generation, and there is a corresponding need for efficient data exploration and analysis capabilities. Although most existing tools for HTS data analysis are developed for either automated (e.g. genotyping) or visualization (e.g. genome browsing) purposes, such tools are most powerful when combined. For example, integration of visualization and computation allows users to iteratively refine their analyses by updating computational parameters within the visual framework in real-time. Here we introduce the second version of the Savant Genome Browser, a standalone program for visual and computational analysis of HTS data. Savant substantially improves upon its predecessor and existing tools by introducing innovative visualization modes and navigation interfaces for several genomic datatypes, and synergizing visual and automated analyses in a way that is powerful yet easy even for non-expert users. We also present a number of plugins that were developed by the Savant Community, which demonstrate the power of integrating visual and automated analyses using Savant. The Savant Genome Browser is freely available (open source) at www.savantbrowser.com.

  16. Antiviral Defenses in Plants through Genome Editing

    Science.gov (United States)

    Romay, Gustavo; Bragard, Claude

    2017-01-01

    Studies of plant–virus interactions have increased our understanding of plant resistance mechanisms, providing new tools for crop improvement. In the last two decades, RNA interference, a post-transcriptional gene silencing approach, has been used to induce antiviral defenses in plants with the help of genetic engineering technologies. More recently, the new genome editing systems (GES) are revolutionizing the scope of tools available to confer virus resistance in plants. The most explored GES are zinc finger nucleases, transcription activator-like effector nucleases, and clustered regularly interspaced short palindromic repeats/Cas9 endonuclease. GES are engineered to target and introduce mutations, which can be deleterious, via double-strand breaks at specific DNA sequences by the error-prone non-homologous recombination end-joining pathway. Although GES have been engineered to target DNA, recent discoveries of GES targeting ssRNA molecules, including virus genomes, pave the way for further studies programming plant defense against RNA viruses. Most plant virus species have an RNA genome and at least 784 species have positive ssRNA. Here, we provide a summary of the latest progress in plant antiviral defenses mediated by GES. In addition, we briefly discuss the GES perspectives in light of the rebooted debate on genetically modified organisms (GMOs) and the current regulatory framework for agricultural products involving the use of such engineering technologies. PMID:28167937

  17. Big Data: Astronomical or Genomical?

    Directory of Open Access Journals (Sweden)

    Zachary D Stephens

    2015-07-01

    Genomics is a Big Data science and is going to get much bigger, very soon, but it is not known whether the needs of genomics will exceed other Big Data domains. Projecting to the year 2025, we compared genomics with three other major generators of Big Data: astronomy, YouTube, and Twitter. Our estimates show that genomics is a "four-headed beast"--it is either on par with or the most demanding of the domains analyzed here in terms of data acquisition, storage, distribution, and analysis. We discuss aspects of new technologies that will need to be developed to rise up and meet the computational challenges that genomics poses for the near future. Now is the time for concerted, community-wide planning for the "genomical" challenges of the next decade.

  18. Domestication genomics: evidence from animals.

    Science.gov (United States)

    Wang, Guo-Dong; Xie, Hai-Bing; Peng, Min-Sheng; Irwin, David; Zhang, Ya-Ping

    2014-02-01

    Animal domestication has far-reaching significance for human society. The sequenced genomes of domesticated animals provide critical resources for understanding the genetic basis of domestication. Various genomic analyses have shed a new light on the mechanism of artificial selection and have allowed the mapping of genes involved in important domestication traits. Here, we summarize the published genomes of domesticated animals that have been generated over the past decade, as well as their origins, from a phylogenomic point of view. This review provides a general description of the genomic features encountered under a two-stage domestication process. We also introduce recent findings for domestication traits based on results from genome-wide association studies and selective-sweep scans for artificially selected genomic regions. Particular attention is paid to issues relating to the costs of domestication and the convergent evolution of genes between domesticated animals and humans.

  19. Big Data: Astronomical or Genomical?

    Science.gov (United States)

    Stephens, Zachary D; Lee, Skylar Y; Faghri, Faraz; Campbell, Roy H; Zhai, Chengxiang; Efron, Miles J; Iyer, Ravishankar; Schatz, Michael C; Sinha, Saurabh; Robinson, Gene E

    2015-07-01

    Genomics is a Big Data science and is going to get much bigger, very soon, but it is not known whether the needs of genomics will exceed other Big Data domains. Projecting to the year 2025, we compared genomics with three other major generators of Big Data: astronomy, YouTube, and Twitter. Our estimates show that genomics is a "four-headed beast"--it is either on par with or the most demanding of the domains analyzed here in terms of data acquisition, storage, distribution, and analysis. We discuss aspects of new technologies that will need to be developed to rise up and meet the computational challenges that genomics poses for the near future. Now is the time for concerted, community-wide planning for the "genomical" challenges of the next decade.

  20. Genomics of Bacillus Species

    Science.gov (United States)

    Økstad, Ole Andreas; Kolstø, Anne-Brit

    Members of the genus Bacillus are rod-shaped spore-forming bacteria belonging to the Firmicutes, the low G+C gram-positive bacteria. The Bacillus genus was first described and classified by Ferdinand Cohn (Cohn, 1872), and Bacillus subtilis was defined as the type species (Soule, 1932). Several Bacilli may be linked to opportunistic infections. However, pathogenicity among Bacillus spp. is mainly a feature of bacteria belonging to the Bacillus cereus group, including B. cereus, Bacillus anthracis, and Bacillus thuringiensis. Here we review the genomics of B. cereus group bacteria in relation to their roles as etiological agents of two food poisoning syndromes (emetic and diarrhoeal).

  1. Bacterial genome reengineering.

    Science.gov (United States)

    Zhou, Jindan; Rudd, Kenneth E

    2011-01-01

    The web application PrimerPair at ecogene.org generates large sets of paired DNA sequences surrounding all protein and RNA genes of Escherichia coli K-12. Many DNA fragments, which these primers amplify, can be used to implement a genome reengineering strategy using complementary in vitro cloning and in vivo recombineering. The integration of a primer design tool with a model organism database increases the level of quality control. Computer-assisted design of gene primer pairs relies upon having highly accurate genomic DNA sequence information that exactly matches the DNA of the cells being used in the laboratory to ensure predictable DNA hybridizations. It is equally crucial to have confidence that the predicted start codons define the locations of genes accurately. Annotations in the EcoGene database are queried by PrimerPair to eliminate pseudogenes, IS elements, and other problematic genes before the design process starts. These projects progressively familiarize users with the EcoGene content, scope, and application interfaces that are useful for genome reengineering projects. The first protocol leads to the design of a pair of primer sequences that were used to clone and express a single gene. The N-terminal protein sequence was experimentally verified and the protein was detected in the periplasm. This is followed by instructions to design PCR primer pairs for cloning gene fragments encoding 50 periplasmic proteins without their signal peptides. The design process begins with the user simply designating one pair of forward and reverse primer endpoint positions relative to all start and stop codon positions. The gene name, genomic coordinates, and primer DNA sequences are reported to the user. When making chromosomal deletions, the integrity of the provisional primer design is checked to see whether it will generate any unwanted double deletions with adjacent genes. The bad designs are recalculated and replacement primers are provided alongside the

  2. Vaccinology in the genome era

    OpenAIRE

    Rinaudo, C. Daniela; Telford, John L.; Rappuoli, Rino; Seib, Kate L.

    2009-01-01

    Vaccination has played a significant role in controlling and eliminating life-threatening infectious diseases throughout the world, and yet currently licensed vaccines represent only the tip of the iceberg in terms of controlling human pathogens. However, as we discuss in this Review, the arrival of the genome era has revolutionized vaccine development and catalyzed a shift from conventional culture-based approaches to genome-based vaccinology. The availability of complete bacterial genomes h...

  3. Organizational heterogeneity of vertebrate genomes.

    Directory of Open Access Journals (Sweden)

    Svetlana Frenkel

    Genomes of higher eukaryotes are mosaics of segments with various structural, functional, and evolutionary properties. The availability of whole-genome sequences allows the investigation of their structure as "texts" using different statistical and computational methods. One such method, referred to as Compositional Spectra (CS) analysis, is based on scoring the occurrences of fixed-length oligonucleotides (k-mers) in the target DNA sequence. CS analysis allows generating species- or region-specific characteristics of the genome, regardless of their length and the presence of coding DNA. In this study, we consider the heterogeneity of vertebrate genomes as a joint effect of regional variation in sequence organization superimposed on the differences in nucleotide composition. We estimated compositional and organizational heterogeneity of genome and chromosome sequences separately and found that both heterogeneity types vary widely among genomes as well as among chromosomes in all investigated taxonomic groups. The high correspondence of heterogeneity scores obtained on three genome fractions, coding, repetitive, and the remaining part of the noncoding DNA (the genome dark matter--GDM) allows the assumption that CS-heterogeneity may have functional relevance to genome regulation. Of special interest for such interpretation is the fact that natural GDM sequences display the highest deviation from the corresponding reshuffled sequences.

  4. Phototroph genomics ten years on.

    Science.gov (United States)

    Raymond, Jason; Swingley, Wesley D

    2008-07-01

    The onset of the genome era means different things to different people, but it is clear that this new age brings with it paradigm shifts that will forever affect biological research. Less clear is just how these shifts are changing the scope and scale of research. Are gigabases of raw data more useful than a single well-understood gene? Do we really need a full genome to understand the physiology of a single organism? The photosynthetic field is poised at the periphery of the bulk of genome sequencing work--understandably skewed toward health-related disciplines--and, as such, is subject to different motivations, limitations, and primary focus for each new genome. To understand some of these differences, we focus here on various indicators of the impact that genomics has had on the photosynthetic community, now a full decade since the publication of the first photosynthetic genome. Many useful indicators are indexed in public databases, providing pre- and post-genome sequence snapshots of changes in factors such as publication rate, number of proteins characterized, and sequenced genome coverage versus known diversity. As more genomes are sequenced and metagenomic projects begin to pour out billions of bases, it becomes crucial to understand how to harness this data in order to accumulate possible benefits and avoid possible pitfalls, especially as resources become increasingly directed toward natural environments governed by photosynthetic activity, ranging from hot springs to tropical forest ecosystems to the open ocean.

  5. Comparative Microbial Genomics and Forensics.

    Science.gov (United States)

    Massey, Steven E

    2016-08-01

    Forensic science concerns the application of scientific techniques to questions of a legal nature and may also be used to address questions of historical importance. Forensic techniques are often used in legal cases that involve crimes against persons or property, and they increasingly may involve cases of bioterrorism, crimes against nature, medical negligence, or tracing the origin of food- and crop-borne disease. Given the rapid advance of genome sequencing and comparative genomics techniques, we ask how these might be used to address cases of a forensic nature, focusing on the use of microbial genome sequence analysis. Such analyses rely on the increasingly large numbers of microbial genomes present in public databases, the ability of individual investigators to rapidly sequence whole microbial genomes, and an increasing depth of understanding of their evolution and function. Suggestions are made as to how comparative microbial genomics might be applied forensically and may represent possibilities for the future development of forensic techniques. A particular emphasis is on the nascent field of genomic epidemiology, which utilizes rapid whole-genome sequencing to identify the source and spread of infectious outbreaks. Also discussed is the application of comparative microbial genomics to the study of historical epidemics and deaths and how the approaches developed may also be applicable to more recent and actionable cases.

  6. Organizational heterogeneity of vertebrate genomes.

    Science.gov (United States)

    Frenkel, Svetlana; Kirzhner, Valery; Korol, Abraham

    2012-01-01

    Genomes of higher eukaryotes are mosaics of segments with various structural, functional, and evolutionary properties. The availability of whole-genome sequences allows the investigation of their structure as "texts" using different statistical and computational methods. One such method, referred to as Compositional Spectra (CS) analysis, is based on scoring the occurrences of fixed-length oligonucleotides (k-mers) in the target DNA sequence. CS analysis allows generating species- or region-specific characteristics of the genome, regardless of their length and the presence of coding DNA. In this study, we consider the heterogeneity of vertebrate genomes as a joint effect of regional variation in sequence organization superimposed on the differences in nucleotide composition. We estimated compositional and organizational heterogeneity of genome and chromosome sequences separately and found that both heterogeneity types vary widely among genomes as well as among chromosomes in all investigated taxonomic groups. The high correspondence of heterogeneity scores obtained on three genome fractions, coding, repetitive, and the remaining part of the noncoding DNA (the genome dark matter--GDM) allows the assumption that CS-heterogeneity may have functional relevance to genome regulation. Of special interest for such interpretation is the fact that natural GDM sequences display the highest deviation from the corresponding reshuffled sequences.
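
    Scoring the occurrences of fixed-length oligonucleotides can be sketched as follows. This is a simplified illustration that counts exact, overlapping k-mers and normalizes them to frequencies; the abstract does not specify how CS analysis selects its word set or compares spectra, so the function names and the L1 distance used here are assumptions rather than the authors' method.

```python
from collections import Counter

def kmer_spectrum(sequence: str, k: int) -> dict[str, float]:
    """Relative frequencies of overlapping k-mers made only of A, C, G, T."""
    seq = sequence.upper()
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= set("ACGT")  # skip words containing N or other symbols
    )
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}

def spectrum_distance(a: dict[str, float], b: dict[str, float]) -> float:
    """L1 distance between two spectra (an illustrative choice of metric)."""
    keys = set(a) | set(b)
    return sum(abs(a.get(x, 0.0) - b.get(x, 0.0)) for x in keys)
```

    Comparing the spectra of different segments of one genome with such a distance gives a crude heterogeneity score; comparing a natural sequence with a reshuffled copy of itself follows the same pattern.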

  7. Preemptive public policy for genomics.

    Science.gov (United States)

    Carlson, Rick J

    2008-02-01

    To many, genomics is merely exploitable technology for the leviathan of biotechnology. This is both shallow and short sighted. Genomics is applied knowledge based on profound and evolving science about how living things develop, how healthy or sick we are, and what our future will be like. In health care, genomics technologies are disruptive yet potentially cost-effective because they enable primary prevention, the antidote to runaway costs and declining productivity. The challenges to integration are great, however, and many bioethical and social-policy implications are alarming. Because it is poorly understood today, we must debate genomics vigorously if we are to act wisely. Public policy must lead.

  8. Advances in yeast genome engineering.

    Science.gov (United States)

    David, Florian; Siewers, Verena

    2015-02-01

    Genome engineering based on homologous recombination has been applied to yeast for many years. However, the growing importance of yeast as a cell factory in metabolic engineering and chassis in synthetic biology demands methods for fast and efficient introduction of multiple targeted changes such as gene knockouts and introduction of multistep metabolic pathways. In this review, we summarize recent improvements of existing genome engineering methods, the development of novel techniques, for example for advanced genome redesign and evolution, and the importance of endonucleases as genome engineering tools.

  9. Scalable computing for evolutionary genomics.

    Science.gov (United States)

    Prins, Pjotr; Belhachemi, Dominique; Möller, Steffen; Smant, Geert

    2012-01-01

    Genomic data analysis in evolutionary biology is becoming so computationally intensive that analysis of multiple hypotheses and scenarios takes too long on a single desktop computer. In this chapter, we discuss techniques for scaling computations through parallelization of calculations, after giving a quick overview of advanced programming techniques. Unfortunately, parallel programming is difficult and requires special software design. The alternative, especially attractive for legacy software, is to introduce poor man's parallelization by running whole programs in parallel as separate processes, using job schedulers. Such pipelines are often deployed on bioinformatics computer clusters. Recent advances in PC virtualization have made it possible to run a full computer operating system, with all of its installed software, on top of another operating system, inside a "box," or virtual machine (VM). Such a VM can flexibly be deployed on multiple computers, in a local network, e.g., on existing desktop PCs, and even in the Cloud, to create a "virtual" computer cluster. Many bioinformatics applications in evolutionary biology can be run in parallel, running processes in one or more VMs. Here, we show how a ready-made bioinformatics VM image, named BioNode, effectively creates a computing cluster, and pipeline, in a few steps. This allows researchers to scale-up computations from their desktop, using available hardware, anytime it is required. BioNode is based on Debian Linux and can run on networked PCs and in the Cloud. Over 200 bioinformatics and statistical software packages, of interest to evolutionary biology, are included, such as PAML, Muscle, MAFFT, MrBayes, and BLAST. Most of these software packages are maintained through the Debian Med project. In addition, BioNode contains convenient configuration scripts for parallelizing bioinformatics software. Where Debian Med encourages packaging free and open source bioinformatics software through one central project
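
    Running whole programs in parallel as separate processes can be sketched in a few lines. This is not BioNode's own configuration or scheduler; it is a minimal single-machine illustration using the Python standard library, and the helper names `run_job` and `run_all` and the stand-in commands are assumptions. In practice each command would be an unmodified tool invocation, one per input file or hypothesis.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor

def run_job(cmd: list[str]) -> tuple[int, str]:
    """Run one external program to completion; return (exit code, stdout)."""
    done = subprocess.run(cmd, capture_output=True, text=True)
    return done.returncode, done.stdout

def run_all(commands: list[list[str]], max_workers: int = 4) -> list[tuple[int, str]]:
    """Run independent commands concurrently, each as its own OS process.

    Threads only wait on the child processes, so the real work happens in
    the separately launched programs. Results come back in input order.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_job, commands))

# Stand-in jobs (hypothetical): each launches a separate Python interpreter.
jobs = [[sys.executable, "-c", f"print({n} * {n})"] for n in range(4)]
```

    Calling `run_all(jobs)` launches the four commands side by side; a cluster job scheduler plays the same role across many machines.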

  10. The genome sequence of the colonial chordate, Botryllus schlosseri

    Science.gov (United States)

    Voskoboynik, Ayelet; Neff, Norma F; Sahoo, Debashis; Newman, Aaron M; Pushkarev, Dmitry; Koh, Winston; Passarelli, Benedetto; Fan, H Christina; Mantalas, Gary L; Palmeri, Karla J; Ishizuka, Katherine J; Gissi, Carmela; Griggio, Francesca; Ben-Shlomo, Rachel; Corey, Daniel M; Penland, Lolita; White, Richard A; Weissman, Irving L; Quake, Stephen R

    2013-01-01

    Botryllus schlosseri is a colonial urochordate that follows the chordate plan of development following sexual reproduction, but invokes a stem cell-mediated budding program during subsequent rounds of asexual reproduction. As urochordates are considered to be the closest living invertebrate relatives of vertebrates, they are ideal subjects for whole genome sequence analyses. Using a novel method for high-throughput sequencing of eukaryotic genomes, we sequenced and assembled 580 Mbp of the B. schlosseri genome. The genome assembly is comprised of nearly 14,000 intron-containing predicted genes, and 13,500 intron-less predicted genes, 40% of which could be confidently parceled into 13 (of 16 haploid) chromosomes. A comparison of homologous genes between B. schlosseri and other diverse taxonomic groups revealed genomic events underlying the evolution of vertebrates and lymphoid-mediated immunity. The B. schlosseri genome is a community resource for studying alternative modes of reproduction, natural transplantation reactions, and stem cell-mediated regeneration. DOI: http://dx.doi.org/10.7554/eLife.00569.001 PMID:23840927

  11. GEMBASSY: an EMBOSS associated software package for comprehensive genome analyses.

    Science.gov (United States)

    Itaya, Hidetoshi; Oshita, Kazuki; Arakawa, Kazuharu; Tomita, Masaru

    2013-08-29

    The popular European Molecular Biology Open Software Suite (EMBOSS) currently contains over 400 tools used in various areas of bioinformatics research, equipped with sophisticated development frameworks for interoperability and tool discoverability as well as rich documentation and various user interfaces. In order to further strengthen EMBOSS in the fields of genomics, we here present a novel EMBOSS associated software (EMBASSY) package named GEMBASSY, which adds more than 50 analysis tools from the G-language Genome Analysis Environment and its Representational State Transfer (REST) and SOAP web services. GEMBASSY basically contains wrapper programs of G-language REST/SOAP web services to provide intuitive and easy access to various annotations within complete genome flatfiles, as well as tools for analyzing nucleic composition, calculating codon usage, and visualizing genomic information. For example, analysis methods such as for calculating distance between sequences by genomic signatures and for predicting gene expression levels from codon usage bias are effective in the interpretation of meta-genomic and meta-transcriptomic data. GEMBASSY tools can be used seamlessly with other EMBOSS tools and UNIX command line tools. The source code written in C is available from GitHub (https://github.com/celery-kotone/GEMBASSY/) and the distribution package is freely available from the GEMBASSY web site (http://www.g-language.org/gembassy/).

  12. Genomic Prediction of Barley Hybrid Performance

    Directory of Open Access Journals (Sweden)

    Norman Philipp

    2016-07-01

    Hybrid breeding in barley (Hordeum vulgare L.) offers great opportunities to accelerate the rate of genetic improvement and to boost yield stability. A crucial requirement consists of the efficient selection of superior hybrid combinations. We used comprehensive phenotypic and genomic data from a commercial breeding program with the goal of examining the potential to predict the hybrid performances. The phenotypic data were comprised of replicated grain yield trials for 385 two-way and 408 three-way hybrids evaluated in up to 47 environments. The parental lines were genotyped using a 3k single nucleotide polymorphism (SNP) array based on an Illumina Infinium assay. We implemented ridge regression best linear unbiased prediction modeling for additive and dominance effects and evaluated the prediction ability using five-fold cross validations. The prediction ability of hybrid performances based on general combining ability (GCA) effects was moderate, amounting to 0.56 and 0.48 for two- and three-way hybrids, respectively. The potential of GCA-based hybrid prediction requires that both parental components have been evaluated in a hybrid background. This is not necessary for genomic prediction for which we also observed moderate cross-validated prediction abilities of 0.51 and 0.58 for two- and three-way hybrids, respectively. This exemplifies the potential of genomic prediction in hybrid barley. Interestingly, prediction ability using the two-way hybrids as training population and the three-way hybrids as test population or vice versa was low, presumably, because of the different genetic makeup of the parental source populations. Consequently, further research is needed to optimize genomic prediction approaches combining different source populations in barley.
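
    Ridge-regression prediction evaluated by five-fold cross-validation can be sketched as below. This is a simplified illustration with additive marker effects only and a fixed shrinkage parameter `lam`; the study also modeled dominance effects and would have estimated the shrinkage from variance components. The function names, the simulated marker matrix, and the parameter values are assumptions, and NumPy is assumed to be installed.

```python
import numpy as np

def ridge_fit(X: np.ndarray, y: np.ndarray, lam: float):
    """Fit marker effects by solving (Xc'Xc + lam*I) beta = Xc'(y - mean(y))."""
    mu = y.mean()
    col_means = X.mean(axis=0)
    Xc = X - col_means  # center markers so the intercept is the trait mean
    p = X.shape[1]
    beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(p), Xc.T @ (y - mu))
    return mu, col_means, beta

def cv_prediction_ability(X, y, lam=1.0, folds=5, seed=0) -> float:
    """Correlation between observed values and out-of-fold predictions."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(y))
    predicted = np.empty(len(y))
    for test_idx in np.array_split(order, folds):
        train_idx = np.setdiff1d(order, test_idx)
        mu, col_means, beta = ridge_fit(X[train_idx], y[train_idx], lam)
        predicted[test_idx] = mu + (X[test_idx] - col_means) @ beta
    return float(np.corrcoef(y, predicted)[0, 1])

# Simulated stand-in data (hypothetical): 200 lines, 50 markers coded -1/0/1.
rng = np.random.default_rng(1)
X = rng.integers(0, 3, size=(200, 50)).astype(float) - 1.0
y = X @ rng.normal(size=50) + rng.normal(scale=1.0, size=200)
ability = cv_prediction_ability(X, y, lam=1.0, folds=5)
```

    Here `ability` plays the role of the cross-validated prediction ability reported in the abstract, computed on simulated rather than real data.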

  13. VIGoR: Variational Bayesian Inference for Genome-Wide Regression

    Directory of Open Access Journals (Sweden)

    Akio Onogi

    2016-04-01

    Genome-wide regression using a number of genome-wide markers as predictors is now widely used for genome-wide association mapping and genomic prediction. We developed novel software for genome-wide regression which we named VIGoR (variational Bayesian inference for genome-wide regression). Variational Bayesian inference is computationally much faster than widely used Markov chain Monte Carlo algorithms. VIGoR implements seven regression methods, and is provided as a command line program package for Linux/Mac, and as a cross-platform R package. In addition to model fitting, cross-validation and hyperparameter tuning using cross-validation can be automatically performed by modifying a single argument. VIGoR is available at https://github.com/Onogi/VIGoR. The R package is also available at https://cran.r-project.org/web/packages/VIGoR/index.html.

  14. A simple and effective method for construction of Escherichia coli strains proficient for genome engineering.

    Science.gov (United States)

    Ryu, Young Shin; Biswas, Rajesh Kumar; Shin, Kwangsu; Parisutham, Vinuselvi; Kim, Suk Min; Lee, Sung Kuk

    2014-01-01

    Multiplex genome engineering is a standalone recombineering tool for large-scale programming and accelerated evolution of cells. However, this advanced genome engineering technique has been limited to use in selected bacterial strains. We developed a simple and effective strain-independent method for effective genome engineering in Escherichia coli. The method involves introducing a suicide plasmid carrying the λ Red recombination system into the mutS gene. The suicide plasmid can be excised from the chromosome via selection in the absence of antibiotics, thus allowing transient inactivation of the mismatch repair system during genome engineering. In addition, we developed another suicide plasmid that enables integration of large DNA fragments into the lacZ genomic locus. These features enable this system to be applied in the exploitation of the benefits of genome engineering in synthetic biology, as well as the metabolic engineering of different strains of E. coli.

  15. A simple and effective method for construction of Escherichia coli strains proficient for genome engineering.

    Directory of Open Access Journals (Sweden)

    Young Shin Ryu

    Multiplex genome engineering is a standalone recombineering tool for large-scale programming and accelerated evolution of cells. However, this advanced genome engineering technique has been limited to use in selected bacterial strains. We developed a simple and effective strain-independent method for effective genome engineering in Escherichia coli. The method involves introducing a suicide plasmid carrying the λ Red recombination system into the mutS gene. The suicide plasmid can be excised from the chromosome via selection in the absence of antibiotics, thus allowing transient inactivation of the mismatch repair system during genome engineering. In addition, we developed another suicide plasmid that enables integration of large DNA fragments into the lacZ genomic locus. These features enable this system to be applied in the exploitation of the benefits of genome engineering in synthetic biology, as well as the metabolic engineering of different strains of E. coli.

  16. Histones and genome integrity.

    Science.gov (United States)

    Williamson, Wes D; Pinto, Ines

    2012-01-01

    Chromosomes undergo extensive structural rearrangements during the cell cycle, from the most open chromatin state required for DNA replication to the highest level of compaction and condensation essential for mitotic segregation of sister chromatids. It is now widely accepted that chromatin is a highly dynamic structure that participates in all DNA-related functions, including transcription, DNA replication, repair, and mitosis; hence, histones have emerged as key players in these cellular processes. We review here the studies that implicate histones in functions that affect the chromosome cycle, defined as the cellular processes involved in the maintenance, replication, and segregation of chromosomal DNA. Disruption of the chromosome cycle affects the integrity of the cellular genome, leading to aneuploidy, polyploidy or cell death. Histone stoichiometry, mutations that affect the structure of the nucleosome core particle, and mutations that affect the structure and/or modifications of the histone tails, all have a direct impact on the fidelity of chromosome transmission and the integrity of the genome.

  17. Genomics of human longevity.

    Science.gov (United States)

    Slagboom, P E; Beekman, M; Passtoors, W M; Deelen, J; Vaarhorst, A A M; Boer, J M; van den Akker, E B; van Heemst, D; de Craen, A J M; Maier, A B; Rozing, M; Mooijaart, S P; Heijmans, B T; Westendorp, R G J

    2011-01-12

    In animal models, single-gene mutations in genes involved in insulin/IGF and target of rapamycin signalling pathways extend lifespan to a considerable extent. The genetic, genomic and epigenetic influences on human longevity are expected to be much more complex. Strikingly however, beneficial metabolic and cellular features of long-lived families resemble those in animals for whom the lifespan is extended by applying genetic manipulation and, especially, dietary restriction. Candidate gene studies in humans support the notion that human orthologues from longevity genes identified in lower species do contribute to longevity but that the influence of the genetic variants involved is small. Here we discuss how an integration of novel study designs, labour-intensive biobanking, deep phenotyping and genomic research may provide insights into the mechanisms that drive human longevity and healthy ageing, beyond the associations usually provided by molecular and genetic epidemiology. Although prospective studies of humans from the cradle to the grave have never been performed, it is feasible to extract life histories from different cohorts jointly covering the molecular changes that occur with age from early development all the way up to the age at death. By the integration of research in different study cohorts, and with research in animal models, biological research into human longevity is thus making considerable progress.

  18. Parsing of genomic graffiti

    Energy Technology Data Exchange (ETDEWEB)

    Tibbetts, C.; Golden, J. III; Torgersen, D. [Vanderbilt Univ. School of Engineering, Nashville, TN (United States)

    1996-12-31

    A focal point of modern biology is investigation of wide varieties of phenomena at the level of molecular genetics. The nucleotide sequences of deoxyribonucleic acid (DNA) and ribonucleic acid (RNA) define the ultimate resolution of this reductionist approach to understand the determinants of heritable traits. The structure and function of genes, their composite genomic organization, and their regulated expression have been studied in systems representing every class of organism. Many human diseases or pathogenic syndromes can be directly attributed to inherited defects in either the regulated expression, or the quality of the products of specific genes. Genetic determinants of susceptibility to infectious agents or environmental hazards are amply documented. Mapping and sequencing of the DNA molecules encoding human genes have provided powerful technology for pharmaceutical bioengineering and forensic investigations. From an alternative perspective, we may anticipate that voluminous archives of singular DNA sequences alone will not suffice to define and understand the functional determinants of genome organization, allelic diversity and evolutionary plasticity of living organisms. New insights will accumulate pertaining to human evolutionary origins and relationships of human biology to models based on other mammals. Investigators of population genetics and epidemiology now exploit the technology of molecular genetics to more powerfully probe variation within the human gene pool at the level of DNA sequences. 40 refs., 7 figs., 2 tabs.

  19. Genomics of Myeloproliferative Neoplasms.

    Science.gov (United States)

    Zoi, Katerina; Cross, Nicholas C P

    2017-03-20

    Myeloproliferative neoplasms (MPNs) are a group of related clonal hematologic disorders characterized by excess accumulation of one or more myeloid cell lineages and a tendency to transform to acute myeloid leukemia. Deregulated JAK2 signaling has emerged as the central phenotypic driver of BCR-ABL1-negative MPNs and a unifying therapeutic target. In addition, MPNs show unexpected layers of genetic complexity, with multiple abnormalities associated with disease progression, interactions between inherited factors and phenotype driver mutations, and effects related to the order in which mutations are acquired. Although morphology and clinical laboratory analysis continue to play an important role in defining these conditions, genomic analysis is providing a platform for better disease definition, more accurate diagnosis, direction of therapy, and refined prognostication. There is an emerging consensus with regard to many prognostic factors, but there is a clear need to synthesize genomic findings into robust, clinically actionable and widely accepted scoring systems as well as the need to standardize the laboratory methodologies that are used.

  20. Murasaki: a fast, parallelizable algorithm to find anchors from multiple genomes.

    Directory of Open Access Journals (Sweden)

    Kris Popendorf

    BACKGROUND: With the number of available genome sequences increasing rapidly, the magnitude of sequence data required for multiple-genome analyses is a challenging problem. When large-scale rearrangements break the collinearity of gene orders among genomes, genome comparison algorithms must first identify sets of short well-conserved sequences present in each genome, termed anchors. Previously, anchor identification among multiple genomes has been achieved using pairwise alignment tools like BLASTZ through progressive alignment tools like TBA, but the computational requirements for sequence comparisons of multiple genomes quickly become a limiting factor as the number and scale of genomes grows. METHODOLOGY/PRINCIPAL FINDINGS: Our algorithm, named Murasaki, makes it possible to identify anchors within multiple large sequences on the scale of several hundred megabases in a few minutes using a single CPU. Two advanced features of Murasaki are (1) adaptive hash function generation, which enables efficient use of arbitrary mismatch patterns (spaced seeds) and therefore the comparison of multiple mammalian genomes in a practical amount of computation time, and (2) parallelizable execution that decreases the required wall-clock and CPU times. Murasaki can perform a sensitive anchoring of eight mammalian genomes (human, chimp, rhesus, orangutan, mouse, rat, dog, and cow) in 21 hours CPU time (42 minutes wall time). This is the first single-pass in-core anchoring of multiple mammalian genomes. We evaluated Murasaki by comparing it with the genome alignment programs BLASTZ and TBA. We show that Murasaki can anchor multiple genomes in near linear time, compared to the quadratic time requirements of BLASTZ and TBA, while improving overall accuracy. CONCLUSIONS/SIGNIFICANCE: Murasaki provides an open source platform to take advantage of long patterns, cluster computing, and novel hash algorithms to produce accurate anchors across multiple genomes with

  1. Murasaki: a fast, parallelizable algorithm to find anchors from multiple genomes.

    Science.gov (United States)

    Popendorf, Kris; Tsuyoshi, Hachiya; Osana, Yasunori; Sakakibara, Yasubumi

    2010-09-24

    With the number of available genome sequences increasing rapidly, the magnitude of sequence data required for multiple-genome analyses is a challenging problem. When large-scale rearrangements break the collinearity of gene orders among genomes, genome comparison algorithms must first identify sets of short well-conserved sequences present in each genome, termed anchors. Previously, anchor identification among multiple genomes has been achieved using pairwise alignment tools like BLASTZ through progressive alignment tools like TBA, but the computational requirements for sequence comparisons of multiple genomes quickly become a limiting factor as the number and scale of genomes grows. Our algorithm, named Murasaki, makes it possible to identify anchors within multiple large sequences on the scale of several hundred megabases in a few minutes using a single CPU. Two advanced features of Murasaki are (1) adaptive hash function generation, which enables efficient use of arbitrary mismatch patterns (spaced seeds) and therefore the comparison of multiple mammalian genomes in a practical amount of computation time, and (2) parallelizable execution that decreases the required wall-clock and CPU times. Murasaki can perform a sensitive anchoring of eight mammalian genomes (human, chimp, rhesus, orangutan, mouse, rat, dog, and cow) in 21 hours CPU time (42 minutes wall time). This is the first single-pass in-core anchoring of multiple mammalian genomes. We evaluated Murasaki by comparing it with the genome alignment programs BLASTZ and TBA. We show that Murasaki can anchor multiple genomes in near linear time, compared to the quadratic time requirements of BLASTZ and TBA, while improving overall accuracy. Murasaki provides an open source platform to take advantage of long patterns, cluster computing, and novel hash algorithms to produce accurate anchors across multiple genomes with computational efficiency significantly greater than existing methods. Murasaki is available
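
    The spaced-seed idea mentioned in this abstract can be illustrated with a minimal Python sketch. It is not Murasaki's implementation: the function names and the `1`/`0` pattern notation are assumptions made for illustration, and an ordinary dictionary stands in for the adaptive hash functions the abstract describes. Each window is reduced to the bases under the `1` positions of the pattern, so windows that differ only at a `0` position share a key; keys seen in every sequence are reported as candidate anchor positions.

```python
from collections import defaultdict


def spaced_keys(seq, pattern):
    """Yield (key, position) for each window of seq.

    Only bases under a '1' in the pattern contribute to the key, so a
    mismatch under a '0' does not prevent two windows from matching.
    """
    keep = [i for i, flag in enumerate(pattern) if flag == "1"]
    for pos in range(len(seq) - len(pattern) + 1):
        window = seq[pos:pos + len(pattern)]
        yield "".join(window[i] for i in keep), pos


def find_anchor_candidates(seqs, pattern):
    """Return {key: [positions in seq 0, positions in seq 1, ...]}.

    Only keys that occur at least once in every input sequence are kept.
    """
    index = defaultdict(lambda: [[] for _ in seqs])
    for n, seq in enumerate(seqs):
        for key, pos in spaced_keys(seq, pattern):
            index[key][n].append(pos)
    return {key: hits for key, hits in index.items() if all(hits)}


if __name__ == "__main__":
    # Hypothetical toy input; real genomes are far larger.
    print(find_anchor_candidates(["ACGTACGT", "ACCTACGA"], "1101"))
```

    With the pattern `1101`, the windows `ACGT` and `ACCT` hash to the same key because they differ only at the ignored third position.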

  2. Genomic prediction using QTL derived from whole genome sequence data

    DEFF Research Database (Denmark)

    Brøndum, Rasmus Froberg; Su, Guosheng; Janss, Luc

    This study investigated the gain in accuracy of genomic prediction when a small number of significant variants from single marker analysis based on whole genome sequence data were added to the regular 54k SNP data. Analyses were performed for Nordic Holstein and Danish Jersey animals, using eithe...

  3. Insights into structural variations and genome rearrangements in prokaryotic genomes.

    Science.gov (United States)

    Periwal, Vinita; Scaria, Vinod

    2015-01-01

    Structural variations (SVs) are genomic rearrangements that affect fairly large fragments of DNA. Most of the SVs such as inversions, deletions and translocations have been largely studied in context of genetic diseases in eukaryotes. However, recent studies demonstrate that genome rearrangements can also have profound impact on prokaryotic genomes, leading to altered cell phenotype. In contrast to single-nucleotide variations, SVs provide a much deeper insight into organization of bacterial genomes at a much better resolution. SVs can confer change in gene copy number, creation of new genes, altered gene expression and many other functional consequences. High-throughput technologies have now made it possible to explore SVs at a much refined resolution in bacterial genomes. Through this review, we aim to highlight the importance of the less explored field of SVs in prokaryotic genomes and their impact. We also discuss its potential applicability in the emerging fields of synthetic biology and genome engineering where targeted SVs could serve to create sophisticated and accurate genome editing.

  4. Comparative genomics reveals insights into avian genome evolution and adaptation

    DEFF Research Database (Denmark)

    Zhang, Guojie; Li, Cai; Li, Qiye

    2014-01-01

    Birds are the most species-rich class of tetrapod vertebrates and have wide relevance across many research fields. We explored bird macroevolution using full genomes from 48 avian species representing all major extant clades. The avian genome is principally characterized by its constrained size, ...

  5. Organelle_PBA, a pipeline for assembling chloroplast and mitochondrial genomes from PacBio DNA sequencing data.

    Science.gov (United States)

    Soorni, Aboozar; Haak, David; Zaitlin, David; Bombarely, Aureliano

    2017-01-07

    The development of long-read sequencing technologies, such as single-molecule real-time (SMRT) sequencing by PacBio, has produced a revolution in the sequencing of small genomes. Sequencing organelle genomes using PacBio long-read data is a cost effective, straightforward approach. Nevertheless, the availability of simple-to-use software to perform the assembly from raw reads is limited at present. We present Organelle-PBA, a Perl program designed specifically for the assembly of chloroplast and mitochondrial genomes. For chloroplast genomes, the program selects the chloroplast reads from a whole genome sequencing pool, maps the reads to a reference sequence from a closely related species, and then performs read correction and de novo assembly using Sprai. Organelle-PBA completes the assembly process with the additional step of scaffolding by SSPACE-LongRead. The program then detects the chloroplast inverted repeats and reassembles and re-orients the assembly based on the organelle origin of the reference. We have evaluated the performance of the software using PacBio reads from different species, read coverage, and reference genomes. Finally, we present the assembly of two novel chloroplast genomes from the species Picea glauca (Pinaceae) and Sinningia speciosa (Gesneriaceae). Organelle-PBA is an easy-to-use Perl-based software pipeline that was written specifically to assemble mitochondrial and chloroplast genomes from whole genome PacBio reads. The program is available at https://github.com/aubombarely/Organelle_PBA .

  6. Genomic multiple sequence alignments: refinement using a genetic algorithm

    Directory of Open Access Journals (Sweden)

    Lefkowitz Elliot J

    2005-08-01

    Background: Genomic sequence data cannot be fully appreciated in isolation. Comparative genomics – the practice of comparing genomic sequences from different species – plays an increasingly important role in understanding the genotypic differences between species that result in phenotypic differences as well as in revealing patterns of evolutionary relationships. One of the major challenges in comparative genomics is producing a high-quality alignment between two or more related genomic sequences. In recent years, a number of tools have been developed for aligning large genomic sequences. Most utilize heuristic strategies to identify a series of strong sequence similarities, which are then used as anchors to align the regions between the anchor points. The resulting alignment is globally correct, but in many cases is suboptimal locally. We describe a new program, GenAlignRefine, which improves the overall quality of global multiple alignments by using a genetic algorithm to improve local regions of alignment. Regions of low quality are identified, realigned using the program T-Coffee, and then refined using a genetic algorithm. Because a better COFFEE (Consistency based Objective Function For alignmEnt Evaluation) score generally reflects greater alignment quality, the algorithm searches for an alignment that yields a better COFFEE score. To improve the intrinsic slowness of the genetic algorithm, GenAlignRefine was implemented as a parallel, cluster-based program. Results: We tested the GenAlignRefine algorithm by running it on a Linux cluster to refine sequences from a simulation, as well as refine a multiple alignment of 15 Orthopoxvirus genomic sequences approximately 260,000 nucleotides in length that initially had been aligned by Multi-LAGAN. It took approximately 150 minutes for a 40-processor Linux cluster to optimize some 200 fuzzy (poorly aligned) regions of the orthopoxvirus alignment. Overall sequence identity increased only

  7. SynTView — an interactive multi-view genome browser for next-generation comparative microorganism genomics

    Science.gov (United States)

    2013-01-01

    Background: Dynamic visualisation interfaces are required to explore the multiple microbial genome data now available, especially those obtained by high-throughput sequencing (a.k.a. “Next-Generation Sequencing”, NGS) technologies; they would also be useful for “standard” annotated genomes whose chromosome organizations may be compared. Although various software systems are available, few offer an optimal combination of feature-rich capabilities, non-static user interfaces and multi-genome data handling. Results: We developed SynTView, a comparative and interactive viewer for microbial genomes, designed to run as either a web-based tool (Flash technology) or a desktop application (AIR environment). The basis of the program is a generic genome browser with sub-maps holding information about genomic objects (annotations). The software is characterised by the presentation of syntenic organisations of microbial genomes and the visualisation of polymorphism data (typically Single Nucleotide Polymorphisms, SNPs) along these genomes; these features are accessible to the user in an integrated way. A variety of specialised views are available and are all dynamically inter-connected (including linear and circular multi-genome representations, dot plots, phylogenetic profiles, SNP density maps, and more). SynTView is not linked to any particular database, allowing the user to plug his own data into the system seamlessly, and use external web services for added functionalities. SynTView has now been used in several genome sequencing projects to help biologists make sense out of huge data sets. Conclusions: The most important assets of SynTView are: (i) the interactivity due to the Flash technology; (ii) the capabilities for dynamic interaction between many specialised views; and (iii) the flexibility allowing various user data sets to be integrated. It can thus be used to investigate massive amounts of information efficiently at the chromosome level. This

  8. Parasite Genome Projects and the Trypanosoma cruzi Genome Initiative

    Directory of Open Access Journals (Sweden)

    Wim Degrave

    1997-11-01

    Since the start of the human genome project, a great number of genome projects on other "model" organisms have been initiated, some of them already completed. Several initiatives have also been started on parasite genomes, mainly through support from WHO/TDR, involving North-South and South-South collaborations, and there are great hopes that these initiatives will lead to new tools for disease control and prevention, as well as to the establishment of genomic research technology in developing countries. The Trypanosoma cruzi genome project, using the clone CL-Brener as starting point, has made considerable progress through the concerted action of more than 20 laboratories, most of them in the South. A brief overview of the current state of the project is given.

  9. A Taste of Algal Genomes from the Joint Genome Institute

    Energy Technology Data Exchange (ETDEWEB)

    Kuo, Alan; Grigoriev, Igor

    2012-06-17

    Algae play profound roles in aquatic food chains and the carbon cycle, can impose health and economic costs through toxic blooms, provide models for the study of symbiosis, photosynthesis, and eukaryotic evolution, and are candidate sources for bio-fuels; all of these research areas are part of the mission of DOE's Joint Genome Institute (JGI). To date JGI has sequenced, assembled, annotated, and released to the public the genomes of 18 species and strains of algae, sampling almost all of the major clades of photosynthetic eukaryotes. With more algal genomes currently undergoing analysis, JGI continues its commitment to driving forward basic and applied algal science. Among these ongoing projects are the pan-genome of the dominant coccolithophore Emiliania huxleyi, the interrelationships between the 4 genomes in the nucleomorph-containing Bigelowiella natans and Guillardia theta, and the search for symbiosis genes of lichens.

  10. OMA 2011: orthology inference among 1000 complete genomes.

    Science.gov (United States)

    Altenhoff, Adrian M; Schneider, Adrian; Gonnet, Gaston H; Dessimoz, Christophe

    2011-01-01

    OMA (Orthologous MAtrix) is a database that identifies orthologs among publicly available, complete genomes. Initiated in 2004, the project is at its 11th release. It now includes 1000 genomes, making it one of the largest resources of its kind. Here, we describe recent developments in terms of species covered; the algorithmic pipeline--in particular regarding the treatment of alternative splicing, and new features of the web (OMA Browser) and programming interface (SOAP API). In the second part, we review the various representations provided by OMA and their typical applications. The database is publicly accessible at http://omabrowser.org.

  11. Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes

    DEFF Research Database (Denmark)

    Siepel, Adam; Bejerano, Gill; Pedersen, Jakob Skou;

    2005-01-01

    higher fractions of the more compact Drosophila melanogaster (37%-53%), Caenorhabditis elegans (18%-37%), and Saccharomyces cerevisiae (47%-68%) genomes. From yeasts to vertebrates, in order of increasing genome size and general biological complexity, increasing fractions of conserved bases are found...... species of Drosophila and Anopheles gambiae), two species of Caenorhabditis, and seven species of Saccharomyces. Conserved elements were identified with a computer program called phastCons, which is based on a two-state phylogenetic hidden Markov model (phylo-HMM). PhastCons works by fitting a phylo...

  12. Integration and visualization of systems biology data in context of the genome

    Directory of Open Access Journals (Sweden)

    Tenenbaum Dan

    2010-07-01

    Background: High-density tiling arrays and new sequencing technologies are generating rapidly increasing volumes of transcriptome and protein-DNA interaction data. Visualization and exploration of this data is critical to understanding the regulatory logic encoded in the genome by which the cell dynamically affects its physiology and interacts with its environment. Results: The Gaggle Genome Browser is a cross-platform desktop program for interactively visualizing high-throughput data in the context of the genome. Important features include dynamic panning and zooming, keyword search and open interoperability through the Gaggle framework. Users may bookmark locations on the genome with descriptive annotations and share these bookmarks with other users. The program handles large sets of user-generated data using an in-process database and leverages the facilities of SQL and the R environment for importing and manipulating data. A key aspect of the Gaggle Genome Browser is interoperability. By connecting to the Gaggle framework, the genome browser joins a suite of interconnected bioinformatics tools for analysis and visualization with connectivity to major public repositories of sequences, interactions and pathways. To this flexible environment for exploring and combining data, the Gaggle Genome Browser adds the ability to visualize diverse types of data in relation to its coordinates on the genome. Conclusions: Genomic coordinates function as a common key by which disparate biological data types can be related to one another. In the Gaggle Genome Browser, heterogeneous data are joined by their location on the genome to create information-rich visualizations yielding insight into genome organization, transcription and its regulation and, ultimately, a better understanding of the mechanisms that enable the cell to dynamically respond to its environment.
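
    The notion of genomic coordinates as a common key can be sketched with an in-process SQL database. The example below uses Python's standard `sqlite3` module with an in-memory database; the table layout, column names and the function `mean_signal_per_gene` are assumptions made for illustration and are not taken from the Gaggle Genome Browser. It joins a hypothetical gene annotation table to a hypothetical probe-signal table by chromosome and position, then averages the signal per gene.

```python
import sqlite3


def mean_signal_per_gene(genes, probes):
    """Join probe measurements to genes by genomic location.

    genes:  iterable of (name, chrom, start, stop), coordinates inclusive
    probes: iterable of (chrom, pos, value)
    Returns {gene name: mean value of the probes inside that gene}.
    """
    con = sqlite3.connect(":memory:")  # in-process database, nothing on disk
    con.execute(
        "CREATE TABLE genes (name TEXT, chrom TEXT, start INTEGER, stop INTEGER)"
    )
    con.execute("CREATE TABLE probes (chrom TEXT, pos INTEGER, value REAL)")
    con.executemany("INSERT INTO genes VALUES (?, ?, ?, ?)", genes)
    con.executemany("INSERT INTO probes VALUES (?, ?, ?)", probes)
    rows = con.execute(
        """
        SELECT g.name, AVG(p.value)
        FROM genes AS g
        JOIN probes AS p
          ON p.chrom = g.chrom AND p.pos BETWEEN g.start AND g.stop
        GROUP BY g.name
        ORDER BY g.name
        """
    ).fetchall()
    con.close()
    return dict(rows)


if __name__ == "__main__":
    # Hypothetical toy data.
    genes = [("geneA", "chr1", 100, 200), ("geneB", "chr1", 300, 400)]
    probes = [("chr1", 150, 2.0), ("chr1", 180, 4.0),
              ("chr1", 350, 1.0), ("chr2", 150, 9.0)]
    print(mean_signal_per_gene(genes, probes))
```

    Because the join condition uses only chromosome and position, any data type that carries coordinates can be related to the annotation in the same way.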

  13. OryzaGenome: Genome Diversity Database of Wild Oryza Species

    KAUST Repository

    Ohyanagi, Hajime

    2015-11-18

    The species in the genus Oryza, encompassing nine genome types and 23 species, are a rich genetic resource and may have applications in deeper genomic analyses aiming to understand the evolution of plant genomes. With the advancement of next-generation sequencing (NGS) technology, a flood of Oryza species reference genomes and genomic variation information has become available in recent years. This genomic information, combined with the comprehensive phenotypic information that we are accumulating in our Oryzabase, can serve as an excellent genotype-phenotype association resource for analyzing rice functional and structural evolution, and the associated diversity of the Oryza genus. Here we integrate our previous and future phenotypic/habitat information and newly determined genotype information into a united repository, named OryzaGenome, providing the variant information with hyperlinks to Oryzabase. The current version of OryzaGenome includes genotype information of 446 O. rufipogon accessions derived by imputation and of 17 accessions derived by imputation-free deep sequencing. Two variant viewers are implemented: SNP Viewer as a conventional genome browser interface and Variant Table as a text-based browser for precise inspection of each variant one by one. Portable VCF (variant call format) file or tab-delimited file download is also available. Following these SNP (single nucleotide polymorphism) data, reference pseudomolecules/scaffolds/contigs and genome-wide variation information for almost all of the closely and distantly related wild Oryza species from the NIG Wild Rice Collection will be available in future releases. All of the resources can be accessed through http://viewer.shigen.info/oryzagenome/.

  14. OryzaGenome: Genome Diversity Database of Wild Oryza Species.

    Science.gov (United States)

    Ohyanagi, Hajime; Ebata, Toshinobu; Huang, Xuehui; Gong, Hao; Fujita, Masahiro; Mochizuki, Takako; Toyoda, Atsushi; Fujiyama, Asao; Kaminuma, Eli; Nakamura, Yasukazu; Feng, Qi; Wang, Zi-Xuan; Han, Bin; Kurata, Nori

    2016-01-01

    The species in the genus Oryza, encompassing nine genome types and 23 species, are a rich genetic resource and may have applications in deeper genomic analyses aiming to understand the evolution of plant genomes. With the advancement of next-generation sequencing (NGS) technology, a flood of Oryza species reference genomes and genomic variation information has become available in recent years. This genomic information, combined with the comprehensive phenotypic information that we are accumulating in our Oryzabase, can serve as an excellent genotype-phenotype association resource for analyzing rice functional and structural evolution, and the associated diversity of the Oryza genus. Here we integrate our previous and future phenotypic/habitat information and newly determined genotype information into a united repository, named OryzaGenome, providing the variant information with hyperlinks to Oryzabase. The current version of OryzaGenome includes genotype information of 446 O. rufipogon accessions derived by imputation and of 17 accessions derived by imputation-free deep sequencing. Two variant viewers are implemented: SNP Viewer as a conventional genome browser interface and Variant Table as a text-based browser for precise inspection of each variant one by one. Portable VCF (variant call format) file or tab-delimited file download is also available. Following these SNP (single nucleotide polymorphism) data, reference pseudomolecules/scaffolds/contigs and genome-wide variation information for almost all of the closely and distantly related wild Oryza species from the NIG Wild Rice Collection will be available in future releases. All of the resources can be accessed through http://viewer.shigen.info/oryzagenome/.

  15. GFinisher: a new strategy to refine and finish bacterial genome assemblies

    Science.gov (United States)

    Guizelini, Dieval; Raittz, Roberto T.; Cruz, Leonardo M.; Souza, Emanuel M.; Steffens, Maria B. R.; Pedrosa, Fabio O.

    2016-10-01

    Despite developments in DNA sequencing technology that have improved the number and the length of reads, the process of reconstructing complete genome sequences, the so-called genome assembly, is still complex. Only 13% of the prokaryotic genome sequencing projects have been completed. Draft genome sequences deposited in public databases are fragmented in contigs and may lack the full gene complement. The aim of the present work is to identify assembly errors and improve the assembly process of bacterial genomes. The biological patterns observed in genomic sequences and the application of a priori information can allow the identification of misassembled regions, and the reorganization and improvement of the overall de novo genome assembly. GFinisher starts by generating Fuzzy GC skew graphs for each contig in an assembly and then breaks the contigs at critical points in order to reassemble and close them using jFGap. This has been successfully applied to datasets from 96 genome assemblies, decreasing the number of contigs by up to 86%. GFinisher can easily optimize assemblies of prokaryotic draft genomes and can be used to improve the assembly programs based on nucleotide sequence patterns in the genome. The software and source code are available at http://gfinisher.sourceforge.net/.
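
    For readers unfamiliar with GC skew, the quantity underlying the graphs mentioned in this abstract, a minimal Python sketch follows. It computes the conventional windowed skew (G - C) / (G + C) and its running sum; it does not reproduce the fuzzy variant or any other part of GFinisher, and the function names and window size are assumptions made for illustration.

```python
def gc_skew(seq, window):
    """Return (G - C) / (G + C) for each non-overlapping window of seq.

    Windows without any G or C get a skew of 0.0; a trailing partial
    window is ignored.
    """
    skews = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window].upper()
        g, c = chunk.count("G"), chunk.count("C")
        skews.append((g - c) / (g + c) if g + c else 0.0)
    return skews


def cumulative(values):
    """Return the running sum of values."""
    total, out = 0.0, []
    for value in values:
        total += value
        out.append(total)
    return out


if __name__ == "__main__":
    # Hypothetical toy contig; real contigs use windows of many bases.
    skews = gc_skew("GGGCCCCG", 4)
    print(skews, cumulative(skews))
```

    In many bacterial chromosomes the sign of the skew changes near the replication origin and terminus, which is why an unexpected sign change inside a contig can hint at a misassembly.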

  16. [Implementation of Italian guidelines on public health genomics in Italy: a challenging policy of the NHS].

    Science.gov (United States)

    Boccia, Stefania; Federici, Antonio; Colotto, Marco; Villari, Paolo

    2014-01-01

    Genomics and related fields are becoming increasingly relevant in health care practice. Italy is the first European country that has a structured policy of Public Health Genomics. Nevertheless, what the role of genomics should be from a public health perspective, and how public health professionals should engage with advances in genomics knowledge and technology, is still not entirely clear. A description of the regulatory framework established by the Italian government in recent years is provided. In order to implement the national guidelines on Public Health Genomics published in 2013, key issues including the ethical, legal and social aspects within an evidence-based framework should be addressed and are herewith discussed. Genomics and predictive medicine are considered one of the main intervention areas by the National Prevention Plan 2010-2012, and dedicated guidelines were published in 2013. In order to implement such guidelines, we envisage a coordinated effort between stakeholders to guide development in genomic medicine, towards an impact on population health. There is also room to implement knowledge on how genomics can be integrated into health systems in an appropriate and sustainable way. Learning programs are needed to spread knowledge and awareness of genomics technology, in particular on genomic testing for complex diseases.

  17. The promise of insect genomics

    DEFF Research Database (Denmark)

    Grimmelikhuijzen, Cornelis J P; Cazzamali, Giuseppe; Williamson, Michael

    2007-01-01

    Insects are the largest animal group in the world and are ecologically and economically extremely important. This importance of insects is reflected by the existence of currently 24 insect genome projects. Our perspective discusses the state-of-the-art of these genome projects and the impacts tha...

  18. Fueling Future with Algal Genomics

    Energy Technology Data Exchange (ETDEWEB)

    Grigoriev, Igor

    2012-07-05

    Algae constitute a major component of fundamental eukaryotic diversity, play profound roles in the carbon cycle, and are prominent candidates for biofuel production. The US Department of Energy Joint Genome Institute (JGI) is leading the world in algal genome sequencing (http://jgi.doe.gov/Algae) and contributes of the algal genome projects worldwide (GOLD database, 2012). The sequenced algal genomes offer catalogs of genes, networks, and pathways. The first-of-their-kind sequenced genomes of a haptophyte E. huxleyi, chlorarachniophyte B. natans, and cryptophyte G. theta fill the gaps in the eukaryotic tree of life and carry unique genes and pathways as well as molecular fossils of secondary endosymbiosis. Natural adaptation to conditions critical for industrial production is encoded in algal genomes, for example, growth of A. anophagefferens at very high cell densities during harmful algal blooms or the global distribution across diverse environments of E. huxleyi, which is able to live on sparse nutrients due to its expanded pan-genome. Communications and signaling pathways can be derived from simple symbiotic systems like lichens or complex marine algae metagenomes. Collectively these datasets derived from algal genomics contribute to building a comprehensive parts list essential for algal biofuel development.

  19. The UCSC Genome Browser Database

    DEFF Research Database (Denmark)

    Karolchik, D; Kuhn, R M; Baertsch, R

    2008-01-01

    The University of California, Santa Cruz, Genome Browser Database (GBD) provides integrated sequence and annotation data for a large collection of vertebrate and model organism genomes. Seventeen new assemblies have been added to the database in the past year, for a total coverage of 19 vertebrat...

  20. Bioinformatics for plant genome annotation

    NARCIS (Netherlands)

    Fiers, M.W.E.J.

    2006-01-01

    Large amounts of genome sequence data are available and much more will become available in the near future. A DNA sequence alone has, however, limited use. Genome annotation is required to assign biological interpretation to the DNA sequence. This thesis describ